Dreambooth runs, but doesn't generate class images and doesn

Describe the bug

My Diffusers setup runs, but it does not seem to train a model based on my settings, and I don't know why.

It also does not generate any class images at all, so it seems it isn't even training the existing model.

I also tried different models; none of them worked.

What could it be? What should I change or try?

Reproduction

This is my train.sh file:

export MODEL_NAME="dreamlike-art/dreamlike-diffusion-1.0"
export INSTANCE_DIR="training"
export CLASS_DIR="classes"
export OUTPUT_DIR="output"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="photo of yface1 person" \
  --class_prompt="photo of a person" \
  --resolution=512 \
  --train_batch_size=1 \
  --mixed_precision="fp16" \
  --use_8bit_adam \
  --gradient_accumulation_steps=1 --gradient_checkpointing \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
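
One possible check, offered as a sketch rather than a diagnosis: in the upstream diffusers DreamBooth example, class images are generated only when `--with_prior_preservation` is passed, and the command above does not include that flag. Whether the fork used here behaves the same way is an assumption. The helper names below (`has_prior_preservation`, `check_script`) are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: report whether a launch script passes the
# --with_prior_preservation flag.
# Assumption: the training script only generates class images when this
# flag is set (true for the upstream diffusers example; unverified for forks).

has_prior_preservation() {
  # $1: path to the launch script to inspect.
  # "--" stops grep from reading the pattern as an option.
  grep -q -- '--with_prior_preservation' "$1"
}

check_script() {
  # $1: path to the launch script to inspect.
  if has_prior_preservation "$1"; then
    echo "prior preservation flag found"
  else
    echo "prior preservation flag NOT found"
  fi
}

# Usage (after sourcing this file): check_script train.sh
```

If the flag turns out to be missing, `--num_class_images` and `--class_prompt` would likely have no effect on their own, which would be consistent with the empty class directory described above.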

Logs

(diffusers) babooz@DESKTOP-6IT4DVD:~/github/diffusers/examples/dreambooth$ ./my_training_2.sh
/home/babooz/anaconda3/envs/diffusers/lib/python3.9/site-packages/accelerate/accelerator.py:231: FutureWarning: `logging_dir` is deprecated and will be removed in version 0.18.0 of 🤗 Accelerate. Use `project_dir` instead.
  warnings.warn(
Downloading (…)tokenizer/vocab.json: 100%|█████████████████████████████████████████| 1.06M/1.06M [00:00<00:00, 1.18MB/s]
Downloading (…)tokenizer/merges.txt: 100%|████████████████████████████████████████████| 525k/525k [00:00<00:00, 896kB/s]
Downloading (…)cial_tokens_map.json: 100%|██████████████████████████████████████████████| 472/472 [00:00<00:00, 105kB/s]
Downloading (…)okenizer_config.json: 100%|██████████████████████████████████████████████| 806/806 [00:00<00:00, 681kB/s]
Downloading (…)_encoder/config.json: 100%|██████████████████████████████████████████████| 592/592 [00:00<00:00, 122kB/s]
Downloading (…)"pytorch_model.bin";: 100%|███████████████████████████████████████████| 492M/492M [04:04<00:00, 2.02MB/s]
Downloading (…)_pytorch_model.bin";: 100%|███████████████████████████████████████████| 335M/335M [02:45<00:00, 2.02MB/s]
Downloading (…)main/vae/config.json: 100%|██████████████████████████████████████████████| 522/522 [00:00<00:00, 182kB/s]
Downloading (…)_pytorch_model.bin";: 100%|█████████████████████████████████████████| 3.44G/3.44G [36:44<00:00, 1.56MB/s]
Downloading (…)ain/unet/config.json: 100%|██████████████████████████████████████████████| 743/743 [00:00<00:00, 336kB/s]
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
/home/babooz/anaconda3/envs/diffusers/lib/python3.9/site-packages/bitsandbytes/cextension.py:31: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
/home/babooz/anaconda3/envs/diffusers/lib/python3.9/site-packages/diffusers/configuration_utils.py:195: FutureWarning: It is deprecated to pass a pretrained model name or path to `from_config`.If you were trying to load a scheduler, please use <class 'diffusers.schedulers.scheduling_ddpm.DDPMScheduler'>.from_pretrained(...) instead. Otherwise, please make sure to pass a configuration dictionary instead. This functionality will be removed in v1.0.0.
  deprecate("config-passed-as-path", "1.0.0", deprecation_message, standard_warn=False)
Downloading (…)cheduler_config.json: 100%|█████████████████████████████████████████████| 313/313 [00:00<00:00, 8.45kB/s]
Caching latents: 100%|██████████████████████████████████████████████████████████████████| 11/11 [04:14<00:00, 23.11s/it]
02/16/2023 14:25:39 - INFO - __main__ - ***** Running training *****
02/16/2023 14:25:39 - INFO - __main__ -   Num examples = 11
02/16/2023 14:25:39 - INFO - __main__ -   Num batches each epoch = 11
02/16/2023 14:25:39 - INFO - __main__ -   Num Epochs = 73
02/16/2023 14:25:39 - INFO - __main__ -   Instantaneous batch size per device = 1
02/16/2023 14:25:39 - INFO - __main__ -   Total train batch size (w. parallel, distributed & accumulation) = 1
02/16/2023 14:25:39 - INFO - __main__ -   Gradient Accumulation steps = 1
02/16/2023 14:25:39 - INFO - __main__ -   Total optimization steps = 800
Downloading (…)ain/model_index.json: 100%|██████████████████████████████████████████████| 543/543 [00:00<00:00, 103kB/s]
Downloading (…)nfig-checkpoint.json: 100%|█████████████████████████████████████████████| 209/209 [00:00<00:00, 12.0kB/s]
Downloading (…)rocessor_config.json: 100%|█████████████████████████████████████████████| 342/342 [00:00<00:00, 46.6kB/s]
Downloading (…)cial_tokens_map.json: 100%|█████████████████████████████████████████████| 472/472 [00:00<00:00, 66.8kB/s]
Downloading (…)_encoder/config.json: 100%|█████████████████████████████████████████████| 592/592 [00:00<00:00, 72.3kB/s]
Downloading (…)_checker/config.json: 100%|██████████████████████████████████████████| 4.56k/4.56k [00:00<00:00, 268kB/s]
Downloading (…)okenizer_config.json: 100%|██████████████████████████████████████████████| 806/806 [00:00<00:00, 183kB/s]
Downloading (…)tokenizer/merges.txt: 100%|████████████████████████████████████████████| 525k/525k [00:01<00:00, 362kB/s]
Downloading (…)tokenizer/vocab.json: 100%|██████████████████████████████████████████| 1.06M/1.06M [00:01<00:00, 658kB/s]
Downloading (…)"pytorch_model.bin";: 100%|████████████████████████████████████████████| 492M/492M [08:29<00:00, 967kB/s]
Downloading (…)"pytorch_model.bin";: 100%|█████████████████████████████████████████| 1.22G/1.22G [16:19<00:00, 1.24MB/s]
Fetching 16 files: 100%|████████████████████████████████████████████████████████████████| 16/16 [16:20<00:00, 61.26s/it]
/home/babooz/anaconda3/envs/diffusers/lib/python3.9/site-packages/transformers/models/clip/feature_extraction_clip.py:28: FutureWarning: The class CLIPFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use CLIPImageProcessor instead.
  warnings.warn(
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
[*] Weights saved at output/800
Steps: 100%|█████████████████████████████████████████████████████| 800/800 [19:10<00:00,  1.44s/it, loss=0.149, lr=5e-6]

System Info

diffusers version: 0.13.0.dev0
Platform: Linux-5.15.79.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Python version: 3.9.16
PyTorch version (GPU?): 1.13.1+cu117 (True)
Huggingface_hub version: 0.12.0
Transformers version: 4.26.1
Accelerate version: 0.16.0
xFormers version: 0.0.16
Using GPU in script?: Yes, RTX 3090.
Using distributed or parallel set-up in script?: Not sure, is it distributed? My machine is working on "No distributed training" via accelerate config.

ShivamShrirao / diffusers