ShivamShrirao / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
https://huggingface.co/docs/diffusers
Apache License 2.0

Model train is not effective #179

Open HarshkumarDegamadiya opened 1 year ago

HarshkumarDegamadiya commented 1 year ago

Describe the bug

When I trained the model with my images, the sample grid showed very poor results, and when I generated images, the output was just my own image. No matter what prompt I use, it shows the same portrait image.

Reproduction

No response

Logs

No response

System Info

Colab

2blackbar commented 1 year ago

From what I've seen today, something major changed on the diffusers side and broke DreamBooth training entirely. It was working fine 2 days ago.

ShivamShrirao commented 1 year ago

Even with https://github.com/ShivamShrirao/diffusers/pull/178 ?

HarshkumarDegamadiya commented 1 year ago

Yes, even with #178.

2blackbar commented 1 year ago

Yes, mine just crapped out at 500 steps when trying to generate images. We have to wait for a fix; even my saved Colab notebooks didn't help, because the code is downloaded from here anyway. That's why I don't like Python: one dependency gets "updated" and everything else comes down like a house of cards. IMO, when setting up dependencies, people should always use specific version numbers for everything so it won't break that easily.
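The pinning advice above can be checked mechanically. The following is a minimal sketch, not part of this repository: `find_unpinned` is a hypothetical helper that flags requirement lines lacking an exact `==` pin, and the package names and version numbers in the example are placeholders rather than recommendations.

```python
import re

# Matches "name==1.2.3" style lines (optionally with extras or an environment marker).
_EXACT_PIN = re.compile(r"^[A-Za-z0-9_.\-]+(\[[^\]]*\])?\s*==\s*[^\s;,]+\s*(;.*)?$")


def find_unpinned(requirement_lines):
    """Return the requirement lines that are not pinned with an exact '==' version."""
    unpinned = []
    for raw in requirement_lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        if not _EXACT_PIN.match(line):
            unpinned.append(line)
    return unpinned


# Hypothetical requirements; the versions are placeholders, not recommendations.
example = [
    "accelerate",          # unpinned: installs whatever is newest
    "transformers>=4.0",   # a lower bound is not a pin
    "examplepkg==1.2.3",   # exact pin
    "# a comment line",
]
print(find_unpinned(example))  # ['accelerate', 'transformers>=4.0']
```

Anything installed straight from a Git URL without a fixed commit would also be reported as unpinned by this check, which matches the situation described above where the notebook downloads the current code each time.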

Steps: 20% 500/2500 [07:16<28:25, 1.17it/s, loss=0.281, lr=1.2e-6]Traceback (most recent call last):
  File "train_dreambooth.py", line 822, in <module>
    main(args)
  File "train_dreambooth.py", line 805, in main
    save_weights(global_step)
  File "train_dreambooth.py", line 682, in save_weights
    text_enc_model = accelerator.unwrap_model(text_encoder, keep_fp32_wrapper=True)
TypeError: unwrap_model() got an unexpected keyword argument 'keep_fp32_wrapper'
Steps: 20% 500/2500 [07:16<29:07, 1.14it/s, loss=0.281, lr=1.2e-6]
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)

HarshkumarDegamadiya commented 1 year ago

I think this is the error:

/usr/local/lib/python3.8/dist-packages/bitsandbytes/cuda_setup/paths.py:105: UserWarning: /usr/lib64-nvidia did not contain libcudart.so as expected! Searching further paths...
  warn(
/usr/local/lib/python3.8/dist-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('--listen_host=172.28.0.12 --target_host=172.28.0.12 --tunnel_background_save_url=https'), PosixPath('//colab.research.google.com/tun/m/cc48301118ce562b961b3c22d803539adc1e0c19/gpu-t4-s-2vzb2yx032dzy --tunnel_background_save_delay=10s --tunnel_periodic_background_save_frequency=30m0s --enable_output_coalescing=true --output_coalescing_required=true')}
  warn(
/usr/local/lib/python3.8/dist-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('6000,"kernelManagerProxyHost"'), PosixPath('{"kernelManagerProxyPort"'), PosixPath('"172.28.0.12","jupyterArgs"'), PosixPath('["--ip=172.28.0.12","--transport=ipc"],"debugAdapterMultiplexerPath"'), PosixPath('true}'), PosixPath('"/usr/local/bin/dap_multiplexer","enableLsp"')}
  warn(
/usr/local/lib/python3.8/dist-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('module'), PosixPath('//ipykernel.pylab.backend_inline')}
  warn(
/usr/local/lib/python3.8/dist-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/env/python')}
  warn(
/usr/local/lib/python3.8/dist-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/sys/fs/cgroup/memory.events /var/colab/cgroup/jupyter-children/memory.events')}
  warn(
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 112
CUDA SETUP: Loading binary /usr/local/lib/python3.8/dist-packages/bitsandbytes/libbitsandbytes_cuda112.so...
/usr/local/lib/python3.8/dist-packages/diffusers/utils/deprecation_utils.py:35: FutureWarning: It is deprecated to pass a pretrained model name or path to from_config.If you were trying to load a scheduler, please use <class 'diffusers.schedulers.scheduling_ddpm.DDPMScheduler'>.from_pretrained(...) instead. Otherwise, please make sure to pass a configuration dictionary instead. This functionality will be removed in v1.0.0.
  warnings.warn(warning + message, FutureWarning)
Downloading: 100% 308/308 [00:00<00:00, 251kB/s]
Caching latents: 100% 50/50 [00:11<00:00, 4.38it/s]
Steps: 100% 800/800 [11:49<00:00, 1.15it/s, loss=0.278, lr=1e-6]Traceback (most recent call last):
  File "train_dreambooth.py", line 822, in <module>
    main(args)
  File "train_dreambooth.py", line 815, in main
    save_weights(global_step)
  File "train_dreambooth.py", line 682, in save_weights
    text_enc_model = accelerator.unwrap_model(text_encoder, keep_fp32_wrapper=True)
TypeError: unwrap_model() got an unexpected keyword argument 'keep_fp32_wrapper'
Steps: 100% 800/800 [11:49<00:00, 1.13it/s, loss=0.278, lr=1e-6]
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_dreambooth.py', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--pretrained_vae_name_or_path=stabilityai/sd-vae-ft-mse', '--output_dir=/content/stable_diffusion_weights/harsh23', '--revision=fp16', '--with_prior_preservation', '--prior_loss_weight=1.0', '--seed=1337', '--resolution=512', '--train_batch_size=1', '--train_text_encoder', '--mixed_precision=fp16', '--use_8bit_adam', '--gradient_accumulation_steps=1', '--learning_rate=1e-6', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--num_class_images=50', '--sample_batch_size=4', '--max_train_steps=800', '--save_interval=10000', '--save_sample_prompt=photo of harsh23 person', '--concepts_list=concepts_list.json']' returned non-zero exit status 1.

tchesket commented 1 year ago

Yep, I am having the same issue. It was working fine yesterday, but today the results are terrible. I get the same CUDA_SETUP error; it still trains, but it isn't working properly.

KitaharaMugiro commented 1 year ago

The error message says "TypeError: unwrap_model() got an unexpected keyword argument 'keep_fp32_wrapper'". So I removed keep_fp32_wrapper from the unwrap_model call and it worked. See #181.

KitaharaMugiro commented 1 year ago

I closed #181 because the issue is solved by updating the 'accelerate' library.
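For anyone who cannot update 'accelerate' right away, the workaround from #181 could be expressed as a small compatibility shim. This is only a sketch under assumptions: `unwrap_model_compat` is a hypothetical helper, not part of accelerate or this repository, and the two classes below are stand-ins for an older and a newer Accelerator rather than the real API. Updating the library remains the fix reported in this thread.

```python
import inspect


def unwrap_model_compat(accelerator, model, **kwargs):
    """Call accelerator.unwrap_model, dropping keyword arguments it does not accept."""
    params = inspect.signature(accelerator.unwrap_model).parameters
    accepts_any = any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())
    if not accepts_any:
        # Keep only the keyword arguments the installed version knows about.
        kwargs = {k: v for k, v in kwargs.items() if k in params}
    return accelerator.unwrap_model(model, **kwargs)


# Stand-ins for illustration only (hypothetical, not the real Accelerator class).
class OldAccelerator:
    def unwrap_model(self, model):
        return ("unwrapped", model, None)


class NewAccelerator:
    def unwrap_model(self, model, keep_fp32_wrapper=False):
        return ("unwrapped", model, keep_fp32_wrapper)


print(unwrap_model_compat(OldAccelerator(), "text_encoder", keep_fp32_wrapper=True))
print(unwrap_model_compat(NewAccelerator(), "text_encoder", keep_fp32_wrapper=True))
```

Silently dropping the argument avoids the TypeError but also changes what the call does on older versions, so it is best treated as a stopgap.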

maxdaneau commented 1 year ago

I imagine this is related to why a derivative dreambooth notebook I'm using is failing. "train_dreambooth.py" is just not saving the final output directory for me anymore (I'm expecting a "0" folder and a "800" folder because I'm using 800 steps, but there's only a "0" directory after it's done running).
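A quick way to confirm which checkpoint folders were actually written is to list the numeric subdirectories of the output directory. The sketch below assumes the layout described above (one folder per saved step, named after the step number); `saved_steps` and `missing_steps` are hypothetical helpers, and the demo uses a temporary directory rather than a real training run.

```python
import tempfile
from pathlib import Path


def saved_steps(output_dir):
    """Return the sorted step numbers that have a subdirectory in output_dir."""
    return sorted(
        int(p.name)
        for p in Path(output_dir).iterdir()
        if p.is_dir() and p.name.isdecimal()
    )


def missing_steps(output_dir, expected):
    """Return the expected step numbers that have no matching subdirectory."""
    found = set(saved_steps(output_dir))
    return [step for step in expected if step not in found]


# Demo: simulate a run where only the "0" folder was written.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "0").mkdir()
    demo_missing = missing_steps(tmp, [0, 800])

print(demo_missing)  # [800]
```

A missing final folder is consistent with the tracebacks earlier in this thread, where the script fails inside save_weights.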

enoreyes commented 1 year ago

I am having the same issue. I'm able to train in my Colab notebook, but the prompted outputs do not look any different from the un-prompted generation of the fine-tuned model. This is after I removed the "keep_fp32_wrapper" parameter.

I am also seeing the bitsandbytes error, so I'm guessing it's related to that as well.

ShivamShrirao commented 1 year ago

@enoreyes it's related to bitsandbytes. Uninstall the version you have and install the correct version.

enoreyes commented 1 year ago

By "correct version", you mean 0.35.4, correct? I was able to get it working using that version.
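To verify that the runtime really has the expected version before starting a long training run, a check along these lines may help. It is a minimal sketch: `check_pin` is a hypothetical helper built on the standard library's `importlib.metadata`, and 0.35.4 is simply the bitsandbytes version reported as working in the comment above.

```python
from importlib import metadata


def check_pin(package, expected, get_version=metadata.version):
    """Return (ok, installed); installed is None when the package is not installed."""
    try:
        installed = get_version(package)
    except metadata.PackageNotFoundError:
        return False, None
    return installed == expected, installed


# Version taken from the comment above; adjust if a different one is recommended.
ok, installed = check_pin("bitsandbytes", "0.35.4")
if not ok:
    print(f"bitsandbytes: expected 0.35.4, found {installed}")
```

The `get_version` parameter exists only so the lookup can be replaced in tests; in normal use the default is enough.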