08/15/2023 18:41:26 - INFO - __main__ - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda
Mixed precision type: fp16
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
{'dynamic_thresholding_ratio', 'clip_sample_range', 'thresholding', 'variance_type'} was not found in config. Values will be initialized to default values.
wandb: Currently logged in as: mnslarcher. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.15.8
wandb: Run data is saved locally in /home/mnslarcher/ai/sd-xl-hands/wandb/run-20230815_184142-flioaupp
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run wobbly-resonance-5
wandb: ⭐️ View project at https://wandb.ai/mnslarcher/text2image-fine-tune
wandb: 🚀 View run at https://wandb.ai/mnslarcher/text2image-fine-tune/runs/flioaupp
08/15/2023 18:41:46 - INFO - __main__ - ***** Running training *****
08/15/2023 18:41:46 - INFO - __main__ - Num examples = 833
08/15/2023 18:41:46 - INFO - __main__ - Num Epochs = 2
08/15/2023 18:41:46 - INFO - __main__ - Instantaneous batch size per device = 1
08/15/2023 18:41:46 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 1
08/15/2023 18:41:46 - INFO - __main__ - Gradient Accumulation steps = 1
08/15/2023 18:41:46 - INFO - __main__ - Total optimization steps = 1666
Steps: 0%| | 0/1666 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/mnslarcher/ai/sd-xl-hands/train_text_to_image_lora_sdxl.py", line 1281, in <module>
main(args)
File "/home/mnslarcher/ai/sd-xl-hands/train_text_to_image_lora_sdxl.py", line 1008, in main
model_input = vae.encode(pixel_values).latent_dist.sample()
File "/home/mnslarcher/anaconda3/envs/hands/lib/python3.10/site-packages/diffusers/utils/accelerate_utils.py", line 46, in wrapper
return method(self, *args, **kwargs)
File "/home/mnslarcher/anaconda3/envs/hands/lib/python3.10/site-packages/diffusers/models/autoencoder_kl.py", line 242, in encode
h = self.encoder(x)
File "/home/mnslarcher/anaconda3/envs/hands/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/mnslarcher/anaconda3/envs/hands/lib/python3.10/site-packages/diffusers/models/vae.py", line 110, in forward
sample = self.conv_in(sample)
File "/home/mnslarcher/anaconda3/envs/hands/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/mnslarcher/anaconda3/envs/hands/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/home/mnslarcher/anaconda3/envs/hands/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (c10::Half) and bias type (float) should be the same
wandb: Waiting for W&B process to finish... (failed 1). Press Control-C to abort syncing.
wandb: 🚀 View run wobbly-resonance-5 at: https://wandb.ai/mnslarcher/text2image-fine-tune/runs/flioaupp
wandb: Synced 6 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20230815_184142-flioaupp/logs
Traceback (most recent call last):
File "/home/mnslarcher/anaconda3/envs/hands/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/home/mnslarcher/anaconda3/envs/hands/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
args.func(args)
File "/home/mnslarcher/anaconda3/envs/hands/lib/python3.10/site-packages/accelerate/commands/launch.py", line 979, in launch_command
simple_launcher(args)
File "/home/mnslarcher/anaconda3/envs/hands/lib/python3.10/site-packages/accelerate/commands/launch.py", line 628, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/mnslarcher/anaconda3/envs/hands/bin/python', 'train_text_to_image_lora_sdxl.py', '--pretrained_model_name_or_path=stabilityai/stable-diffusion-xl-base-1.0', '--dataset_name=lambdalabs/pokemon-blip-captions', '--caption_column=text', '--resolution=1024', '--random_flip', '--train_batch_size=1', '--num_train_epochs=2', '--gradient_accumulation_steps=1', '--checkpointing_steps=500', '--learning_rate=1e-04', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--dataloader_num_workers=0', '--report_to=wandb', '--seed=42', '--output_dir=sd-pokemon-model-lora-sdxl-txt', '--train_text_encoder', '--validation_prompt=cute dragon creature', '--mixed_precision=fp16', '--rank=4']' returned non-zero exit status 1.
System Info
OS Name: Ubuntu 22.04.3 LTS
GPU: NVIDIA GeForce RTX 4090
diffusers-cli env:
- `diffusers` version: 0.19.3
- Platform: Linux-6.2.0-26-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- PyTorch version (GPU?): 2.0.1+cu117 (True)
- Huggingface_hub version: 0.16.4
- Transformers version: 4.31.0
- Accelerate version: 0.21.0
- xFormers version: not installed
- Using GPU in script?: YES
- Using distributed or parallel set-up in script?: NO
Describe the bug
I'm encountering the same error as described in the closed issue #4478.
I'm currently running the train_text_to_image_lora_sdxl.py script, and the VAE gives me the following error on the very first training step:

RuntimeError: Input type (c10::Half) and bias type (float) should be the same

See "Reproduction", "Logs", and "System Info" for all the details.
Any idea why? Do you need more details, or would you like me to run other experiments?
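For what it's worth, my guess is that the fp16 batch is being fed into a VAE whose parameters are (at least partly) still in float32; the SDXL VAE is commonly kept in fp32 because it tends to overflow in half precision. Below is a minimal sketch, not taken from the script (the shapes and the standalone Conv2d are stand-ins of my own), that triggers the same family of dtype-mismatch RuntimeError and shows the cast that avoids it:

```python
import torch

# Stand-in for vae.encoder.conv_in: parameters in float32, the dtype the
# SDXL VAE is usually kept in since it tends to overflow in fp16.
conv = torch.nn.Conv2d(3, 8, kernel_size=3, device="cuda")

# Stand-in for pixel_values after the batch is cast to the fp16 weight dtype.
x = torch.randn(1, 3, 64, 64, device="cuda", dtype=torch.float16)

try:
    conv(x)  # half input into a float32 conv
except RuntimeError as e:
    print(e)  # dtype-mismatch error, same family as the one in the traceback

# Casting the input to the module's parameter dtype avoids the mismatch:
out = conv(x.to(next(conv.parameters()).dtype))
print(out.dtype)  # torch.float32
```

If that diagnosis is right, the analogous change in the script would be something like `vae.encode(pixel_values.to(vae.dtype))`, but I haven't verified that this is the intended fix.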
Thanks!
Reproduction
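The command below is reconstructed from the subprocess invocation shown in the traceback above (I launched it via `accelerate launch`):

```bash
accelerate launch train_text_to_image_lora_sdxl.py \
  --pretrained_model_name_or_path=stabilityai/stable-diffusion-xl-base-1.0 \
  --dataset_name=lambdalabs/pokemon-blip-captions \
  --caption_column=text \
  --resolution=1024 \
  --random_flip \
  --train_batch_size=1 \
  --num_train_epochs=2 \
  --gradient_accumulation_steps=1 \
  --checkpointing_steps=500 \
  --learning_rate=1e-04 \
  --lr_scheduler=constant \
  --lr_warmup_steps=0 \
  --dataloader_num_workers=0 \
  --report_to=wandb \
  --seed=42 \
  --output_dir=sd-pokemon-model-lora-sdxl-txt \
  --train_text_encoder \
  --validation_prompt="cute dragon creature" \
  --mixed_precision=fp16 \
  --rank=4
```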
Logs
See the full training log and traceback at the top of this report.
Who can help?
@sayak