Closed danielaixer closed 10 months ago
Okay, confirmed: setting "Mixed precision" to "no" works. Regarding "accelerate config", I think it doesn't really matter which mixed precision you choose there.
Also, do NOT use AdamW8bit as the optimizer (bitsandbytes issue); use AdamW instead, and set "CrossAttention" to "none" (xFormers issue).
However, I still can't generate sample images or captions with kohya_ss, but those issues are secondary.
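For anyone who launches train_network.py from a script instead of the GUI, the settings above can be sketched as a small rewrite of the argument list. This is only an illustration: the helper name apply_rocm_workarounds is hypothetical, and it assumes that "CrossAttention: none" in the GUI corresponds to simply not passing --xformers on the command line.

```python
def apply_rocm_workarounds(args):
    """Return a copy of a train_network.py argument list with the
    workarounds from this thread applied (hypothetical helper)."""
    fixed = []
    for arg in args:
        if arg.startswith("--mixed_precision="):
            # "Mixed precision" set to "no" is the fix confirmed in this thread.
            fixed.append("--mixed_precision=no")
        elif arg == "--optimizer_type=AdamW8bit":
            # AdamW8bit depends on bitsandbytes; fall back to plain AdamW.
            fixed.append("--optimizer_type=AdamW")
        elif arg == "--xformers":
            # Assumption: dropping this flag matches "CrossAttention: none".
            continue
        else:
            fixed.append(arg)
    return fixed


# A shortened version of the failing command from the original post.
failing = [
    "./train_network.py",
    "--mixed_precision=fp16",
    "--save_precision=fp16",
    "--optimizer_type=AdamW",
]
print(apply_rocm_workarounds(failing))
```

Only --mixed_precision changes for the command in the original post, since it already used AdamW and did not pass --xformers. The --save_precision flag is left untouched because the thread only reports changing "Mixed precision".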
I'm on Ubuntu 22.04, with 7900XTX GPU, ROCm5.6 and Mesa drivers. I can generate images using GPU via stable-diffusion-webui.
I have installed kohya_ss with these commands:
And I start the GUI with:
I'm trying to train a LoRA model using the AdamW optimizer and with CrossAttention set to none. These parameters help me avoid the bitsandbytes and xFormers errors, but just when training seems to be working and reaches the optimization steps, I get this error:
And at the end of the terminal this:
subprocess.CalledProcessError: Command '['/home/username/kohya_ss/kohya_ss/venv/bin/python', './train_network.py', '--enable_bucket', '--min_bucket_reso=256', '--max_bucket_reso=2048', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--train_data_dir=/home/username/kohya_ss/kohya_ss/datasets/Something', '--resolution=512,512', '--output_dir=/home/username/kohya_ss/kohya_ss/models/Lora/Custom', '--network_alpha=48', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-05', '--unet_lr=0.0001', '--network_dim=96', '--output_name=Something2', '--lr_scheduler_num_cycles=1', '--no_half_vae', '--learning_rate=0.0001', '--lr_scheduler=cosine', '--lr_warmup_steps=20', '--train_batch_size=4', '--max_train_steps=200', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--optimizer_type=AdamW', '--max_grad_norm=1', '--max_data_loader_n_workers=0', '--bucket_reso_steps=64', '--save_every_n_steps=500', '--bucket_no_upscale', '--noise_offset=0.0', '--sample_sampler=euler', '--sample_prompts=/home/username/kohya_ss/kohya_ss/models/Lora/Custom/sample/prompt.txt', '--sample_every_n_steps=25']' returned non-zero exit status 1.
Based on similar errors mentioning 'Half', I'm pretty sure we need the equivalent of using
--precision full --no-half
when launching AUTOMATIC1111/stable-diffusion-webui. The method shown here doesn't improve the situation for me: https://github.com/bmaltais/kohya_ss/issues/1484 That includes installing PyTorch for ROCm 5.7:
pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/rocm5.7
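To check which PyTorch build the kohya_ss venv actually picks up after a reinstall like this, a quick sanity check could look like the sketch below. The function name describe_torch_build is made up for illustration; it only reads torch.__version__, torch.version.hip and torch.cuda.is_available(), and it reports instead of failing when torch is not installed.

```python
def describe_torch_build():
    """Summarize the PyTorch build visible to the current interpreter
    (hypothetical helper, run it inside the kohya_ss venv)."""
    try:
        import torch
    except ImportError:
        return {"installed": False}
    return {
        "installed": True,
        "torch_version": torch.__version__,
        # Expected to be a version string on ROCm builds and None otherwise.
        "hip_version": getattr(torch.version, "hip", None),
        # ROCm builds expose the GPU through the torch.cuda API.
        "gpu_available": torch.cuda.is_available(),
    }


print(describe_torch_build())
```

If hip_version comes back as None, the venv is probably still using a non-ROCm wheel, which would be worth ruling out before changing training options.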
Edit: When running "accelerate config", choosing "no" for the question "Do you wish to use FP16 or BF16 (mixed precision)?" didn't help.
Edit: Setting "Mixed precision" to "no" seems to be working. I will update once I confirm I can do a complete LoRA training.