Open Deathawaits4 opened 5 months ago
Can you provide the following things:
Let's see if we can figure out what went wrong. Based on your description alone, I can't really tell you anything, because to an untrained eye undercooked (too little training) and overcooked (overfit from too much training) look largely the same.
Not sure if this is the problem, but you can't really count on the final checkpoint; it could be overtrained. Save checkpoints as training goes, every N epochs, and set the interval so that a checkpoint gets saved around the point where you are getting the good samples.
If that's not the problem, then make sure you are actually sampling at 1024x1024 (--w 1024 --h 1024 in the sampling box) if you're doing SDXL.
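A quick way to check the sampling size is to scan the sample prompt file for lines that lack the size flags. This is only a rough sketch: the helper name `lines_missing_sdxl_size` and the example prompts are made up for illustration, and it assumes one prompt per line with `#` marking comment lines.

```python
import re

def lines_missing_sdxl_size(prompt_text, size=1024):
    """Return prompt lines that do not request size x size samples.

    Hypothetical helper; assumes one prompt per line and that lines
    starting with '#' are comments.
    """
    missing = []
    for line in prompt_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        w = re.search(r"--w\s+(\d+)", line)
        h = re.search(r"--h\s+(\d+)", line)
        ok = w and h and int(w.group(1)) == size and int(h.group(1)) == size
        if not ok:
            missing.append(line)
    return missing

# Made-up example prompts, not taken from anyone's actual prompt.txt
example = """the queen of heart 1a, portrait --w 1024 --h 1024
the queen of heart 1a, full body
the queen of heart 1a, garden --w 512 --h 512"""

for bad in lines_missing_sdxl_size(example):
    print("not sampled at 1024x1024:", bad)
```

Any line it prints is being sampled at some other resolution (or the default), which can make in-training samples and later generations hard to compare.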
I am having the same issue: it sometimes outputs good images, but mostly the images are completely garbled.
Here are the training settings I used:
kohya_ss/sd-scripts/sdxl_train_network.py" --bucket_no_upscale --bucket_reso_steps=32 --cache_latents --cache_latents_to_disk --caption_extension=".txt" --enable_bucket --min_bucket_reso=256 --max_bucket_reso=2048 --gradient_checkpointing --keep_tokens="1" --learning_rate="1.0" --logging_dir="C:\Ai\me\processed\New folder\log1" --lr_scheduler="constant" --lr_scheduler_num_cycles="1" --max_data_loader_n_workers="0" --max_grad_norm="1" --resolution="1024,1024" --max_token_length=150 --max_train_steps="1250" --min_snr_gamma=5 --mixed_precision="bf16" --network_alpha="128" --network_dim=128 --network_module=networks.lora --no_half_vae --optimizer_args weight_decay=0.4 decouple=True d0=0.00000033 use_bias_correction=True safeguard_warmup=True --optimizer_type="Prodigy" --output_dir="Z:\output-model-kohya" --output_name="xxx-XL-v0.9992-animagine" --pretrained_model_name_or_path="J:/ai/stable-diffusion-webui/models/Stable-diffusion/animagine-xl-3.1.safetensors" --save_every_n_epochs="1" --save_model_as=safetensors --save_precision="bf16" --scale_weight_norms="4" --seed="9752758" --text_encoder_lr=1.0 --train_batch_size="3" --training_comment="trigger: the queen of heart 1a" --train_data_dir="C:\Ai\me\processed\New folder\images_noidentifier" --unet_lr=1.0 --xformers --sample_sampler=euler_a --sample_prompts="Z:\output-model-kohya\sample\prompt.txt" --sample_every_n_epochs=1
Try changing keep_tokens to 2. And what are the Optimizer extra arguments you are giving to Prodigy? Prodigy doesn't work straight out of the box. Also try training in FP16, since BF16 can sometimes just produce garbage. Another thing to keep in mind is that not all models are created equal for training: models with more merging and finetuning don't work as well. So if you can't get that specific one to work, try going back to plain old SDXL and see if the issue persists.
Also drop your Dimensions and Alpha WAY down to begin with. I always start my training from 4 and double it as needed. Right now it is almost impossible to tell whether something is wrong with the training process, the dataset, or possibly both, because the LoRA can take in so much information.
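The "start at 4 and double" approach could be sketched like this. The function names are hypothetical, the cap of 128 is just the value from the posted command, and setting alpha equal to dim is an assumption carried over from that command (128/128), not a recommendation.

```python
def rank_schedule(start=4, max_rank=128):
    """Ranks to try in order: start, then double until max_rank is passed."""
    ranks = []
    rank = start
    while rank <= max_rank:
        ranks.append(rank)
        rank *= 2
    return ranks

def network_flags(rank):
    # Assumption: alpha is kept equal to dim, as in the posted command.
    return f"--network_dim={rank} --network_alpha={rank}"

# One training attempt per line; stop doubling once results look right.
for rank in rank_schedule():
    print(network_flags(rank))
```

The point is to move up one step at a time and only when the smaller rank clearly can't hold what you are training, so that a bad result can be traced to one change.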
If you want further help this list I made still applies:
To get an idea of the issue, we need to see the issue. LoRA settings need to be altered based on what you are training. I do 4-5 versions before I get an idea of what it is I want and need to do; then usually the 2nd of the final attempts gets me what I want. Generally, if I can't make it in 5-6 attempts, I scrap it and start from the beginning by figuring out what is wrong with the dataset. But there are ALWAYS signs in the outputs that give an idea of what went wrong.
Hello, I have an issue: when training a checkpoint, I get very, very good samples. They look exactly how they should. Yet the finished checkpoint, with the same sampler and same settings, gives a completely garbled mess that doesn't even remotely resemble what I trained.
Why is this happening?