hollowstrawberry / kohya-colab

Accessible Google Colab notebooks for Stable Diffusion Lora training, based on the work of kohya-ss and Linaqruf
GNU General Public License v3.0

SDXL quality #150

Closed remote3513 closed 1 month ago

remote3513 commented 1 month ago

Hi!

I know this is not really an issue, but I can't seem to fix my personal issue with the SDXL training script and I'm wondering if it's just me...

Whenever I train an SDXL LoRA, the results are terrible.

Do I need to change any of the presets? Depending on the number of images I have, I choose the number of repeats and then usually train for 10 epochs, just like the SD1.5 script.

At the moment, I'm using Colab Pro, but even with the A100 and bf16 enabled, the results are sh*t...

I've tried it with a smaller dataset and a bigger one, I've tried changing the batch size, the number of repeats and epochs, and I've changed the learning rate, network alpha, and dim, but none of this has improved the quality of my LoRA.

Also, I'm reading quite a bit of contradicting information about training. Some people say "low number of repeats, more epochs," while others say "low number of epochs, more repeats."

What are the things I need to consider when setting up my training settings? And what settings do you guys use?

Hopefully, it's not a problem that I opened a new topic for this.

Thanks in advance!

LegendaryNoCon commented 1 month ago

I've been training a ton of loras over the past couple weeks and usually what I'll do is I'll do 300 divided by the number of images in my dataset, and then round up to the nearest whole number. That will be what I use for repeats. I usually do 6 epochs, I'll do more if it's a complicated looking character. Then I change the network dim to 12 and the network alpha to 6, to allow for more data to be stored into the lora. I don't change anything else.
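The rule of thumb above can be sketched in a few lines of Python. The `target` of 300 (images × repeats per epoch) and the helper name are just for illustration, not from any kohya-ss API:

```python
import math

def training_settings(num_images: int, target: int = 300) -> dict:
    """Heuristic from the comment above: aim for roughly `target`
    image repetitions per epoch, rounding repeats up."""
    repeats = math.ceil(target / num_images)
    return {
        "repeats": repeats,
        "epochs": 6,          # bump this up for complicated-looking characters
        "network_dim": 12,    # larger dim = more capacity in the LoRA
        "network_alpha": 6,   # commonly set to half of dim
    }

print(training_settings(50))   # 50 images -> 6 repeats
```

So a 50-image dataset gets 6 repeats, a 70-image dataset gets 5, and so on.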

remote3513 commented 1 month ago

> I've been training a ton of loras over the past couple weeks and usually what I'll do is I'll do 300 divided by the number of images in my dataset, and then round up to the nearest whole number. That will be what I use for repeats. I usually do 6 epochs, I'll do more if it's a complicated looking character. Then I change the network dim to 12 and the network alpha to 6, to allow for more data to be stored into the lora. I don't change anything else.

So let's say 300 / 50 images = 6 repeats? And then 6 epochs? How about the batch size? Because at the top of the colab page it says: "Colab Premium is recommended. Ideally you'd be changing the runtime to an A100 and use the maximum batch size."

But the max batch size is 16? So 50 images x 6 repeats x 6 epochs / 16 batch size?

Anyway, I'm going to try your recommendations! Hopefully I'll get to see a proper SDXL LoRA this time :)

Cheers mate!

Edit: never mind the calculation regarding the batch size! It's more like: 50 images x 6 repeats / batch size = steps in 1 epoch. Right...?
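That corrected arithmetic can be checked with a quick sketch. This assumes the kohya-style convention that one epoch processes every image `repeats` times, grouped into batches (the last batch may be smaller, hence the ceiling division):

```python
def steps_per_epoch(num_images: int, repeats: int, batch_size: int) -> int:
    # one epoch = num_images * repeats samples, split into batches;
    # -(-a // b) is ceiling division, counting the final partial batch
    return -(-num_images * repeats // batch_size)

images, repeats, epochs, batch = 50, 6, 6, 16   # numbers from this thread
per_epoch = steps_per_epoch(images, repeats, batch)
print(per_epoch)            # 300 samples / 16 -> 19 steps per epoch
print(per_epoch * epochs)   # 114 total training steps
```

Note that changing the batch size changes the step count but not how many times each image is seen, which is why repeats and epochs matter more for quality.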