bmaltais / kohya_ss

Apache License 2.0

Support for LORA Training #2628

Open UndergroundBeauty opened 4 months ago

UndergroundBeauty commented 4 months ago

Hello,

I was hoping to find some help here. I am running this GUI, but my training is taking too long and something seems broken. I am training on 13 images and it is taking two hours on an NVIDIA 3060 card. The process is not maxing out GPU RAM (it runs at about 12GB). My settings in my config_file all seem correct as well (let me know if you would like me to include it here). I was getting help from someone until they ran out of ideas, and they said I could get more help here. I am new to this, so please let me know how best to provide relevant information for troubleshooting.

Thank you

maybleMyers commented 4 months ago

Sometimes training takes 4 months.

UndergroundBeauty commented 4 months ago

The person who has been helping me so far assured me that something was wrong with kohya, perhaps with the install, because the time the current training project is taking does not match what my hardware should manage.

b-fission commented 4 months ago

What speeds (s/it) are you seeing while training? And pasting your config file would be helpful to figure out what's going on.

UndergroundBeauty commented 4 months ago

So, I am getting around 4-11 s/it, with an average of maybe 5 s/it. avr_loss is 0.1 or so. Config file attached: ponyrealism_fastLora_name.json

b-fission commented 4 months ago

Using 50 epochs feels rather high to me, even for 13 images. What repeat count are you using on your image folder? Does your lora produce decent results when you use it?

I'd personally use 20 epochs (or fewer) and see if prodigy can keep up with it. By the way, it looks like your config isn't using prodigy's recommended learning rate of 1.0 for the unet and text encoder. The option that's simply labeled "Learning rate" will be overridden by the other two learning rates if they're not zero.
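For anyone who wants to check their own config against this advice, here is a minimal Python sketch. The key names (`optimizer`, `learning_rate`, `unet_lr`, `text_encoder_lr`) and the sample values are assumptions for illustration, not taken from the attached config file, and the check only encodes the rule described in this comment. It says nothing about how kohya handles a value of zero.

```python
def check_prodigy_rates(config):
    """Return notes about learning-rate fields when the optimizer is Prodigy.

    Key names are assumed for illustration; adjust them to match your config.
    """
    notes = []
    if str(config.get("optimizer", "")).lower() != "prodigy":
        return notes  # the advice in this thread only concerns Prodigy
    for key in ("unet_lr", "text_encoder_lr"):
        value = float(config.get(key, 0.0))
        if value != 0.0:
            # A non-zero specific rate takes precedence over "Learning rate".
            notes.append(f"{key}={value} overrides learning_rate")
            if value != 1.0:
                notes.append(f"{key} is not 1.0, the value suggested for Prodigy")
    return notes


# Hypothetical config values, not the poster's actual settings.
example = {
    "optimizer": "Prodigy",
    "learning_rate": 1.0,
    "unet_lr": 0.0001,
    "text_encoder_lr": 1.0,
}
for note in check_prodigy_rates(example):
    print(note)
```

With the sample values, the check flags `unet_lr` as both overriding the generic rate and differing from 1.0, while `text_encoder_lr` is only reported as an override.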

UndergroundBeauty commented 4 months ago

The folder name looks like this: "1_firstname woman". The lora produces terrible results. I am trying to make a lora for a face detailer that will change a face to a more consistent look. While testing, I generate a face initially, then run it through the face detailer with almost identical prompts, but these loras turn out terrible.

I tried using 20 epochs and switched both of those values to 1. It is now producing models at the pace I would expect, but quality is still very poor.
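As a rough sanity check on the numbers in this thread, the sketch below estimates step count and run time from the image count, the repeat count encoded in the folder name ("1_firstname woman" means 1 repeat), the epoch count, and the reported speed. The batch size of 1 is an assumption, since the thread never states it, and the formula ignores regularization images and bucketing, which can change the real step count.

```python
import math


def parse_repeats(folder_name):
    """Split a '<repeats>_<prompt>' folder name into (repeats, prompt)."""
    repeats, _, prompt = folder_name.partition("_")
    return int(repeats), prompt


def estimate_training(num_images, folder_name, epochs, seconds_per_step, batch_size=1):
    """Return (total_steps, minutes) for a simple single-folder dataset."""
    repeats, _ = parse_repeats(folder_name)
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    total_steps = steps_per_epoch * epochs
    return total_steps, total_steps * seconds_per_step / 60


# 13 images, 1 repeat, about 5 s/it as reported above; batch size 1 is assumed.
for epochs in (50, 20):
    steps, minutes = estimate_training(13, "1_firstname woman", epochs, 5.0)
    print(f"{epochs} epochs: {steps} steps, about {minutes:.0f} minutes")
```

Under these assumptions, 50 epochs comes to 650 steps (about 54 minutes) and 20 epochs to 260 steps (about 22 minutes). A run that takes much longer than such an estimate points at something other than the epoch count, such as a different batch size, a slower real speed, or extra sample generation.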

b-fission commented 4 months ago

So you're saying the lora doesn't even produce faces which resemble the originals (without even using ADetailer)?

What type of images are you training with? Are they face/closeup shots, portrait shots, full body shots, or a mix of those? And since you're training on a face, it should be okay to set the Text Encoder learning rate to 0, which can slightly improve training speed.

ADetailer's default inpaint settings would prefer closeup shots unless you adjust some options. For example, if your training images were mainly portraits/full shots, what if you increased the "Inpaint only masked padding, pixels" option?

UndergroundBeauty commented 4 months ago

So, before FaceDetailer (using ComfyUI), the images have resemblance (maybe?) to a tiny degree.

I am training with 13 close up portrait shots.

I can try turning Text encoder to 0. What should I set the inpaint only masked to?

b-fission commented 4 months ago

> So, before FaceDetailer (using ComfyUI), the images have resemblance (maybe?) to a tiny degree.

What happens when you generate an image with the prompt "firstname woman", no extra words, and a decent lora strength? Is it still a bad result?

What image resolution are you generating at?

I'm wondering now if you'll have better luck training your lora on regular SDXL (or SD1.5) instead of a Pony model. That would mean you generate the initial image using Pony (no lora), then have ADetailer use the prompt "firstname woman <lora:firstname:1>" with an SDXL checkpoint to inpaint the face.

> I can try turning Text encoder to 0. What should I set the inpaint only masked to?

Since you say your lora is mainly trained on closeups and portraits, I think the default (32) would be okay as-is. But if you want to experiment, try increasing the "inpaint only masked padding" to around 100-200.
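To picture what the padding value changes, here is a small Python sketch of the general idea: the detected face box is grown by the padding on every side and clamped to the image, so a larger value gives the inpainting pass more surrounding context. This is only an illustration of the concept, not ADetailer's actual code, and the face box coordinates are made up.

```python
def pad_box(box, padding, image_width, image_height):
    """Grow a (left, top, right, bottom) box by `padding` pixels, clamped to the image."""
    left, top, right, bottom = box
    return (
        max(0, left - padding),
        max(0, top - padding),
        min(image_width, right + padding),
        min(image_height, bottom + padding),
    )


# Hypothetical face detection in a 1024x1024 image.
face_box = (400, 300, 600, 520)
for padding in (32, 150):
    left, top, right, bottom = pad_box(face_box, padding, 1024, 1024)
    print(f"padding {padding}: crop is {right - left}x{bottom - top} pixels")
```

With the default of 32 the crop stays tight around the face (264x284 here), while 150 widens it to 500x520, which is closer to the framing of a portrait-style training image.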

UndergroundBeauty commented 4 months ago

So, with Pony, it seems prompts only work if you begin them with "score_9, score_8_up, score_7_up, BREAK, 8k,". So my prompts look like "score_9, score_8_up, score_7_up, BREAK, 8k, firstname woman". I always have my lora strength set to 1, I believe. Still bad, yes.

1024x1024

I can try training on an SDXL model. I am simply trying to achieve a powerful faceswap technique in ComfyUI. I have no need for the lora to be used in the generation before the face detailer if it can be strong enough to change the base face entirely.

UndergroundBeauty commented 4 months ago

Wow ok, so I made a lora from sdxl_base and applied it only in the face detailer with the respective checkpoint, and it really messed up. It replaced the face area of the original image with a totally different face that looks nothing like the training images.

I am also using face_yolo8m.pt for the bbox_detector and it replaces the face in a terrible way. It looks like a box is just put around the old face (and the new face still looks nothing like the training data).

smh