Confusing sample results when training Lora

123sleaf-123 commented 4 months ago

Have you guys seen such strange patterns before?

I have encountered a great many confusing images like these when sampling during LoRA training. I run kohya_ss on my device via WSL (Ubuntu 22.04). I trained my LoRA on the sdxl-base-1.0 model with the official VAE, using a sample prompt like --w 1024 --h 1024 --d 1337 --l 7.0 --s 24 --n lowres,normal quality, low quality,worst quality,extra digits,fewer digits,wrong hand,bad hands, poorly drawn hands, bad_perspective,error, jpeg artifacts,ugly, duplicate, morbid, mutilated,wrong feet, bad feet,mutated hands,poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs,missing arms, missing legs,fused fingers, long neck, username, watermark,censored,beginner,bad_art,bad arm, bad leg, bad reflection,unfinished, unfinished background,

Moreover, I can only reproduce this problem on my own device, not on any other Linux machine (I tested on several remote servers with the same config and they all work fine).

Is it a bug in kohya_ss? The lora-script? WSL? A corrupted base model? Or something else?

Sample images: seele_xl_v3_e000006_00_20240529200431_1337, seele_xl_v3_e000010_00_20240529202014_1337, seele_xl_v3_e000002_00_20240529194855_1337

123sleaf-123 commented 4 months ago

It seems to be caused by a bug in v24.1.2~4, since I can reproduce the problem on both WSL Ubuntu 22.04 and Windows 11 with v24.1.2~4 when training a standard SDXL LoRA. (It seems to exist in v1.0.x as well.)

For now I can avoid the problem by using v23.1.x in WSL Ubuntu 22.04.

But it's still hard for me to find out where the problem is, that is, which code causes it. Loading images and texts, or sampling?

b-fission commented 4 months ago

The example prompt in your first post has a negative prompt (the long text after --n), but where is the positive prompt? Not sure if that's related to the issue, but it's the first thing that comes to mind.

123sleaf-123 commented 4 months ago

Oh, it's not a big deal. The positive prompt is a description of my target character.

The whole prompt looks like this: 1girl, _char_name_, _char_feature_, detached collar,blue vest,white shirt,skirt, white thighhighs, solo --w 1024 --h 1024 --l 7.0 --s 24 --n lowres,normal quality, low quality,worst quality,extra digits,fewer digits,wrong hand,bad hands, poorly drawn hands, bad_perspective,error, jpeg artifacts,ugly, duplicate, morbid, mutilated,wrong feet, bad feet,mutated hands,poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs,missing arms, missing legs,fused fingers, long neck, username, watermark,censored,beginner,bad_art,bad arm, bad leg, bad reflection,unfinished, unfinished background,. I have tested prompts like this on various devices running Ubuntu with kohya_ss v22.4 or plain lora-scripts, and the sample results there are correct, as expected.
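To make the structure of a sample prompt line like the one above easier to inspect, here is a minimal sketch that splits it into the positive prompt and its options. It assumes the usual meaning of the flags in the sample prompt format (--w width, --h height, --d seed, --l CFG scale, --s steps, --n negative prompt). The function `parse_sample_prompt` and the output key names are hypothetical and are not kohya_ss's own parser.

```python
import re

# Hypothetical mapping from flag letter to (output key, converter).
# The flag meanings are assumed from the sample prompt format used in this thread.
FLAGS = {
    "w": ("width", int),
    "h": ("height", int),
    "d": ("seed", int),
    "l": ("cfg_scale", float),
    "s": ("steps", int),
    "n": ("negative_prompt", str),
}


def parse_sample_prompt(line: str) -> dict:
    """Split one sample prompt line into the positive prompt and its options."""
    # Prefix a space so a line that starts with "--w ..." (no positive prompt)
    # still splits cleanly; the first chunk is always the positive prompt.
    parts = re.split(r"\s+--", " " + line.strip())
    result = {"prompt": parts[0].strip()}
    for part in parts[1:]:
        flag, _, value = part.partition(" ")
        if flag in FLAGS:
            key, convert = FLAGS[flag]
            result[key] = convert(value.strip())
    return result


if __name__ == "__main__":
    parsed = parse_sample_prompt(
        "1girl, solo --w 1024 --h 1024 --l 7.0 --s 24 --n lowres, bad hands,"
    )
    print(parsed)
```

Running this on the line from the first post returns an empty `prompt`, which makes the missing positive prompt visible at a glance.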

b-fission commented 4 months ago

I did a quick test with your sample prompt and my samples look okay. Trained a lora on sdxl-base-1.0, kohya_ss v24.1.4, Ubuntu-22.04 on WSL2/Windows 10 21H2.

What happens if you shorten your negative prompt, or avoid using it at all?

And what are your WSL version numbers from wsl --version?
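One way to run the negative-prompt test suggested above is to sample the same prompt three ways: with the full negative prompt, with a shortened one, and with none. The sketch below builds those three lines. It assumes --n is the last option on the line, as in the prompts posted in this thread; the helper name `negative_prompt_variants` and the `keep_tags` parameter are hypothetical.

```python
def negative_prompt_variants(line: str, keep_tags: int = 5) -> list[str]:
    """Return [full, shortened-negative, no-negative] versions of a prompt line.

    Assumes "--n" is the last option on the line (an assumption that holds
    for the prompts in this thread, not a rule of the prompt format).
    """
    head, sep, negative = line.partition(" --n ")
    if not sep:
        # No negative prompt present: nothing to vary.
        return [line]
    # Split the negative prompt into tags, dropping empty entries
    # such as the one left by a trailing comma.
    tags = [tag.strip() for tag in negative.split(",") if tag.strip()]
    shortened = ", ".join(tags[:keep_tags])
    return [line, f"{head} --n {shortened}", head]


if __name__ == "__main__":
    example = "1girl, solo --w 1024 --h 1024 --l 7.0 --s 24 --n lowres, bad hands,ugly, blurry,"
    for variant in negative_prompt_variants(example, keep_tags=2):
        print(variant)
```

Putting each variant on its own line of the sample prompt file, with the same seed, should show whether the length of the negative prompt has any effect on the artifacts.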

bmaltais / kohya_ss

Confusing sample results when training Lora #2548