SD3 cannot be finetuned into a better model (hand and face deformation)?

KaiWU5 commented 3 days ago

Describe the bug

I want to finetune SD3 to improve its human generation quality using a high-quality human dataset of 3 million samples (which has proven useful on SDXL and other models). But hand and face deformation has not improved much after two days of training.

I am using the training script.

What I have tried so far:

Regular training on the 3 million samples with batch size 2 x 24 (V100) for 2 epochs, with lr 5e-6 and the AdamW optimizer
Training with the Prodigy optimizer, same settings
Adding q, k RMS norm to each attention layer
Training only several blocks
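
For readers unfamiliar with the last two items in the list above, here is a minimal PyTorch sketch of what q, k RMS normalization and block-wise freezing can look like. It is not the code used in the runs described here and it does not touch the SD3 transformer itself; the names `RMSNorm`, `QKNormAttention`, and `freeze_all_but` are hypothetical, and the toy dimensions are chosen only so the example runs quickly.

```python
import torch
import torch.nn as nn


class RMSNorm(nn.Module):
    """Root-mean-square normalization over the last dimension (hypothetical helper)."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        inv_rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * inv_rms * self.weight


class QKNormAttention(nn.Module):
    """Toy self-attention with RMSNorm applied per head to q and k (illustrative only)."""

    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.to_qkv = nn.Linear(dim, dim * 3)
        self.norm_q = RMSNorm(self.head_dim)
        self.norm_k = RMSNorm(self.head_dim)
        self.to_out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, _ = x.shape
        qkv = self.to_qkv(x).view(b, n, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.unbind(dim=2)          # each: (b, n, heads, head_dim)
        q = self.norm_q(q).transpose(1, 2)   # normalize q before the dot product
        k = self.norm_k(k).transpose(1, 2)   # normalize k before the dot product
        v = v.transpose(1, 2)
        scores = (q @ k.transpose(-2, -1)) * self.head_dim ** -0.5
        out = scores.softmax(dim=-1) @ v     # (b, heads, n, head_dim)
        out = out.transpose(1, 2).reshape(b, n, -1)
        return self.to_out(out)


def freeze_all_but(blocks: nn.ModuleList, trainable_indices: set) -> None:
    """Leave only the listed blocks trainable; freeze every other block."""
    for i, block in enumerate(blocks):
        for p in block.parameters():
            p.requires_grad_(i in trainable_indices)


# Toy stack of 4 blocks; train only the last two (an arbitrary choice for the sketch).
blocks = nn.ModuleList([QKNormAttention(dim=64, num_heads=4) for _ in range(4)])
freeze_all_but(blocks, {2, 3})

x = torch.randn(2, 16, 64)
for block in blocks:
    x = x + block(x)  # residual connection around each block
```

When only some blocks are trainable, it is usually worth passing just the parameters with `requires_grad=True` to the optimizer, so that frozen weights carry no optimizer state.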

All of my training runs give nearly the same deformed results; the hands never look like normal human hands.

Could you provide more experiments on SD3 training? There seems to be no easy way to adapt SD3 for human generation.

Reproduction

As described in the bug section above.

Logs

No response

System Info

24 V100 GPUs, batch size 2 per card, 3 million human data samples with aesthetic score > 4.5
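
As a rough check on the scale of this setup, the arithmetic below works out the global batch size and the number of optimizer steps for 2 epochs. It assumes no gradient accumulation and that every sample is seen once per epoch; neither is stated in the issue, so treat the step counts as an estimate.

```python
# Values taken from the issue text.
num_gpus = 24
per_gpu_batch = 2
dataset_size = 3_000_000
epochs = 2

# Assumption: no gradient accumulation, so one optimizer step per global batch.
global_batch = num_gpus * per_gpu_batch          # 48 samples per step
steps_per_epoch = dataset_size // global_batch   # 62,500 steps
total_steps = steps_per_epoch * epochs           # 125,000 steps

print(global_batch, steps_per_epoch, total_steps)
```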

Who can help?

No response

DN6 commented 3 days ago

Hi @KaiWU5, I think this question would be better asked in the Discussions section.

mliand commented 3 days ago

Can you show me your training loss?

heart-du commented 2 days ago

I have the same question.

huggingface / diffusers