Stability-AI / StableCascade

Official Code for Stable Cascade

The results of LoRA are very bad, where did I go wrong? #66

Open quocanh34 opened 7 months ago

quocanh34 commented 7 months ago

I have trained LoRA several times with faces but it seems the model can't learn anything.

My dataset is as follows:

```
data.tar
|- 0000.jpg
|- 0000.txt ("a photo of a woman [ohwx]")
|- 0001.jpg
|- 0001.txt ("a photo of a woman [ohwx]")
...
```
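
For reference, this is a minimal sketch of how I pack the paired image/caption files into the tar (the folder name `data/ohwx` and the use of Python's `tarfile` are just my own assumptions, not tooling from this repo):

```python
# Minimal sketch: pack paired NNNN.jpg / NNNN.txt files into a webdataset-style tar.
# Assumes a local folder "data/ohwx" that already contains the pairs; adjust paths as needed.
import tarfile
from pathlib import Path

src = Path("data/ohwx")  # hypothetical folder with the jpg/txt pairs
with tarfile.open("data/ohwx.tar", "w") as tar:
    for img in sorted(src.glob("*.jpg")):
        txt = img.with_suffix(".txt")
        # webdataset groups files by their shared basename, so both files must be added
        tar.add(img, arcname=img.name)
        tar.add(txt, arcname=txt.name)
```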

Here is my config:


```yaml
experiment_id: stage_c_1b_lora_ohwx
checkpoint_path: output
output_path: output
model_version: 1B
dtype: bfloat16

# WandB
# wandb_project: StableCascade
# wandb_entity: wandb_username

# TRAINING PARAMS
lr: 1.0e-6
batch_size: 1
image_size: 1024
multi_aspect_ratio: [1/1, 1/2, 1/3, 2/3, 3/4, 1/5, 2/5, 3/5, 4/5, 1/6, 5/6, 9/16]
grad_accum_steps: 1
updates: 2800
backup_every: 20000
save_every: 200
warmup_updates: 1
# use_fsdp: True -> FSDP doesn't work at the moment for LoRA
use_fsdp: False

# GDF
# adaptive_loss_weight: True

# LoRA specific
module_filters: ['.attn']
rank: 256
train_tokens:
  - ['[ohwx]', '^woman</w>']

# ema_start_iters: 5000
# ema_iters: 100
# ema_beta: 0.9

webdataset_path: file:data/ohwx.tar
effnet_checkpoint_path: models/effnet_encoder.safetensors
previewer_checkpoint_path: models/previewer.safetensors
generator_checkpoint_path: models/stage_c_lite_bf16.safetensors
```
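
As a sanity check on the caption format, here is a minimal sketch that iterates the tar referenced by `webdataset_path` and prints the captions (assuming the `webdataset` package is installed; the file path is the one from my config):

```python
# Minimal sanity check: confirm the tar yields (image, caption) pairs
# and that the "[ohwx]" token is present in each caption.
import webdataset as wds

ds = wds.WebDataset("data/ohwx.tar").decode("pil").to_tuple("jpg", "txt")
for i, (image, caption) in enumerate(ds):
    print(image.size, repr(caption))
    if i >= 4:
        break
```
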
wen020 commented 7 months ago

Are you using a single card for training? When I use a single card for training, I get an error about insufficient storage space, even though I still have a lot of storage space left.

quocanh34 commented 7 months ago

@wen020 Yes, I train on a 4090. I think the training requires more than 24 GB, otherwise it causes OOM.
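
A quick way to confirm the failure really is GPU memory rather than disk (a minimal sketch, assuming PyTorch is available):

```python
# Minimal check: report the GPU's total VRAM and what PyTorch currently holds.
import torch

props = torch.cuda.get_device_properties(0)
print(f"GPU: {props.name}, total VRAM: {props.total_memory / 1024**3:.1f} GiB")
print(f"allocated: {torch.cuda.memory_allocated(0) / 1024**3:.1f} GiB")
print(f"reserved:  {torch.cuda.memory_reserved(0) / 1024**3:.1f} GiB")
```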

quocanh34 commented 7 months ago

My config works on a 4090, just switch to a smaller model like the 1B. The only problem I have is the LoRA result, which is bad.

wen020 commented 7 months ago

I used this data (https://huggingface.co/dome272/stable-cascade/blob/main/fernando.tar) to fine-tune the LoRA, and I find the result is OK (the results are below). I immediately started training on my own data. Prompt: cinematic photo of a dog [fernando] wearing a space suit dog

wen020 commented 7 months ago

The results of LoRA are very bad with my custom dataset.

quocanh34 commented 7 months ago

@wen020 yes, especially on faces :(((

dome272 commented 7 months ago

Hey, we never tried out the 1B model with LoRAs. We only used the 3.6B, so I can only give feedback on that. The 1B model is very undertrained.

quocanh34 commented 7 months ago

@dome272 Actually the 3.6B model trained with LoRA is still bad, especially on faces. Have you ever tried to train faces with LoRA?

wen020 commented 7 months ago

@quocanh34 I have also trained the LoRA on a 4090. How long does it take you to train 40000 steps? It takes me 3 hours.

wen020 commented 7 months ago

The result of training a style LoRA is also bad.

wen020 commented 7 months ago

@quocanh34 How do you train a LoRA on the 3.6B model with a 4090 (24 GB)?

quocanh34 commented 7 months ago

@wen020 I can only train the 1B model on a 4090, otherwise it causes OOM.