Stability-AI / StableCascade

Official Code for Stable Cascade
MIT License

LoRA inference gets size mismatch #83

Closed: MyLifeisGettingBetter closed this issue 4 months ago

MyLifeisGettingBetter commented 4 months ago

I trained a LoRA on a single RTX 4090 (24 GB) with the config below:

```yaml
experiment_id: lora_liang_V1
checkpoint_path: /data/py_project/StableCascade/models
output_path: /data/py_project/StableCascade/models
model_version: 1B

# WandB
wandb_project: StableCascade
wandb_entity: wandb_username

# TRAINING PARAMS
lr: 1.0e-4
batch_size: 4
image_size: 768
multi_aspect_ratio: [1/1, 1/2, 1/3, 2/3, 3/4, 1/5, 2/5, 3/5, 4/5, 1/6, 5/6, 9/16]
grad_accum_steps: 4
updates: 10000
backup_every: 1000
save_every: 100
warmup_updates: 1
# use_fsdp: True -> FSDP doesn't work at the moment for LoRA
use_fsdp: False

# GDF
adaptive_loss_weight: True

# LoRA specific
module_filters: ['.attn']
rank: 4
train_tokens:
  - ['^snail', null] # token starts with "snail" -> "snail" & "snails", don't need to be reinitialized

ema_start_iters: 5000
ema_iters: 100
ema_beta: 0.9

webdataset_path: file:/data/py_project/StableCascade/liang.tar

effnet_checkpoint_path: models/effnet_encoder.safetensors
previewer_checkpoint_path: models/previewer.safetensors
generator_checkpoint_path: models/stage_c_lite_bf16.safetensors
```
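Note that `model_version` and `generator_checkpoint_path` have to agree: the lite 1B Stage C model pairs with `stage_c_lite_bf16.safetensors`, while the full 3.6B model pairs with `stage_c_bf16.safetensors`. A minimal sketch of the two pairings (exact paths depend on where you downloaded the checkpoints; treat them as assumptions):

```yaml
# 1B (lite) Stage C, as used in the training config above
model_version: 1B
generator_checkpoint_path: models/stage_c_lite_bf16.safetensors

# 3.6B (full) Stage C, for comparison
# model_version: 3.6B
# generator_checkpoint_path: models/stage_c_bf16.safetensors
```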

I got the LoRA output directory below:

[screenshot: contents of the LoRA output directory]

I've changed model_version to 1B in configs/inference/lora_c_3b.yaml, but I get an error like this: [screenshot: size-mismatch error]
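A size mismatch when loading usually means the base weights or the LoRA definition at inference don't match what was trained. Since this LoRA was trained against the 1B model with rank 4 on `.attn` modules, the inference config needs the same base checkpoint and the same LoRA settings, not just `model_version: 1B`. Below is a sketch of what the relevant fields in configs/inference/lora_c_3b.yaml would plausibly need to look like; the field names mirror the training config above, and the `lora_checkpoint_path` value is hypothetical, since the issue doesn't show the final working file:

```yaml
model_version: 1B
generator_checkpoint_path: models/stage_c_lite_bf16.safetensors  # lite checkpoint, not models/stage_c_bf16.safetensors

# LoRA definition must match training, or the adapter tensors won't fit
module_filters: ['.attn']
rank: 4
train_tokens:
  - ['^snail', null]

effnet_checkpoint_path: models/effnet_encoder.safetensors
previewer_checkpoint_path: models/previewer.safetensors
lora_checkpoint_path: models/lora_liang_V1/lora_latest.safetensors  # hypothetical path to the trained LoRA file
```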

MyLifeisGettingBetter commented 4 months ago

problem solved!!!