KKN18 closed this issue 6 months ago
Oh, there's no need. You can just use the example from the README, along these lines:
accelerate launch train_svd.py \
--pretrained_model_name_or_path=/path/to/weight \
--per_gpu_batch_size=1 --gradient_accumulation_steps=1 \
--max_train_steps=50000 \
--width=512 \
--height=320 \
--checkpointing_steps=1000 --checkpoints_total_limit=1 \
--learning_rate=1e-5 --lr_warmup_steps=0 \
--seed=123 \
--mixed_precision="fp16" \
--validation_steps=200
Oh I see. Thank you!
@pixeli99 Hello, I'm trying to train on multiple GPUs, but the following error shows up:
The command I'm running looks like this:
CUDA_VISIBLE_DEVICES=0,1 accelerate launch ashui_train_svd.py \
--pretrained_model_name_or_path="models/svd" \
--pretrain_unet="models/svd_unet_11channels" \
--gradient_checkpointing \
--per_gpu_batch_size=1 --gradient_accumulation_steps=2 \
--max_train_steps=400000 \
--num_frames=25 \
--width=512 \
--height=896 \
--checkpointing_steps=10000 --checkpoints_total_limit=100 \
--learning_rate=1e-5 --lr_warmup_steps=0 \
--seed=42 \
--mixed_precision="fp16" \
--validation_steps=200000 \
--output_dir="./outputs/svd"
However, when I use only one GPU (CUDA_VISIBLE_DEVICES=0), training runs correctly. Am I missing something?
First off, thank you for providing the SVD training code.
I'm trying to train on multiple GPUs. Are there any changes I need to make to the code or the launch command? In the code, should I uncomment this part, and what else do I need to do?
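For anyone landing here with the same multi-GPU question: the error text isn't shown above, so this is not a confirmed fix, only a sketch of how the process count can be passed to accelerate launch explicitly instead of relying on a saved accelerate config. It is a dry run that derives the GPU count from the device list and prints the command without starting training. The batch size and accumulation values are taken from the two-GPU command above. The effective batch size formula assumes --per_gpu_batch_size is applied per process, which I have not checked against the script.

```shell
#!/bin/sh
# Dry-run sketch: build a multi-GPU `accelerate launch` command and print it.
# Assumption: train_svd.py applies --per_gpu_batch_size per process (not verified).

GPU_IDS="0,1"                              # devices to expose, as in the command above
export CUDA_VISIBLE_DEVICES="$GPU_IDS"

# Count the comma-separated device IDs to get one process per GPU.
NUM_GPUS=$(printf '%s' "$GPU_IDS" | tr ',' '\n' | grep -c .)

PER_GPU_BATCH_SIZE=1
GRAD_ACCUM_STEPS=2

# Samples contributing to each optimizer step under plain data parallelism.
EFFECTIVE_BATCH=$(( PER_GPU_BATCH_SIZE * NUM_GPUS * GRAD_ACCUM_STEPS ))

# --multi_gpu and --num_processes are `accelerate launch` options;
# everything after the script name is passed through to the training script.
LAUNCH_CMD="accelerate launch --multi_gpu --num_processes=$NUM_GPUS train_svd.py \
--per_gpu_batch_size=$PER_GPU_BATCH_SIZE \
--gradient_accumulation_steps=$GRAD_ACCUM_STEPS \
--mixed_precision=fp16"

echo "GPUs: $NUM_GPUS, effective batch size: $EFFECTIVE_BATCH"
echo "$LAUNCH_CMD"
# To actually start training, append the remaining flags (model path, sizes, etc.)
# and run the printed command.
```

Running accelerate config once and answering the multi-GPU prompts should have the same effect as passing the two flags each time. If the error persists with an explicit process count, posting the full traceback would help narrow it down.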