magic-research / piecewise-rectified-flow

PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator (NeurIPS 2024)
BSD 3-Clause "New" or "Revised" License

Hyperparameters for training SD1.5 #16

Open · AndysonYs opened this issue 3 weeks ago

AndysonYs commented 3 weeks ago

Hi. Thanks for your insightful work. Could you share the hyperparameters for training SD1.5 on the LAION dataset? For reference, this is the command I have been running:

perflow_accelerate_sd.py \
  --data_root "???" \
  --resolution 512 --dataloader_num_workers 8 --train_batch_size 32 --gradient_accumulation_steps 1 \
  --pretrained_model_name_or_path "../assets/public_models/DreamBooth/sd15_eps/DreamShaper_8_pruned" \
  --unet_model_path "" \
  --pred_type "diff_eps" --loss_type "noise_matching" \
  --windows 4 --solving_steps 8 --support_cfg --cfg_sync \
  --learning_rate 8e-5 --lr_scheduler "constant" --lr_warmup_steps 500 --use_ema \
  --mixed_precision "fp16" \
  --output_dir "../exps/sd15ds_perflow_4ddim8_diffeps_cfgsync" \
  --validation_steps 100 --inference_steps "8-4" --inference_cfg "7.5-4.5" --save_ckpt_state --checkpointing_steps 1000
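
A note on the dash-separated validation flags: --inference_steps "8-4" together with --inference_cfg "7.5-4.5" presumably encodes two validation runs, (8 steps, cfg 7.5) and (4 steps, cfg 4.5). The sketch below only illustrates that apparent convention; the real parsing lives in perflow_accelerate_sd.py, and parse_dash_list is a hypothetical helper, not part of the repo.

```python
# Hedged sketch of the apparent convention behind --inference_steps / --inference_cfg.
# parse_dash_list is a hypothetical helper; the actual parsing happens inside
# perflow_accelerate_sd.py and may differ.

def parse_dash_list(value: str) -> list[float]:
    """Split a dash-separated flag value such as "8-4" into a list of numbers."""
    return [float(v) for v in value.split("-")]

inference_steps = parse_dash_list("8-4")    # -> [8.0, 4.0]
inference_cfg = parse_dash_list("7.5-4.5")  # -> [7.5, 4.5]

# If only one cfg value is given (as in the second command below), it is
# presumably reused for every step count.
if len(inference_cfg) == 1:
    inference_cfg = inference_cfg * len(inference_steps)

for steps, cfg in zip(inference_steps, inference_cfg):
    print(f"validation run: {int(steps)} steps, cfg {cfg}")
```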

AndysonYs commented 3 weeks ago

I've tried this and got nearly the same result as #17: the model only produces noise after 4k iterations. Could you help me with it?

I used a 1M subset of laion/laion2B-en-aesthetic and stable-diffusion-v1-5/stable-diffusion-v1-5. I ran it on 32 GPUs with a total batch size of 1024. Here is my script:

accelerate launch \
  --main_process_port 16323 \
  --num_processes 32 \
  --num_cpu_threads_per_process 6 \
  ./scripts/perflow_accelerate_sd.py \
  --data_root "???" \
  --resolution 512 --dataloader_num_workers 8 --train_batch_size 32 --gradient_accumulation_steps 1 \
  --pretrained_model_name_or_path "stable-diffusion-v1-5/stable-diffusion-v1-5" \
  --unet_model_path "" \
  --pred_type "diff_eps" --loss_type "noise_matching" \
  --windows 4 --solving_steps 8 --support_cfg --cfg_sync \
  --learning_rate 1e-5 --lr_scheduler "constant" --lr_warmup_steps 500 --use_ema \
  --mixed_precision "fp16" \
  --output_dir "../_expsys/sd15_laion_train" \
  --validation_steps 1000 --inference_steps "4-8" --inference_cfg "7.5" --save_ckpt_state --checkpointing_steps 2000 \
  --max_train_steps 100000
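
One way to narrow down whether the noise comes from the trained weights themselves or from the in-training validation path is to load a saved checkpoint into a plain diffusers pipeline and sample offline. The sketch below is only a rough outline under several assumptions: that the checkpoint directory written by --save_ckpt_state contains a diffusers-format "unet" subfolder, that the repo's PeRFlowScheduler (src/scheduler_perflow.py) accepts the prediction_type and num_time_windows arguments shown, and that the checkpoint path and prompt are hypothetical; adjust all of these to the actual layout.

```python
# Rough offline sanity check for a PeRFlow-accelerated SD1.5 checkpoint.
# Assumptions (not confirmed in this thread): the checkpoint dir holds a
# diffusers-format "unet" subfolder, and PeRFlowScheduler accepts the kwargs below.
import torch
from diffusers import StableDiffusionPipeline, UNet2DConditionModel
from src.scheduler_perflow import PeRFlowScheduler  # import path assumed from the repo layout

ckpt_dir = "../_expsys/sd15_laion_train/checkpoint-4000"  # hypothetical checkpoint path

# Load the fine-tuned UNet and drop it into the original SD1.5 pipeline.
unet = UNet2DConditionModel.from_pretrained(ckpt_dir, subfolder="unet", torch_dtype=torch.float16)
pipe = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    unet=unet,
    torch_dtype=torch.float16,
).to("cuda")

# Swap in the piecewise-rectified-flow scheduler; num_time_windows should match --windows,
# and prediction_type is assumed to mirror --pred_type ("diff_eps").
pipe.scheduler = PeRFlowScheduler.from_config(
    pipe.scheduler.config, prediction_type="diff_eps", num_time_windows=4
)

image = pipe(
    "a photo of a corgi on the beach",
    num_inference_steps=8,
    guidance_scale=7.5,
).images[0]
image.save("perflow_check_8steps.png")
```

If images from this offline check look reasonable while the in-training validation grids are pure noise, the problem is more likely in the validation/inference settings than in the training hyperparameters themselves.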