mihirp1998 / AlignProp

AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more sample and compute efficient than reinforcement learning methods (PPO) for finetuning Stable Diffusion.
https://align-prop.github.io/
MIT License
242 stars, 8 forks

How long does it take to train the HPS setting? #13

Closed (fiona-lxd closed 3 months ago)

fiona-lxd commented 11 months ago

I found that training with this code is quick for the aesthetic setting (about 3 hours on one A100), but it seems that HPS would take several days. I started training with the command `CUDA_VISIBLE_DEVICES=0 python main.py --config config/align_prop.py:hps`. It took half a day for 5 epochs, while the total is 200 epochs. I know this is caused by the dataset sizes (there are 752 HPS prompts versus 45 for aesthetic). Are there any suggestions for speeding it up?
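For reference, the numbers above can be turned into a rough projection. This is only a linear extrapolation under the assumption that every epoch takes about the same time; the function name `estimate_total_days` is made up for illustration and is not part of the repository.

```python
def estimate_total_days(days_elapsed, epochs_done, total_epochs):
    """Linearly extrapolate total training time from the epochs finished so far.

    Assumes every epoch takes roughly the same wall-clock time.
    """
    return days_elapsed * total_epochs / epochs_done


# Half a day for 5 epochs, 200 epochs in total (figures from the comment above).
print(estimate_total_days(0.5, 5, 200))
```

Under that assumption, a full 200-epoch HPS run on a single GPU would take on the order of 20 days.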

mihirp1998 commented 9 months ago

Hi,

Sorry for the late reply. I think training for all 200 epochs might not be needed. I had access to 4 GPUs and trained for 200 epochs, which I believe took about 24-48 hours.

Increasing the learning rate, reducing the batch size (so that fewer gradients are accumulated per update), and narrowing the range of truncated_backprop_minmax from (0,50) to (49,50) would be clear ways of reducing the compute time.
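A rough sketch of why narrowing the truncation range can help: it assumes a 50-step sampler, that a cutoff step is drawn uniformly from the integers in `truncated_backprop_minmax` (inclusive), and that every denoising step before the cutoff is detached from the graph. These are assumptions about how the setting behaves, not a copy of the repository's code, and the helper `expected_grad_steps` is hypothetical.

```python
def expected_grad_steps(lo, hi, num_steps=50):
    """Average number of denoising steps that keep gradients.

    Assumption: a cutoff is drawn uniformly from the integers lo..hi
    (inclusive) and every step with index < cutoff is detached, so
    only the remaining num_steps - cutoff steps are backpropagated.
    """
    kept = [max(num_steps - cutoff, 0) for cutoff in range(lo, hi + 1)]
    return sum(kept) / len(kept)


for minmax in [(0, 50), (49, 50)]:
    print(minmax, expected_grad_steps(*minmax))
```

Under these assumptions, the (0,50) range backpropagates through about 25 steps per sample on average, while (49,50) backpropagates through at most one, which is where the saving in backward-pass compute and memory would come from. The forward sampling cost is unchanged.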

mihirp1998 commented 9 months ago

I have also open-sourced the checkpoints for HPS and Aesthetic, in case they are helpful.