Closed: zhangvia closed this issue 8 months ago
I haven't tried the 8bit optimizer, but I did try mixed precision training and I found it to be less effective compared to full precision training.
Besides, you scale the learning rate after the optimizer was created. How does the scaled learning rate take effect?
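For what it's worth, scaling the learning rate after the optimizer is constructed only works if the new value is written back into the optimizer's `param_groups` (a variable holding the old lr has no effect). A minimal sketch, where the scale factor of 8 and the base lr are made-up illustration values:

```python
import torch

model = torch.nn.Linear(4, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

scale = 8  # hypothetical scale, e.g. num_gpus * gradient_accumulation_steps

# Writing into param_groups is what actually changes the lr the
# optimizer uses on the next step().
for group in optimizer.param_groups:
    group["lr"] = group["lr"] * scale

print(optimizer.param_groups[0]["lr"])
```

If the code scales a local lr variable before passing it to the optimizer constructor, that is equally fine; the only broken pattern is scaling a variable after construction without touching `param_groups`.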
I don't want it to work...
> I haven't tried the 8bit optimizer, but I did try mixed precision training and I found it to be less effective compared to full precision training.
@guoqincode, did you try FP16 or BF16?
I tried the 8-bit Adam optimizer, and I can train stage one on a 40 GB A100. I think it helps reduce VRAM usage, but I don't know whether it hurts model performance. What do you think? Did you try 8-bit Adam?
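In case it helps anyone reproduce this, switching to 8-bit Adam is usually a one-line change via bitsandbytes, which stores the optimizer moment buffers in 8-bit and so cuts optimizer-state VRAM by roughly 4x compared to fp32 Adam states. A sketch, assuming bitsandbytes is installed (the tiny `nn.Linear` and the lr are placeholder values, not the repo's actual model or config):

```python
import torch
import torch.nn as nn

# stand-in module; in the actual training script this would be the stage-one model
model = nn.Linear(320, 320)

try:
    import bitsandbytes as bnb
    # 8-bit AdamW: moment buffers kept in block-wise quantized 8-bit,
    # which is where the optimizer-state VRAM saving comes from
    optimizer = bnb.optim.AdamW8bit(model.parameters(), lr=1e-5)
except ImportError:
    # fallback so the sketch still runs without bitsandbytes
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

print(type(optimizer).__name__, optimizer.param_groups[0]["lr"])
```

The rest of the training loop (`loss.backward()`, `optimizer.step()`, `optimizer.zero_grad()`) is unchanged. Note that in the bitsandbytes authors' experiments 8-bit optimizers matched fp32 optimizer quality on their benchmarks, but it is worth validating on this task specifically.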