aaab8b closed 4 months ago
What is the exact training setting? xl/2 is trained on 4x8 GPUs, while b/2 and s/2 use 8 GPUs. We also list the training settings for b/2 and s/2 in our readme.
It looks like the readme differs from the setting in your paper of N2 decoder blocks.
Thanks for the reminder! The setting used for the results in the paper is the right one. I will update the code.
I would still like to know the specific decoder_layer num for B/2. Or could you provide a public pretrained B/2 model? Thank you so much.
First of all, thank you for this wonderful work! I followed the original xl/2 settings you provided to train mdtv2 b/2 and s/2; however, the FID score computed on ImageNet 50k is much higher than the one reported in the paper (for s/2 I got an FID of 58 using cfg after training for 960k steps). Were any settings changed?