Closed ZekaiGalaxy closed 1 month ago
Hi :)! First, thank you for your excellent work!
I am trying to reproduce the results of Latte, and I wonder what the total batch size is for each dataset (local_batch_size * num_gpus). Can you share more information on the experiment setups?
I tried the 1e-r lr with a total batch size of 32 on the small version, Latte-S, but couldn't generate good results. So I wonder: are the batch size and model size highly relevant to the final results? Thank you!
Hi, thanks for your interest. I think the learning rate you're using is a little too high. You can refer to Figure 6 of the paper.
Oh, I am sorry for the typo. I use 1e-4 as written in the config.
My local_batch_size is set to 5. You can see the performance of Latte-S in Figure 6.
I want to confirm whether all the unconditional models are trained on 1 GPU with local_batch_size=5 and max_train_steps=1000000. I am trying to reproduce the paper results and am not sure how many GPUs to use.
Hi, I trained all the unconditional models on 8 GPUs with local_batch_size=5.
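For anyone else reproducing this: under a standard data-parallel setup, the effective (global) batch size is just the per-GPU batch size times the number of GPUs. A minimal sketch, assuming the values from this thread (local_batch_size=5, 8 GPUs):

```python
def effective_batch_size(local_batch_size: int, num_gpus: int) -> int:
    """Global batch size per optimizer step under data parallelism:
    each of the num_gpus workers processes local_batch_size samples."""
    return local_batch_size * num_gpus

# Setup described above (assumed values from this thread):
print(effective_batch_size(5, 8))  # 40 samples per step
```

So a single-GPU run with local_batch_size=5 sees an 8x smaller global batch than the paper's setup, which can explain degraded results unless the learning rate or step count is adjusted.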
Hi There! 👋
This issue has been marked as stale due to inactivity for 14 days.
We would like to inquire if you still have the same problem or if it has been resolved.
If you need further assistance, please feel free to respond to this comment within the next 7 days. Otherwise, the issue will be automatically closed.
We appreciate your understanding and would like to express our gratitude for your contribution to Latte. Thank you for your support. 🙏