Open ZekaiGalaxy opened 3 months ago
Hi, I'm trying to reproduce the results of Latte-XL/2 on the FFS dataset, but from what I observe, training on 8 A100s is quite slow compared with max_step = 1M (1,000,000):
I used exactly the config in the repo, and it takes 1.5 days to complete 40k of the 1M steps.
So I wonder:
(1) Is the provided checkpoint exactly the 1M-step checkpoint, or is 1M just a high enough threshold?
(2) What was your training speed? How long did it take you to get good results on FFS?
Thank you!
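For reference, extrapolating the observed throughput (40k steps in 1.5 days) to the full 1M-step schedule can be sketched as follows. This is a back-of-the-envelope estimate assuming constant throughput; real speed varies with batch size, data loading, and hardware.

```python
# Rough ETA estimate from observed training throughput.
# Numbers are the ones reported above (8x A100, repo config);
# this assumes throughput stays constant for the whole run.

def eta_days(steps_done: float, days_elapsed: float, max_steps: float) -> float:
    """Extrapolate remaining wall-clock days at the observed step rate."""
    steps_per_day = steps_done / days_elapsed
    return (max_steps - steps_done) / steps_per_day

# 40k steps in 1.5 days, targeting 1M steps total
print(f"~{eta_days(40_000, 1.5, 1_000_000):.0f} days remaining")
```

At this rate the full 1M-step schedule would take over a month, which is why the question about how many steps are actually needed matters.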
Hi, thanks for your great work! May I ask how many iterations it takes to reproduce the results on UCF101?
Hi, thanks for your interest. After several training resumes, I'm not sure exactly how many steps are needed to achieve good results. Training to around 150k steps should give acceptable results.
Thanks for your reply. Were the resumes due to the loss exploding or vanishing?
No, I didn't experience any issues with loss exploding or vanishing. Training is stable.