wei-tim / YOWO

You Only Watch Once: A Unified CNN Architecture for Real-Time Spatiotemporal Action Localization

Is it trained on a single TITAN XP? #71

Closed tomatowithpotato closed 3 years ago

tomatowithpotato commented 3 years ago

I am training on a single RTX 3060 12GB with the default settings, but GPU memory is not enough. As far as I know, the TITAN XP also has 12GB.

So I am curious: what parameters were actually used for training?

tomatowithpotato commented 3 years ago

I noticed that the batch size in the new version has been lowered to 10, down from 12 in the old version, so the problem no longer occurs.

okankop commented 3 years ago

You can use the largest batch size that fits on your GPU. The new version of the code uses 'gradient accumulation': it accumulates gradients over several mini-batches until an effective batch size of 128 is reached, and only then updates the weights. This is especially critical for training on the AVA dataset, which has more classes.
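
To illustrate the idea in the comment above, here is a minimal pure-Python sketch of gradient accumulation. It is not the YOWO training code: the toy model `y = w * x`, the helper names (`accumulation_steps`, `grad_mse`, `train_step_accumulated`), and the learning rate are all assumptions made for illustration. It shows that summing suitably weighted mini-batch gradients gives the same update as one large batch, so only one mini-batch needs to be in memory at a time.

```python
import math


def accumulation_steps(target_batch, micro_batch):
    """Mini-batches needed to cover at least target_batch samples (hypothetical helper)."""
    return math.ceil(target_batch / micro_batch)


def grad_mse(w, xs, ys):
    """Gradient of the mean squared error of the toy model y = w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def train_step_accumulated(w, xs, ys, micro_batch, lr):
    """One weight update assembled from several mini-batch gradients."""
    n = len(xs)
    accumulated = 0.0
    for start in range(0, n, micro_batch):
        bx = xs[start:start + micro_batch]
        by = ys[start:start + micro_batch]
        # Weight each mini-batch by its share of the samples so the
        # accumulated value equals the mean gradient over all n samples.
        accumulated += grad_mse(w, bx, by) * len(bx) / n
    # A single update after all mini-batches have been processed.
    return w - lr * accumulated


# Toy data: 128 samples from y = 3 * x, processed 10 at a time.
xs = [0.1 * i for i in range(128)]
ys = [3.0 * x for x in xs]

w_full = 0.0 - 0.001 * grad_mse(0.0, xs, ys)
w_accumulated = train_step_accumulated(0.0, xs, ys, micro_batch=10, lr=0.001)
steps = accumulation_steps(128, 10)  # 12 full mini-batches plus one of 8 samples
```

In PyTorch the same pattern is usually written by calling `loss.backward()` on each mini-batch (gradients add up in `.grad`) and calling `optimizer.step()` followed by `optimizer.zero_grad()` only once every few mini-batches.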

tomatowithpotato commented 3 years ago

Thanks for your reply! Gradient accumulation is a good way to reduce memory usage. I also plan to try 'float16' training; I hope it works.