gabeur / mmt

Multi-Modal Transformer for Video Retrieval
http://thoth.inrialpes.fr/research/MMT/
Apache License 2.0

How to speed up the training process? #17

Closed. sqiangcao99 closed this issue 3 years ago.

sqiangcao99 commented 3 years ago

Hi. Thank you for generously sharing your work. When I train the model on MSRVTT with one V100, GPU-Util does not reach 100% (it stays at about 60%). Do you have any tips? Thank you.

gabeur commented 3 years ago

To speed up training, you could increase the batch size until it fills the GPU memory. However, I found that a larger batch size caused more overfitting, so I only recommend it when training on a large dataset (like HowTo100M). It is probably possible to mitigate the overfitting with more regularisation or a different learning rate decay, but I did not experiment with that, sorry.
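
For readers who want to try the untested idea above, here is a minimal, framework-free sketch of the knobs involved: batch size, weight decay (one form of regularisation) and a per-epoch exponential learning rate decay. Everything in it is an assumption made for illustration: the `TrainConfig` fields, the helper names and all numeric values are hypothetical and are not taken from the MMT code or its configuration files. In PyTorch, the corresponding settings would typically be the `weight_decay` argument of the optimizer and `torch.optim.lr_scheduler.ExponentialLR`.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical hyperparameter bundle (field names are illustrative, not MMT's)."""
    batch_size: int
    base_lr: float
    lr_decay: float      # multiplicative factor applied to the learning rate each epoch
    weight_decay: float  # L2-style regularisation strength


def lr_schedule(cfg: TrainConfig, num_epochs: int) -> List[float]:
    """Learning rate for each epoch under exponential decay: base_lr * lr_decay**epoch."""
    return [cfg.base_lr * cfg.lr_decay ** epoch for epoch in range(num_epochs)]


def larger_batch_variant(cfg: TrainConfig, batch_size: int,
                         lr_decay: float, weight_decay: float) -> TrainConfig:
    """Return a copy of cfg with a larger batch and the candidate mitigation settings.

    The mitigation values are supplied by the caller; nothing here claims
    which values actually reduce overfitting.
    """
    if batch_size <= cfg.batch_size:
        raise ValueError("expected a batch size larger than the baseline")
    return replace(cfg, batch_size=batch_size,
                   lr_decay=lr_decay, weight_decay=weight_decay)


if __name__ == "__main__":
    # Placeholder values, chosen only to make the example concrete.
    baseline = TrainConfig(batch_size=32, base_lr=5e-5, lr_decay=0.95, weight_decay=0.0)
    candidate = larger_batch_variant(baseline, batch_size=128,
                                     lr_decay=0.90, weight_decay=1e-4)
    for name, cfg in (("baseline", baseline), ("candidate", candidate)):
        rates = ", ".join(f"{lr:.2e}" for lr in lr_schedule(cfg, 5))
        print(f"{name}: batch={cfg.batch_size} wd={cfg.weight_decay} lr per epoch: {rates}")
```

Whether any such combination actually offsets the overfitting seen with larger batches would have to be checked on validation metrics, since it was not tried here.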

sqiangcao99 commented 3 years ago

Thank you for your quick response.