chaoyuaw / pytorch-coviar

Compressed Video Action Recognition
https://www.cs.utexas.edu/~cywu/projects/coviar/
GNU Lesser General Public License v2.1

cuda out of memory #29

Open JGyoung33 opened 6 years ago

JGyoung33 commented 6 years ago

Hi, when I set the batch size to 40 for hmdb51 or 80 for ucf101 with the iframe model, training stops with "cuda: out of memory". I have to reduce the batch size to 20 or 30, but then training is very slow: ucf101 needs 5 or more days. I am using 4 Titan Xp GPUs to train the model. Have you seen this bug, or is there a configuration I could change?

Thank you!
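For readers weighing the batch sizes mentioned above, it can help to see how many samples each GPU would receive. The sketch below is not taken from this repository. It assumes the batch is split along its first dimension across GPUs the way `torch.nn.DataParallel` does (chunks of size ceil(batch / gpus)), and `per_gpu_batch` is a hypothetical helper name.

```python
import math


def per_gpu_batch(batch_size, num_gpus):
    """Return the number of samples each GPU receives.

    Assumption: the batch is chunked like torch.chunk, i.e. every chunk
    has ceil(batch_size / num_gpus) samples except possibly the last one.
    """
    chunk = math.ceil(batch_size / num_gpus)
    sizes = []
    remaining = batch_size
    while remaining > 0:
        take = min(chunk, remaining)  # last chunk may be smaller
        sizes.append(take)
        remaining -= take
    return sizes
```

Under that assumption, a batch of 80 on 4 GPUs puts 20 samples on each device, so the memory that matters is what 20 samples need on a single 12 GB card, not what 80 samples need overall.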

chaoyuaw commented 6 years ago

Hi, I also used 4 TITAN Xp GPUs, which have 12 GB of memory each, to train ucf101, and I didn't see the issue. Are you sharing the GPUs with other programs? Do you have an estimate of how much you're exceeding the memory limit by? Thanks!
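One possible way to answer both questions (whether other programs hold GPU memory, and how much headroom is left) is to read per-GPU usage from `nvidia-smi` before starting training. This is a minimal sketch, not part of the repository: the helper names and the 500 MiB threshold are illustrative assumptions, and it requires `nvidia-smi` to be on the PATH.

```python
import subprocess

# nvidia-smi query that prints one "index, used, total" line per GPU, in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse 'index, used, total' lines (values in MiB) into a list of dicts."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, used, total = (int(field.strip()) for field in line.split(","))
        rows.append({
            "index": index,
            "used_mib": used,
            "total_mib": total,
            "free_mib": total - used,
        })
    return rows


def query_gpu_memory():
    """Run nvidia-smi and return the parsed per-GPU memory usage."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_memory(out)


def busy_gpus(rows, threshold_mib=500):
    """Return indices of GPUs already using more than threshold_mib (assumed cutoff)."""
    return [row["index"] for row in rows if row["used_mib"] > threshold_mib]
```

Calling `busy_gpus(query_gpu_memory())` before launching training lists the GPUs that already have memory allocated; a non-empty result suggests the cards are shared with another process.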

JGyoung33 commented 6 years ago

Thank you! I have figured it out. It was a problem with CUDA memory.