Closed srikar2097 closed 6 years ago
Hi @srikar2097, thanks for your questions. Sorry, but I'm not sure what caused this. The videos used in the demo should take < 10 GB of GPU memory, and I was able to run on only one GPU. Any chance some other programs were using GPU memory at the same time?
@chaoyuaw Nope. I suspected this and stopped all other programs using the GPU. Also, note I have not changed anything in your code: same data (downloaded from the drive link). Are you able to replicate the memory issue?
@chaoyuaw Okay, I did one more experiment. This time I downgraded pytorch from 0.4.1.post2 (current stable) to 0.3.0.post4 (the version you suggested) and it runs! See below for stats. It's almost at the brink of a memory error, but it runs (11061 MB consumed out of 11441 MB).
I suspect there are internal changes in pytorch that make your operations take a lot more memory than in the previous version.
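Since the fix here is pinning PyTorch to the version the repo was tested against, a small guard at startup can fail fast instead of producing a confusing OOM later. This is a minimal pure-Python sketch; `parse_version`, `check_pytorch_version`, and the `KNOWN_GOOD` constant are illustrative names, not part of the project.

```python
# Minimal sketch: flag a PyTorch install that differs from the version
# this thread found to work (0.3.0.post4). In a real script you would
# compare against torch.__version__.

KNOWN_GOOD = "0.3.0.post4"  # version reported working in this thread

def parse_version(v):
    """Split a version like '0.4.1.post2' into a comparable tuple (0, 4, 1)."""
    core = v.split(".post")[0]
    return tuple(int(x) for x in core.split("."))

def check_pytorch_version(installed, expected=KNOWN_GOOD):
    """Return True if the installed version matches the known-good one."""
    return parse_version(installed) == parse_version(expected)

print(check_pytorch_version("0.3.0.post4"))  # True
print(check_pytorch_version("0.4.1.post2"))  # False
```

In practice the check would read `torch.__version__` and print a warning (or raise) before training starts.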
System stats: [0] Tesla K80 | 69'C, 100 %, 159 / 149 W | 11061 / 11441 MB | ec2-user:python/114497(11048M)
Another interesting thing: is your code set up to run multi-GPU? Specifying more than one GPU on the command line has no effect; it always runs on only one GPU.
Hi @srikar2097, Glad that you found the issue quickly. Yes, I think the current code doesn't handle multi-GPU correctly. Would you mind considering contributing a PR if you fixed the bug? Thanks :)
@chaoyuaw I haven't implemented multi-GPU yet :) Also, there was no bug; the fix was just downgrading to your suggested pytorch version.
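For anyone picking this up: the usual PyTorch route is wrapping the model in `torch.nn.DataParallel`, which splits each input batch into per-GPU chunks. The chunking arithmetic itself is simple; here is a pure-Python sketch of it (the `chunk_batch` helper is hypothetical, not from this repo or from torch):

```python
import math

def chunk_batch(batch_size, num_gpus):
    """Split a batch into per-GPU chunk sizes, roughly the way
    torch.chunk does: ceil-sized chunks first, remainder last."""
    chunk = math.ceil(batch_size / num_gpus)
    sizes = []
    remaining = batch_size
    while remaining > 0:
        take = min(chunk, remaining)
        sizes.append(take)
        remaining -= take
    return sizes

print(chunk_batch(8, 3))  # [3, 3, 2]
print(chunk_batch(8, 1))  # [8]
```

One practical consequence: if the batch size isn't divisible by the GPU count, the last device gets a smaller chunk, so per-GPU memory use is uneven.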
But in order to replicate your reported results, where do you suggest I begin? Which set of 75K Kinetics videos was used, etc.?
The video ids I used for train/val/test are available at https://drive.google.com/drive/folders/1MOLuoGDE6lZnmXLJHTUNkJtfY1l2sZSC?usp=sharing
Please send me an email (cywu@cs.utexas.edu) if you need pre-trained models. Thanks!
@chaoyuaw Any chance you have the code implementing the GOP structure, and also the code for generating the motion estimates? Thanks!
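For context on the GOP question: in a compressed video, frames are grouped into GOPs (an I-frame followed by P-frames whose motion vectors and residuals are stored relative to it). Assuming a fixed GOP size, addressing a frame means converting a global frame index into a (GOP index, offset) pair. A minimal sketch under that assumption; the actual loader in this repo may differ, and `locate_frame` is a hypothetical name:

```python
def locate_frame(frame_idx, gop_size=12):
    """Map a global frame index to (gop_index, position_within_gop),
    assuming fixed-size GOPs. Position 0 is the I-frame."""
    return frame_idx // gop_size, frame_idx % gop_size

print(locate_frame(0))   # (0, 0) -> I-frame of the first GOP
print(locate_frame(25))  # (2, 1) -> first P-frame of the third GOP
```

Real encoders can emit variable-length GOPs (e.g. scene cuts force an early I-frame), in which case the mapping has to be read from the bitstream rather than computed.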
@chaoyuaw Thank you for sharing your code, but with the settings in `train.sh`, for all hierarchies (0, 1, and 2), the code runs into `CUDA error: out of memory`. A little analysis revealed that the memory required was around 25-30 GB, while my GPUs have only 12 GB. How did you get these hierarchy settings to run on your GPUs?
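One common workaround when a config needs more memory than the card has (here, ~25-30 GB vs. 12 GB) is gradient accumulation: run smaller micro-batches and step the optimizer only every N batches, keeping the effective batch size close to the original. A sketch of the arithmetic; the target batch size of 40 is illustrative, not taken from `train.sh`, and `accumulation_steps` is a hypothetical helper:

```python
import math

def accumulation_steps(target_batch, micro_batch):
    """How many micro-batches to accumulate before each optimizer step
    so the effective batch size is at least target_batch."""
    return math.ceil(target_batch / micro_batch)

steps = accumulation_steps(40, 16)  # e.g. 16 is the largest size that fits
print(steps)        # 3
print(steps * 16)   # 48 = effective batch size per optimizer step
```

Note the effective batch can overshoot the target when it isn't a multiple of the micro-batch; results may differ slightly from the paper's settings, and batch-norm statistics are still computed per micro-batch.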