Closed · alexlyzhov closed this issue 7 years ago
Hi nikkou,
the straightforward way to use less memory is to feed in smaller images, but that is probably not what you want. It's possible to reduce the raw memory requirements of the larger networks, but not very easy.
You could run the network layer by layer instead of the full network at once. This requires some familiarity with Caffe and some scripting (set up a tiny one-layer network, read that layer's weights, feed in the input, save the output to disk, then continue with the next layer... rinse and repeat). It will also be slower.
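The loop above can be sketched schematically. This is not the actual Caffe code: the "layers" below are numpy stand-ins, since in practice each step would build a one-layer `caffe.Net` from a small prototxt, copy that layer's weights in, and call `net.forward()`. The point is the memory pattern: only one layer's activations are live at a time, with the intermediate blob spilled to disk between steps.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-ins for Caffe layers: each "layer" is just a function
# mapping an input blob to an output blob. With real Caffe, each entry would
# wrap a one-layer net (load prototxt, copy weights, forward).
layers = [
    lambda x: np.maximum(x, 0.0),   # e.g. a ReLU layer
    lambda x: x * 2.0,              # e.g. a scale layer
    lambda x: x.sum(axis=-1),       # e.g. a reduction layer
]

def run_layer_by_layer(x, layers, workdir):
    """Run one layer at a time, spilling the intermediate blob to disk so
    only a single layer's activations occupy (GPU) memory at once."""
    path = os.path.join(workdir, "blob.npy")
    np.save(path, x)
    for layer in layers:
        x = np.load(path)   # reload the previous layer's output from disk
        y = layer(x)        # forward pass through this one layer only
        np.save(path, y)    # persist it; everything else can be freed
    return np.load(path)

with tempfile.TemporaryDirectory() as d:
    out = run_layer_by_layer(np.array([[-1.0, 2.0, 3.0]]), layers, d)
    print(out)  # [10.]
```

Saving each intermediate blob and tearing down the previous one-layer net before building the next is what keeps peak memory bounded by the largest single layer rather than the whole network.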
You could recompile Caffe to use a lower-precision float representation (e.g. FP16). Accuracy of the results will suffer, but possibly not by much. Note that this is certainly more work than the first approach. It might not save enough memory for the full FlowNet2.
Best, Nikolaus
I am still able to run FlowNet2-CS, but FlowNet2-CSS and FlowNet2 fail with "Check failed: error == cudaSuccess (2 vs. 0) out of memory". When I query free memory with cudaMemGetInfo(), I see 950 MB free before running run-flownet.py, and the FlowNet2 weights occupy only 650 MB.
Might it still be possible to fit the model into memory with some tricks?
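For reference, the same cudaMemGetInfo() check can be done from Python via ctypes, which makes it easy to log free memory right before launching the net. This is a minimal sketch: the CUDA runtime library name (`libcudart.so`) is platform-dependent and assumed here, and the function returns None when no CUDA runtime/device is available.

```python
import ctypes

def gpu_mem_info():
    """Query free/total GPU memory via the CUDA runtime's cudaMemGetInfo.
    Returns (free_bytes, total_bytes), or None if no CUDA runtime is found
    or the call fails (e.g. no device present)."""
    try:
        # Library name is an assumption; adjust for your platform/CUDA version.
        cudart = ctypes.CDLL("libcudart.so")
    except OSError:
        return None
    free, total = ctypes.c_size_t(), ctypes.c_size_t()
    # cudaError_t cudaMemGetInfo(size_t* free, size_t* total); 0 == cudaSuccess
    if cudart.cudaMemGetInfo(ctypes.byref(free), ctypes.byref(total)) != 0:
        return None
    return free.value, total.value

info = gpu_mem_info()
if info is not None:
    print("free: %d MB, total: %d MB" % (info[0] >> 20, info[1] >> 20))
```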