Caffe2 is a lightweight, modular, and scalable deep learning framework.
https://caffe2.ai
Apache License 2.0

high memory usage while constructing a C++ caffe2::Predictor instance #1863

Open chxd opened 6 years ago

chxd commented 6 years ago

The C++ API provides a two-stage Predictor construction process: step 1, load init_net and predict_net; step 2, construct a Predictor instance. I suspect this process uses roughly twice the memory of the model size. This is inefficient in terms of memory usage, especially on mobile platforms with large models. Are there any plans to make this process more memory efficient?
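For reference, a minimal sketch of the two-stage flow being described (exact header paths and constructor signatures varied across Caffe2 releases, and the `.pb` file names here are placeholders):

```cpp
#include "caffe2/core/init.h"
#include "caffe2/core/predictor.h"  // moved to caffe2/predictor/predictor.h in later versions
#include "caffe2/utils/proto_utils.h"

int main(int argc, char** argv) {
  caffe2::GlobalInit(&argc, &argv);

  // Step 1: deserialize both nets into memory.
  caffe2::NetDef init_net, predict_net;
  CAFFE_ENFORCE(caffe2::ReadProtoFromFile("init_net.pb", &init_net));
  CAFFE_ENFORCE(caffe2::ReadProtoFromFile("predict_net.pb", &predict_net));

  // Step 2: construct the Predictor. Running init_net copies the weights
  // into the workspace, so at this point the parameters exist twice:
  // once inside the init_net proto and once in the workspace.
  caffe2::Predictor predictor(init_net, predict_net);
  return 0;
}
```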

Maratyszcza commented 6 years ago

After constructing the Predictor, you can release init_net. This brings memory usage down to roughly the size of the model.
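One way to do this is to keep init_net in a narrow scope so its proto is destroyed as soon as the Predictor exists (a hedged sketch, assuming the file-based loading shown earlier):

```cpp
#include <memory>
#include <string>
#include "caffe2/core/predictor.h"
#include "caffe2/utils/proto_utils.h"

std::unique_ptr<caffe2::Predictor> MakePredictor(
    const std::string& init_path, const std::string& predict_path) {
  caffe2::NetDef predict_net;
  CAFFE_ENFORCE(caffe2::ReadProtoFromFile(predict_path, &predict_net));

  std::unique_ptr<caffe2::Predictor> predictor;
  {
    // init_net is only needed while the Predictor constructor runs it
    // to populate the workspace with the weights.
    caffe2::NetDef init_net;
    CAFFE_ENFORCE(caffe2::ReadProtoFromFile(init_path, &init_net));
    predictor.reset(new caffe2::Predictor(init_net, predict_net));
  }  // init_net is destroyed here, freeing its copy of the weights.
  return predictor;
}
```

Note that the peak still hits ~2x the model size inside the inner scope; this only shortens how long the duplicate copy lives.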

chxd commented 6 years ago

Yes. But in my case the code runs on an Android phone and crashes while constructing the Predictor because there is not enough memory available. If the Predictor could also be constructed directly from a file or stream, that would be a big advantage over the existing constructor.

Maratyszcza commented 6 years ago

Constructing a Predictor requires about 2x sizeof(weights) of memory. Running inference requires sizeof(weights) + sizeof(activations). Typically sizeof(activations) >> sizeof(weights), so even if we made constructing the Predictor cheaper, the model would still run out of memory during inference.
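To put rough, purely illustrative numbers on that: for a model with 50 MB of weights, construction peaks around 100 MB, but if peak activations are, say, 300 MB, inference needs roughly 350 MB, so the transient construction overhead would not be the dominant cost.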