Open chxd opened 6 years ago
After constructing the Predictor, you can release init_net. This brings memory usage down to roughly the size of the model.
Yes. But in my case, the code runs on an Android phone and crashes while constructing the Predictor because there is not enough memory available. If a Predictor could also be constructed directly from a file or stream, that would be a big advantage over the existing constructor.
Constructing a predictor requires about 2x sizeof(weights) of memory. Running inference requires sizeof(weights) + sizeof(activations) of memory. Typically sizeof(activations) >> sizeof(weights), so even if we made constructing the predictor cheaper, the model would still run out of memory at inference time.
The C++ API provides a two-stage predictor construction process: step 1, load init_net and predict_net; step 2, construct a Predictor instance. I guess this process takes roughly twice the model's size in memory. In terms of memory usage this is not efficient, especially on mobile platforms with large models. Is there any plan to make this process more memory efficient?
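For reference, the two-stage flow looks roughly like the sketch below. This assumes the Caffe2 C++ API of that era (header paths and the `Predictor` constructor signature varied across versions, so treat it as illustrative rather than exact); the `.pb` file names are placeholders:

```cpp
#include "caffe2/core/init.h"
#include "caffe2/core/predictor.h"
#include "caffe2/utils/proto_utils.h"

int main(int argc, char** argv) {
    caffe2::GlobalInit(&argc, &argv);

    // Stage 1: deserialize both NetDefs. The weights now exist once,
    // in the serialized form inside init_net.
    caffe2::NetDef init_net, predict_net;
    CAFFE_ENFORCE(caffe2::ReadProtoFromFile("init_net.pb", &init_net));
    CAFFE_ENFORCE(caffe2::ReadProtoFromFile("predict_net.pb", &predict_net));

    // Stage 2: constructing the Predictor runs init_net, materializing
    // the weight blobs in a workspace -- the second copy that accounts
    // for the ~2x sizeof(weights) construction cost.
    caffe2::Predictor predictor(init_net, predict_net);

    // Release the serialized copy to drop back to ~1x sizeof(weights),
    // as suggested earlier in this thread.
    init_net.Clear();

    return 0;
}
```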