nagadit opened this issue 4 years ago
@nagadit Me too; I tested the runner in the InferenceWrapper class and got about 380 ms per frame on a GTX 1060.
@egorzakharov we need your input here
Hi! @nagadit
First of all, in this pipeline you are evaluating the full model (initialization + inference) plus an external cropping function, not just inference. The cropping function consists of a face detector and a landmarks detector (the face-alignment library), which could be optimized further; we just did not do that in this repository. For a real-time application, you would need to train the model from scratch using a face and landmarks detector that runs in real time (like Google's MediaPipe). Note that this issue is common to all methods that use keypoints as their pose representation.
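As an illustration only (MediaPipe is not used in this repository, and its landmarks differ from the face-alignment keypoints the released model was trained on), a minimal real-time landmark extraction sketch could look like this:

```python
# Illustration only: a real-time landmark detector, NOT the one this repo was trained with.
import cv2
import mediapipe as mp

face_mesh = mp.solutions.face_mesh.FaceMesh(
    static_image_mode=False,  # video mode: tracks the face between frames
    max_num_faces=1)

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame_bgr = cap.read()
    if not ok:
        break
    results = face_mesh.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    if results.multi_face_landmarks:
        # 468 normalized (x, y, z) landmarks for the detected face
        landmarks = results.multi_face_landmarks[0].landmark
cap.release()
```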
You can crop the data externally via the `infer.InferenceWrapper.preprocess_data` function and call forward with `crop_data=False`; then you will only measure initialization + inference speed. Moreover, I would recommend running one basic optimization, which I simply forgot to include in this inference example: `module.apply(runner.utils.remove_spectral_norm)`.
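A rough sketch of that measurement (the exact `preprocess_data` / forward signatures here are assumptions, so adapt them to the actual `InferenceWrapper` API):

```python
# Sketch only: argument names and the exact call pattern are assumptions,
# not a verified excerpt from this repository.
import time
import torch
from infer import InferenceWrapper
from runner import utils as rn_utils  # import path is an assumption

module = InferenceWrapper(args)                # args: same config as in the inference example
module.apply(rn_utils.remove_spectral_norm)    # the basic optimization mentioned above

data_dict = module.preprocess_data(input_data_dict)  # external cropping, done once up front

torch.cuda.synchronize()
start = time.time()
output = module(data_dict, crop_data=False)    # measures initialization + inference only
torch.cuda.synchronize()
print(f'{(time.time() - start) * 1000:.1f} ms')
```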
Lastly, if you want to measure the speed of the inference generator only, then you need to perform a forward pass of only this network, as mentioned in our article. We additionally speed it up by calling the `runner.utils.prepare_for_mobile_inference` function after it has been initialized with adaptive parameters.
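For example (the runner and generator attribute names below are guesses, just to show where the timing boundary lies):

```python
# Sketch only: the handles `runner` and `runner.gen_inf` are hypothetical names.
import time
import torch
from runner import utils as rn_utils  # import path is an assumption

# Apply only after the adaptive parameters have been set by the initialization pass.
runner.apply(rn_utils.prepare_for_mobile_inference)

gen_inf = runner.gen_inf                 # hypothetical handle to the inference generator
with torch.no_grad():
    torch.cuda.synchronize()
    start = time.time()
    pred_img = gen_inf(pose_input)       # forward pass of the inference generator only
    torch.cuda.synchronize()
print(f'{(time.time() - start) * 1000:.1f} ms per frame')
```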
Hope this helps!
This closely follows the pipeline we developed for our mobile application: the computationally heavy initialization part runs separately in the PyTorch Mobile framework, and we then optimize the personalized inference generator by converting it to ONNX and then to SNPE for real-time frame-by-frame inference.
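For reference, the ONNX part of that conversion can be sketched roughly like this (the generator handle, input shape, and tensor names are illustrative, not our exact export code):

```python
# Illustrative sketch of exporting the personalized inference generator to ONNX.
# gen_inf, pose_dim and the tensor names are placeholders.
import torch

gen_inf.eval()
dummy_pose = torch.randn(1, pose_dim, device='cuda')  # replace with a real pose input

torch.onnx.export(
    gen_inf,
    dummy_pose,
    'gen_inf.onnx',
    input_names=['pose'],
    output_names=['image'],
    opset_version=11,
)
# The resulting .onnx file can then be converted for SNPE (e.g. with snpe-onnx-to-dlc).
```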
By the way, I have pushed the `remove_spectral_norm` hack to master.
@egorzakharov could you also share the ONNX weights?
@egorzakharov Thank you very much for such an informative answer; I will try to do something with it.
@ak9250 I will ask my colleagues for approval, but I believe the conversion to ONNX was very simple; ONNX -> SNPE was much trickier.
@egorzakharov Thank you very much! I tested the `remove_spectral_norm` function, and it does speed up the process from 380 ms to 260 ms per frame. But when using `runner.apply(rn_utils.prepare_for_mobile_inference)`, I ran into the problems below:

in prepare_for_mobile_inference: `gamma = module.weight.data.squeeze().detach().clone()` raises AttributeError: 'NoneType' object has no attribute 'data'

in prepare_for_mobile_inference: `mod.weight.data = module.weight.data[0].detach().clone()` raises torch.nn.modules.module.ModuleAttributeError: 'AdaptiveConv2d' object has no attribute 'weight'
Could you please also push `prepare_for_mobile_inference` to master, or give some suggestions?
I tested G_inf on an Nvidia 1060 and got about 15 ms per frame. Thanks for the advice.
The speed does not match what is declared, or I am doing something wrong. GPU: 2080 Ti. Please help with this.