CMU-Perceptual-Computing-Lab / caffe_rtpose

Realtime C++ code for multi-person pose estimation

Using BNNs to improve performance #23

Open alex-mcleod opened 7 years ago

alex-mcleod commented 7 years ago

I'm not too much of an expert on this, but is it possible to use binarized neural networks to improve the performance of this project?

shihenw commented 7 years ago

The network itself is nothing more than a CNN, so methods proposed for binarizing CNNs can be applied, for sure.

I guess by performance you mean speed. Binarizing will decrease the quality of the heatmaps and PAFs, which is crucial for part-person association. There is probably a critical point in accelerating the network beyond which the association will totally fail.
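To make the quality concern concrete, here is a minimal sketch of one common weight-binarization scheme: each weight is replaced by its sign, scaled by the mean absolute value of the weights. This is an illustrative assumption, not something tried on this network; the function names and the toy numbers are hypothetical.

```python
def binarize(weights):
    """Approximate weights by alpha * sign(w), with alpha = mean(|w|).

    Assumed scheme for illustration only (sign plus one scaling factor).
    """
    alpha = sum(abs(w) for w in weights) / len(weights)
    signs = [1.0 if w >= 0 else -1.0 for w in weights]
    return alpha, signs


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


# Toy filter and input patch (made-up values).
weights = [0.9, -0.1, 0.4, -0.6]
inputs = [1.0, 2.0, 3.0, 4.0]

alpha, signs = binarize(weights)
exact = dot(weights, inputs)          # full-precision response
approx = alpha * dot(signs, inputs)   # binarized response
error = abs(exact - approx)           # gap introduced by binarization

print(exact, approx, error)
```

Every filter response picks up an error like this, and the errors accumulate through the layers, which is why the resulting heatmaps and PAFs can become too noisy for association.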

I haven't tried it, but I would say other methods that let you gradually adjust where you stand in the speed-quality tradeoff would be better for this task, like this paper: https://arxiv.org/abs/1611.06473.

alex-mcleod commented 7 years ago

Yes, I do mean speed. Interesting, thanks for getting back to me. Any plans to try using LCNNs in the future?

ouceduxzk commented 7 years ago

I tried MobileNet as the backbone. It does 5 stages of downsampling, which loses more information, and in the end it does not form good heatmaps and PAFs. For inference, NVIDIA TensorRT supports INT8 inference, but as @shihenw pointed out, it probably does not work well. Actually, the best speedup I found is to use just two or three stages instead of the full refinement, in combination with a smaller input size, but accuracy degrades accordingly. So far I have not found a good solution for substantially speeding up inference with this big CNN.
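As a rough illustration of the downsampling point, the sketch below computes the side length of the output map for a given number of downsampling stages. It assumes each stage halves the resolution (stride 2) and uses an illustrative input size of 368; the function name and the choice of 3 stages for comparison are assumptions, not details from this thread.

```python
def heatmap_size(input_size, num_downsampling_stages):
    """Side length of the output map, assuming each stage halves resolution."""
    stride = 2 ** num_downsampling_stages
    return input_size // stride


input_size = 368  # illustrative input side length (assumption)

# Compare a backbone with 3 halvings against one with 5 halvings.
size_3_stages = heatmap_size(input_size, 3)
size_5_stages = heatmap_size(input_size, 5)

# How many input pixels each output cell has to summarize.
pixels_per_cell_3 = (2 ** 3) ** 2
pixels_per_cell_5 = (2 ** 5) ** 2

print(size_3_stages, size_5_stages, pixels_per_cell_3, pixels_per_cell_5)
```

With 5 halvings each output cell covers 16 times more input pixels than with 3, so nearby body parts are harder to separate. Shrinking the input size reduces the map further in the same way.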