CMU-Perceptual-Computing-Lab / caffe_rtpose

Realtime C++ code for multi-person pose estimation

Options to increase fps #8

Open carstenschwede opened 7 years ago

carstenschwede commented 7 years ago

Are there any options to increase fps besides reducing resolution or adding GPUs? Is it possible to restrict detection to certain joints (e.g. Heads) in order to speed up processing?

ZheC commented 7 years ago

(1) Using the MPI model instead of the COCO model and (2) using one scale for testing can both reduce the processing time. Restricting the detections does not help, because the CNN still needs to use the same trained model, so the CNN forward-pass time is the same.
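
To see why testing at a single scale helps, here is a rough cost sketch. It assumes (this is not taken from the repository) that the forward-pass cost of a fully convolutional network grows in proportion to the number of input pixels, so an image resized by a factor s costs about s² of the full-size pass. The scale values are illustrative only, not the demo's defaults.

```python
def relative_multiscale_cost(scales):
    """Cost of one forward pass per scale, relative to a single pass at scale 1.0.

    Assumption: cost is proportional to the input pixel count, i.e. to scale**2.
    """
    return sum(s * s for s in scales)


# Hypothetical scale sets, chosen only for illustration.
single = relative_multiscale_cost([1.0])
multi = relative_multiscale_cost([1.0, 0.5])

print(f"single scale: {single:.2f}x, two scales: {multi:.2f}x")
```

Under this assumption every extra scale adds its own forward pass, so dropping to one scale removes that overhead entirely.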

ZheC commented 7 years ago

Another option is to modify the text.prototxt and reduce the stage number from 6 to 3.
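
A back-of-the-envelope model for what dropping stages might buy: total forward time is roughly the shared feature extractor plus one cost per refinement stage. The sketch assumes every stage costs the same, and the cost units are hypothetical placeholders rather than measurements; substitute timings profiled on your own GPU.

```python
def forward_cost(backbone_cost, stage_cost, num_stages):
    """Total cost = fixed backbone cost + identical cost per stage (assumed)."""
    return backbone_cost + stage_cost * num_stages


def stage_reduction_speedup(backbone_cost, stage_cost, old_stages=6, new_stages=3):
    """Ratio of old to new forward cost when the stage count is reduced."""
    old = forward_cost(backbone_cost, stage_cost, old_stages)
    new = forward_cost(backbone_cost, stage_cost, new_stages)
    return old / new


# Placeholder numbers: the backbone is assumed to cost as much as two stages.
speedup = stage_reduction_speedup(backbone_cost=2.0, stage_cost=1.0)
print(f"estimated speedup: {speedup:.2f}x")  # (2 + 6) / (2 + 3) = 1.60x
```

Because the backbone cost is fixed, halving the stage count yields less than a 2x speedup in this model.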

carstenschwede commented 7 years ago

Thanks, I will try both. Any idea of what kind of speedup I could expect?

Warden7 commented 7 years ago

Hi, I have a question about fps. I ran the rtpose demo on an AWS p2.large instance (with one K80 GPU, 24G), but it takes 1.1s to process a frame. Could this be because the K80 GPU has a compute capability of 3.7, lower than the 6.1 of the GTX1080?

gineshidalgo99 commented 7 years ago

This is a preliminary benchmark we have made with the new version we are working on (it will be released in about a month). The current version you are using should be around 25-30% slower. Let me know if you are using the same flags. If so, are you using cuDNN 5.1? Older versions of cuDNN might also slow down the program. Thanks!

Current benchmark: https://docs.google.com/spreadsheets/d/1-DynFGvoScvfWDA1P4jDInCkbD4lg0IKOYbXgEq0sK0/edit#gid=0

wangzhangup commented 7 years ago

@Warden7 Their compute throughput is similar: K80: 8.73 TFLOPS, 1080: 9 TFLOPS.

The low fps may have other causes.

Warden7 commented 7 years ago

Thanks for your kind analysis. The cuDNN version is 5.0 and CUDA is 7.5. The "volatile gpu util" field in the GPU information always shows 99%, even when nothing is running on the GPU. Maybe some further debugging is needed.

wangzhangup commented 7 years ago

@Warden7 kill the processes on the GPU
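
To find out what is occupying the GPU before killing anything, one option is to list the compute processes reported by `nvidia-smi`. The sketch below is an assumption about a typical setup: it relies on `nvidia-smi` being on the PATH and on the CSV output of its `--query-compute-apps` option. The helper names are made up for illustration.

```python
import subprocess


def parse_compute_apps(csv_text):
    """Parse 'pid, process_name, used_memory' lines into a list of dicts."""
    rows = []
    for line in csv_text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        rows.append({
            "pid": int(parts[0]),
            # Re-join the middle fields in case the process name contains commas.
            "name": ",".join(parts[1:-1]),
            "used_memory": parts[-1],
        })
    return rows


def list_gpu_processes():
    """Ask nvidia-smi for the current compute processes (needs an NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi",
         "--query-compute-apps=pid,process_name,used_memory",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return parse_compute_apps(result.stdout)
```

Calling `list_gpu_processes()` on the affected machine would show any leftover PIDs, which can then be stopped with `kill <pid>`.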

gineshidalgo99 commented 7 years ago

Another way to speed it up is by using the new version (~25% faster): https://github.com/CMU-Perceptual-Computing-Lab/openpose

wangzhangup commented 7 years ago

Reduce the number of feature maps. I changed the output number of the conv layers in stages 3-6 from 128 to 64. The result is as good as the original version, with a 25% speedup!
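
For intuition on why fewer feature maps help, the multiply-accumulate (MAC) count of a convolution layer scales with the product of its input and output channel counts. The sketch below uses an assumed 46x46 feature map and 7x7 kernel purely for illustration; the actual layer shapes should be read from the prototxt.

```python
def conv_macs(height, width, c_in, c_out, kernel):
    """Multiply-accumulates for one stride-1 convolution layer (bias ignored)."""
    return height * width * c_in * c_out * kernel * kernel


# Assumed shapes for illustration only: 46x46 output map, 7x7 kernel.
original = conv_macs(46, 46, 128, 128, 7)   # 128 -> 128 channels
reduced = conv_macs(46, 46, 64, 64, 7)      # 64 -> 64 channels
boundary = conv_macs(46, 46, 128, 64, 7)    # input still 128 wide, output 64

print(original / reduced)   # 4.0
print(original / boundary)  # 2.0
```

Only the modified layers shrink, and a layer whose input keeps its original width saves less, so the overall gain is well below 4x, which is consistent with a figure such as the 25% reported here.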

carstenschwede commented 7 years ago

@wangzhangup Thanks, can you try your modification also on the newer version at https://github.com/CMU-Perceptual-Computing-Lab/openpose? Would be interesting to see what overall speedup you are able to get.

carstenschwede commented 7 years ago

@gineshidalgo99 Thanks for the update!

gineshidalgo99 commented 7 years ago

@wangzhangup Thank you so much for your idea! Could you please email me at gines@cmu.edu to discuss how you did it in more detail? We are interested in adding it to our system if that is OK with you!

wangzhangup commented 7 years ago

@gineshidalgo99 OK!

wangzhangup commented 7 years ago

@gineshidalgo99 @carstenschwede This is the faster model: https://drive.google.com/open?id=0B-SxboVJxF-WNmtpWGc5emZrRDg

gineshidalgo99 commented 7 years ago

@wangzhangup The speedup is impressive. The accuracy does decrease a bit, but that is fine given the huge speedup. Do you mind if I add it to the new OpenPose? (I went from 14 to 20 fps on my desktop and from 30 to 22 mAP). Or you can make a pull request with your new prototxt, and I will fix the other details (so you would appear as a contributor to OpenPose). Thanks!

https://github.com/CMU-Perceptual-Computing-Lab/openpose
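
Putting the reported numbers side by side (14 to 20 fps, 30 to 22 mAP), the trade-off works out as follows. Only the four figures from the comment above are used; the variable names are arbitrary.

```python
fps_before, fps_after = 14.0, 20.0
map_before, map_after = 30.0, 22.0

speedup = fps_after / fps_before                      # about 1.43x more frames per second
frame_ms_before = 1000.0 / fps_before                 # about 71.4 ms per frame
frame_ms_after = 1000.0 / fps_after                   # 50 ms per frame
time_saved = 1.0 - frame_ms_after / frame_ms_before   # about 30% less time per frame

map_drop_points = map_before - map_after              # 8 mAP points
map_drop_relative = map_drop_points / map_before      # about 27% relative drop

print(f"speedup: {speedup:.2f}x, time saved per frame: {time_saved:.0%}")
print(f"mAP drop: {map_drop_points:.0f} points ({map_drop_relative:.0%} relative)")
```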

carstenschwede commented 7 years ago

@wangzhangup thanks for the model, impressive speedup!

@gineshidalgo99 is a similar speedup expected for the upcoming "extended" models at OpenPose (e.g. finger tracking)?

gineshidalgo99 commented 7 years ago

@carstenschwede The speed up applies to the body pose, but finger tracking is made on top of it (you need to know the body location to detect the hand), so it will take advantage of it too if this model is used (I did not measure the accuracy impact yet though, I guess I will add both models: 1 for better accuracy and 1 for speed).

carstenschwede commented 7 years ago

I guess I will add both models: 1 for better accuracy and 1 for speed

Sounds perfect. Can't wait to try out the finger detection.

wangzhangup commented 7 years ago

@gineshidalgo99 Could you share your measurement code?

gineshidalgo99 commented 7 years ago

It is still quite messy, it uses Matlab and C++, and it is not completely finished. I prefer to wait until I actually finish it properly... sorry!

aakendi commented 5 years ago

@carstenschwede The speed up applies to the body pose, but finger tracking is made on top of it (you need to know the body location to detect the hand), so it will take advantage of it too if this model is used (I did not measure the accuracy impact yet though, I guess I will add both models: 1 for better accuracy and 1 for speed).

I just tried finger tracking with the 640x480 option and tracking set to 5, but I only get around 10 fps. Could you give me some advice?