Closed by d-lay 3 years ago
Hello d-lay,
Thanks for your question!
In your particular task, the processing rate is more important than the prediction accuracy. My suggestion is to start with a very simple model (e.g. a CNN that predicts keypoint heatmaps, followed by the PnP solver from OpenCV). This will likely give very fast predictions at the cost of somewhat lower accuracy. You may want to try it and see whether the result is good enough.
If the speed is not satisfactory, we can further simplify the network architecture. We can also check whether the implementation of the PnP solver can be optimized.
If the accuracy is not satisfactory, we can gradually add some new ideas from recent papers in 6D pose estimation. Maybe we can try the keypoint voting scheme in PVNet first, and then the hybrid intermediate representations in our HybridPose.
I hope this helps!
Hi chensong1995, thank you for your insight! This actually seems to be the approach I might take. Do you by chance have a specific model in mind as a good starting point?
MobileNet is a popular network architecture for mobile devices and can be a good starting point. Represent each keypoint as a heatmap peak with Gaussian decay, and use cv2.solvePnP() to recover the pose from the predicted keypoints. Let's see how it goes. I hope you can get good results!
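As a rough illustration of the heatmap representation, the sketch below builds a Gaussian target for one keypoint and reads keypoint locations back from a stack of heatmaps by taking the per-channel argmax. The function names and the default `sigma` are assumptions for illustration; a real network would predict the heatmaps instead of receiving the ground truth.

```python
import numpy as np


def gaussian_heatmap(height, width, center_xy, sigma=2.0):
    """Training target: a 2D Gaussian that peaks (value 1.0) at the keypoint."""
    cx, cy = center_xy
    xs = np.arange(width, dtype=np.float32)            # shape (W,)
    ys = np.arange(height, dtype=np.float32)[:, None]  # shape (H, 1)
    sq_dist = (xs - cx) ** 2 + (ys - cy) ** 2          # broadcasts to (H, W)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def decode_peaks(heatmaps):
    """Return the (x, y) pixel location of the maximum in each heatmap.

    heatmaps has shape (K, H, W), one channel per keypoint.
    """
    num_keypoints, height, width = heatmaps.shape
    flat_idx = heatmaps.reshape(num_keypoints, -1).argmax(axis=1)
    ys, xs = np.unravel_index(flat_idx, (height, width))
    return np.stack([xs, ys], axis=1).astype(np.float32)
```

The decoded (x, y) locations, paired with the corresponding 3D keypoints on the object model, are what cv2.solvePnP() takes as input. A plain argmax is limited to integer pixel precision, so sub-pixel refinement is a possible later improvement if accuracy falls short.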
Again thank you for the advice!
Hi, I was wondering if it might be possible to run your 6D Pose Estimator even on the limited hardware of a smartphone. Before diving too deep into it, I wanted to get the opinion of more experienced people in the field.
If you think it is possible, could you give a rough fps estimate (let's assume it would run on an Adreno 650 mobile GPU)?
If you think it is not possible, could you briefly elaborate on the reasons?
Any insights are appreciated. Thank you very much!