FORTH-ModelBasedTracker / MocapNET

We present MocapNET, a real-time method that estimates the 3D human pose directly in the popular Bio Vision Hierarchy (BVH) format, given estimations of the 2D body joints originating from monocular color images. Our contributions include: (a) A novel and compact 2D pose NSRM representation. (b) A human body orientation classifier and an ensemble of orientation-tuned neural networks that regress the 3D human pose by also allowing for the decomposition of the body into an upper and lower kinematic hierarchy. This permits the recovery of the human pose even in the case of significant occlusions. (c) An efficient Inverse Kinematics solver that refines the neural-network-based solution, providing 3D human pose estimations that are consistent with the limb sizes of a target person (if known). All the above yield a 33% accuracy improvement on the Human 3.6 Million (H3.6M) dataset compared to the baseline method (MocapNET) while maintaining real-time performance.
https://www.youtube.com/watch?v=Jgz1MRq-I-k

the additional window has no image #100

Open Ishihara-Masabumi opened 1 year ago

Ishihara-Masabumi commented 1 year ago

When I run the following command, the additional window has no image, as shown below:

./MocapNET2LiveWebcamDemo --from /dev/video0 --live

Screenshot from 2023-07-21 17-37-37

Ishihara-Masabumi commented 1 year ago

Moreover, the video almost doesn't move.

AmmarkoV commented 1 year ago

The 3D points output window gets created here; however, depending on the visibility of the person, there are a lot of "if" statements that can prevent it from appearing.
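For illustration only, here is a minimal sketch of that kind of guard; the struct, function names, and the joint-count threshold are assumptions for the example, not the project's actual code:

```cpp
// Minimal sketch (not the actual MocapNET source): the extra OpenCV window is only
// drawn when enough of the body is visible, so when only a head is in frame the
// guards below can return before cv::imshow() is ever reached.
#include <opencv2/highgui.hpp>
#include <vector>

struct Joint2D { float x = 0.0f, y = 0.0f; bool detected = false; }; // hypothetical joint record

void maybeShow3DPointsWindow(const std::vector<Joint2D> &joints, const cv::Mat &visualization)
{
    // Count the 2D joints that the detector actually produced for this frame.
    unsigned int detectedJoints = 0;
    for (const Joint2D &joint : joints)
    {
        if (joint.detected) { ++detectedJoints; }
    }

    // Guard clauses like these keep the window from appearing when the body is not visible;
    // the threshold of 8 joints is an illustrative value, not the one used by the project.
    if (detectedJoints < 8) { return; }
    if (visualization.empty()) { return; }

    cv::imshow("3D Points Output", visualization);
}
```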

Please remember that this repository currently hosts a body pose estimation method (in the master branch) and a body+hands pose estimation method (in the mnet3 branch). Your input image shows just a head, so there is no view of the body, and that is why nothing is output.

Regarding the framerate: to achieve the maximum framerate, the demo in this repository is programmed to use the GPU for 2D joint estimation and the CPU for the 3D part (possibly in a multithreaded configuration), since this combination utilizes system resources in the best way possible. If your framerate is low, it is most probably caused by a CPU-only TensorFlow library that handles all of the processing on the CPU, combined with using the heavier "--openpose" 2D joint estimator.
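For context, here is a minimal sketch of such a two-stage split, with assumed function and type names rather than the project's real API; it only illustrates why running the 2D estimator (ideally on the GPU) and the 3D regression (on the CPU) in separate threads lets the stages overlap instead of running back to back:

```cpp
// Minimal sketch, not MocapNET's actual code: a producer thread performs 2D joint
// estimation while a consumer thread performs 3D regression, so the two stages overlap.
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct Joints2D { std::vector<float> xy; };        // hypothetical 2D output
struct Pose3D   { std::vector<float> bvhMotion; }; // hypothetical 3D/BVH output

// Stand-ins for the real estimators; assumed names, not the project's API.
Joints2D estimate2DJoints() { return {}; }
Pose3D   regress3DPose(const Joints2D &) { return {}; }

std::queue<Joints2D> pending2D;
std::mutex pendingMutex;
std::condition_variable pendingReady;
std::atomic<bool> running{true};

void gpuStage2D() // producer: 2D joint estimation
{
    while (running)
    {
        Joints2D joints = estimate2DJoints();
        {
            std::lock_guard<std::mutex> lock(pendingMutex);
            pending2D.push(std::move(joints));
        }
        pendingReady.notify_one();
    }
}

void cpuStage3D() // consumer: 3D regression + IK refinement
{
    while (running)
    {
        std::unique_lock<std::mutex> lock(pendingMutex);
        pendingReady.wait(lock, [] { return !pending2D.empty() || !running; });
        if (pending2D.empty()) { continue; }
        Joints2D joints = std::move(pending2D.front());
        pending2D.pop();
        lock.unlock();
        Pose3D pose = regress3DPose(joints); // overlaps with the next 2D inference
        (void)pose;                          // a real program would render/record this
    }
}

int main()
{
    std::thread producer(gpuStage2D);
    std::thread consumer(cpuStage3D);
    // ... stop once the input stream ends ...
    running = false;
    pendingReady.notify_all();
    producer.join();
    consumer.join();
    return 0;
}
```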

Trying the --forth 2D joint estimator:

./MocapNET2LiveWebcamDemo --from /dev/video0 --forth

This will work at a decent framerate (albeit with lower quality 2D input, which will result in lower quality 3D output) even when all of the computations are conducted on the CPU.