FORTH-ModelBasedTracker / MocapNET

We present MocapNET, a real-time method that estimates the 3D human pose directly in the popular Bio Vision Hierarchy (BVH) format, given estimations of the 2D body joints originating from monocular color images. Our contributions include: (a) A novel and compact 2D pose NSRM representation. (b) A human body orientation classifier and an ensemble of orientation-tuned neural networks that regress the 3D human pose, while also allowing for the decomposition of the body into an upper and lower kinematic hierarchy. This permits the recovery of the human pose even in the case of significant occlusions. (c) An efficient Inverse Kinematics solver that refines the neural-network-based solution, providing 3D human pose estimations that are consistent with the limb sizes of a target person (if known). All the above yield a 33% accuracy improvement on the Human 3.6 Million (H3.6M) dataset compared to the baseline method (MocapNET) while maintaining real-time performance.
https://www.youtube.com/watch?v=Jgz1MRq-I-k

Looking forward to windows version! #58

Closed youngallien closed 2 years ago

youngallien commented 3 years ago

Hi, thanks for your wonderful project. RGBDAcquisition seems to only work on Linux. I've tried to port your code to Windows; "JointEstimator2D" works normally, but the functions "std::vector runMocapNET2( struct MocapNET2 mnet, struct skeletonSerialized input, int doLowerbody, int doHands, int doFace, int doGestureDetection, unsigned int useInverseKinematics, int doOutputFiltering )" and "improveBVHFrameUsingInverseKinematics" do not.

What should I do if I only want to get the rotation and displacement of the joints?

AmmarkoV commented 3 years ago

Hello, to the best of my knowledge ( users have reported this on issues ) you can already compile and use the code of this repository on Windows using the Windows 10 Linux subsystem (WSL).

The main code dependencies ( OpenCV / Tensorflow ) are cross-platform, the C++ code is also pretty vanilla..

However there is some code in RGBDAcquisition, like the SSE2 optimized math https://github.com/AmmarkoV/RGBDAcquisition/blob/master/tools/AmMatrix/matrix4x4Tools.c#L986

the multi-threading code using PThreads https://github.com/AmmarkoV/RGBDAcquisition/tree/master/tools/PThreadWorkerPool

and my Codecs library https://github.com/AmmarkoV/RGBDAcquisition/tree/master/tools/Codecs

among other things, that would need work to be ported.

Unfortunately, since I don't have a Windows machine, have no incentive to make a Windows version ( I am developing this project for my PhD and the important thing is the method, not the implementation ), and barely have enough time to maintain this repository in parallel to my main development snapshot, I don't plan on making a Windows version any time soon :( sorry about that..!

In the next version I am planning to decouple the front-end application from the pose server ( which will be a network server ), so with this architecture one should be able to just host the 2D->3D "server" on Linux and use a client on Windows, which will be much easier to compile and maintain on both platforms.

One way to get the data you want could potentially be to use the ROS wrapper of the project https://github.com/FORTH-ModelBasedTracker/mocapnet_rosnode that allows you to grab rotations/positions of the 3D points using the ROS TF tree ( you should be able to use ROS on Windows connected to a rosmaster hosted on Linux ).

The "best"/easiest way to get the rotation and displacement of the joints is to use this code to output a .BVH file

and then write your own BVH parser, using my C++ implementation as a reference https://github.com/AmmarkoV/RGBDAcquisition/tree/master/opengl_acquisition_shared_library/opengl_depth_and_color_renderer/src/Library/MotionCaptureLoader

or use an existing cross-platform BVH player like BVHPlay https://sites.google.com/a/cgspeed.com/cgspeed/bvhplay/bvhplay-downloads-page