FORTH-ModelBasedTracker / MocapNET

We present MocapNET, a real-time method that estimates the 3D human pose directly in the popular Bio Vision Hierarchy (BVH) format, given estimations of the 2D body joints originating from monocular color images. Our contributions include: (a) A novel and compact 2D pose NSRM representation. (b) A human body orientation classifier and an ensemble of orientation-tuned neural networks that regress the 3D human pose while also allowing for the decomposition of the body into an upper and lower kinematic hierarchy. This permits the recovery of the human pose even in the case of significant occlusions. (c) An efficient Inverse Kinematics solver that refines the neural-network-based solution, providing 3D human pose estimations that are consistent with the limb sizes of a target person (if known). All the above yield a 33% accuracy improvement on the Human 3.6 Million (H3.6M) dataset compared to the baseline method (MocapNET) while maintaining real-time performance.
https://www.youtube.com/watch?v=Jgz1MRq-I-k

Documentation - Pretrained models #9

Closed: AmitMY closed this issue 4 years ago

AmitMY commented 4 years ago

Could you please document your pre-trained models, specifically, what they are expecting to get as an input?

I'm looking for a model that can, at a minimum, handle body and hands, which I saw in your video: [image]

But also for a model that can do body+hands+face. Do you have something pretrained like that?

AmmarkoV commented 4 years ago

Hello. As you mentioned, the code base can parse hands and faces coming from the OpenPose 2D pose estimator (like the illustration you posted). However, as you also noticed, the pretrained models provided in this repository only cover the body. (The BVH output includes the joints required for the face and fingers in order to be future-proof, but they are currently not populated in this repository.) Unfortunately, the next version of MocapNET, which will address these parts of the body, has not yet been published, so the code in this repository only covers what is part of the BMVC 2019 work. I actually do have pretrained models for what you are asking :) but I will only be able to publish them to this repository after first publishing the revised method at a conference. Sorry about that, I am just a PhD student and this is how the academic publishing scheme works :( Hopefully they may be available in a few months.

The pre-trained models for the body are an ensemble of simple 4-layer dense neural networks with SELU activations; their input is formed using the NSDM structure. The method and models in this repository are thoroughly described in the paper.
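To make the description above more concrete, here is a rough, untrained sketch of that kind of pipeline in plain Python: 2D joints are turned into a flattened matrix of signed, normalized pairwise differences, which is then passed through four dense layers with SELU activations. Everything specific here is an assumption for illustration only: the feature layout in `signed_distance_features`, the choice of reference joints, the layer sizes, the linear output layer, and all function names are hypothetical and are not taken from the repository. The paper has the actual NSDM definition and network configuration.

```python
import math
import random

# Standard SELU constants (Klambauer et al., 2017).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x):
    """Scaled exponential linear unit applied to a single float."""
    return SELU_SCALE * (x if x > 0.0 else SELU_ALPHA * (math.exp(x) - 1.0))


def signed_distance_features(joints, ref_a=0, ref_b=1):
    """Flatten signed pairwise x and y differences between 2D joints.

    ASSUMPTION: this layout only approximates the idea of a normalized
    signed distance matrix; it is not the repository's NSDM code.
    Differences are divided by the distance between two reference joints
    so that the features do not depend on image scale.
    """
    ax, ay = joints[ref_a]
    bx, by = joints[ref_b]
    scale = math.hypot(bx - ax, by - ay) or 1.0  # avoid division by zero
    x_block = [(xi - xj) / scale for xi, _ in joints for xj, _ in joints]
    y_block = [(yi - yj) / scale for _, yi in joints for _, yj in joints]
    return x_block + y_block


def make_dense(n_in, n_out, rng):
    """Random dense layer (LeCun-normal weights, commonly paired with SELU)."""
    std = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, std) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, x, activation=None):
    """Apply one dense layer to a list of floats."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def build_network(n_in, hidden=(64, 64, 64), n_out=6, seed=0):
    """Four dense layers in total: three hidden plus one output (hypothetical sizes)."""
    rng = random.Random(seed)
    sizes = [n_in, *hidden, n_out]
    return [make_dense(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(network, x):
    """SELU on the hidden layers, linear output layer (an assumption)."""
    for layer in network[:-1]:
        x = dense(layer, x, selu)
    return dense(network[-1], x)


if __name__ == "__main__":
    joints = [(0.0, 0.0), (0.0, 2.0), (1.0, 1.0)]  # toy 2D joints, not OpenPose output
    features = signed_distance_features(joints)
    network = build_network(len(features))
    print(forward(network, features))  # untrained weights, so values are meaningless
```

In this sketch the input size grows quadratically with the number of joints (two N×N blocks), which is why a compact joint selection matters for a representation of this kind. An ensemble, as described above, would hold several such networks and combine or select among their outputs.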