FORTH-ModelBasedTracker / MocapNET

We present MocapNET, a real-time method that estimates the 3D human pose directly in the popular Bio Vision Hierarchy (BVH) format, given estimations of the 2D body joints originating from monocular color images. Our contributions include: (a) A novel and compact 2D pose NSRM representation. (b) A human body orientation classifier and an ensemble of orientation-tuned neural networks that regress the 3D human pose, while also allowing for the decomposition of the body into an upper and lower kinematic hierarchy. This permits the recovery of the human pose even in the case of significant occlusions. (c) An efficient Inverse Kinematics solver that refines the neural-network-based solution, providing 3D human pose estimations that are consistent with the limb sizes of a target person (if known). All the above yield a 33% accuracy improvement on the Human 3.6 Million (H3.6M) dataset compared to the baseline method (MocapNET) while maintaining real-time performance.
https://www.youtube.com/watch?v=Jgz1MRq-I-k

command with IK not working / flipped hip rotation output #35

Closed Crackpack closed 3 years ago

Crackpack commented 3 years ago

If I run the following command: ./MocapNET2LiveWebcamDemo --novisualization --from shuffle.webm --ik 0.01 15 40

it gives me the following error: Visualization disabled Incorrect number of arguments, 4 required ( @ --ik )..

Also, I ran: ./MocapNET2LiveWebcamDemo --novisualization --from sample.mp4 --openpose

for the video mentioned and the sample output. My output file is not as good as the sample. Is there something extra that needs to be done for refinement? Here is my output file: https://drive.google.com/file/d/1_49-f8K3uohBCJjK6ElbuCFfrzkFzRla/view?usp=sharing

I am using Ubuntu on Windows OS. Thank you.

AmmarkoV commented 3 years ago

A) For the --ik error

The current master version now requires 3 parameters for the --ik switch; however, it appears you have an older snapshot of the code that required 4 arguments (so at your applicationLogic/parseCommandlineOptions.cpp#L209 it also receives the options->spring variable, which is internally disregarded).

There are two ways to fix this: 1) If you want to keep your current version, just add a zero at the end of your command:

./MocapNET2LiveWebcamDemo --novisualization --from shuffle.webm --ik 0.01 15 40 0

2) If you are ok with updating your current version, use the ./update.sh script or manually pull the latest master version, recompile, and the command you use will work.

B) For the refinement question, as already stated in the README:

However in order to achieve higher accuracy estimations you are advised to set up a full OpenPose instance and use it to acquire JSON files with 2D detections that can be subsequently converted to 3D BVH files using the MocapNET2CSV binary. They will provide superior accuracy compared to the bundled 2D joint detectors which are provided for faster performance in the live demo, since 2D estimation is the bottleneck of the application.

The bundled 2D joint estimators are homebrew, cut-down implementations of OpenPose and other 2D joint estimation networks; they do not use PAFs and use a smaller input resolution. Long story short, "MocapNET2LiveWebcamDemo" is not geared towards offline and accurate processing but rather serves as a fast and portable interactive demo. If you want higher accuracy, you should consider setting up OpenPose and using it to acquire higher quality 2D keypoints that will lead to higher quality 3D poses. I have also added a script that outlines how to process a "sample.mp4" using OpenPose and then through MocapNET. Assuming OpenPose is installed, doing

./dump_and_process_video.sh sample.mp4

it should perform all the required steps.
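For anyone scripting that conversion step themselves: OpenPose writes one JSON file per frame, storing each detected person's 2D joints as a flat [x, y, confidence, ...] array under "pose_keypoints_2d". A rough Python sketch of reading one frame's keypoints (this is just an illustration, not MocapNET's own converter):

```python
import json

def load_pose_keypoints(json_path):
    """Read the first detected person's 2D joints from an OpenPose
    per-frame JSON dump. Returns a list of (x, y, confidence) tuples."""
    with open(json_path) as f:
        frame = json.load(f)
    if not frame.get("people"):
        return []  # no detection in this frame
    flat = frame["people"][0]["pose_keypoints_2d"]
    # OpenPose serializes joints as a flat x, y, confidence triplet list
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]
```

These per-joint tuples are what ultimately get collected into the 2D joint CSV that MocapNET2CSV consumes.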

To test the quality difference against the original OpenPose implementation, consider downloading this 1.5GB dataset and trying it out using the following commands, assuming you are in the main directory. I acquired this dataset by doing

./dump_and_process_video.sh sven.mp4
mkdir frames
cd frames
wget http://ammar.gr/datasets/sven.mp4-data.tar.gz
tar -xf sven.mp4-data.tar.gz
cd ..

./MocapNET2CSV --from frames/sven.mp4-data/2dJoints_v1.4.csv --show 3  --mt
Crackpack commented 3 years ago

Sir, I tried the 1.5GB (http://ammar.gr/datasets/sven.mp4-data.tar.gz) dataset using the following commands, both in Ubuntu on Windows OS and on a different Ubuntu machine:

mkdir frames
cd frames
wget http://ammar.gr/datasets/sven.mp4-data.tar.gz
tar -xf sven.mp4-data.tar.gz
cd ..

./MocapNET2CSV --from frames/sven.mp4-data/2dJoints_v1.4.csv --show 3 --mt

The resulting out.bvh looks the same as: https://drive.google.com/file/d/1_49-f8K3uohBCJjK6ElbuCFfrzkFzRla/view?usp=sharing

The visualization window shows accurate results, but the out.bvh file does not.

AmmarkoV commented 3 years ago

I think the problem you are experiencing is that the camera of the 1.5GB video is not static. If you open the BVH file you sent me and look at it from a camera with a 0-degree pitch, it might look "incorrect", or like it is bending forward

2020-10-19-113353_3840x1080_scrot

However, if you look at the armature from an angle

2020-10-19-113357_3840x1080_scrot

The output is exactly the same as the visualization window output. There is no "camera" tracking, so every change of the camera is reflected and stored as a change in the skeleton position and orientation.

Your brain does the perspective compensation when you watch the visualization window because of the background; however, there are big relative rotation changes happening. Just look at these 2 frames: svenPerspective

If you want to "force alignment" of the BVH to a fixed camera relative to the skeleton, so that you can eliminate position and rotation changes, you can use the included GroundTruthDumper tool:

 ./GroundTruthDumper --from out.bvh --setPositionRotation 0 0 0 0 0 0 --bvh align.bvh

The command above will eliminate the positional and rotational components, "nailing" the armature to 0,0,0, so it will probably be easier to see the motions captured.
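To make the effect of that alignment concrete: in a standard BVH file, the root joint (here the hip) owns the first six values of every frame line in the MOTION section, so "nailing" it to the origin amounts to zeroing those six channels per frame. A rough Python sketch of that idea (not the actual GroundTruthDumper logic; it assumes the root's channels come first, as is conventional):

```python
def nail_root_to_origin(bvh_text):
    """Zero the root joint's 6 channels (position + rotation) in every
    frame of a BVH MOTION section. Sketch only: assumes the root joint
    owns the first six values of each frame line."""
    out = []
    in_frames = False
    for line in bvh_text.splitlines():
        if in_frames and line.strip():
            values = line.split()
            values[0:6] = ["0.0"] * 6  # nail position and rotation to 0,0,0
            out.append(" ".join(values))
        else:
            out.append(line)
            if line.startswith("Frame Time:"):
                in_frames = True  # frame data follows this header line
    return "\n".join(out)
```

The joint-local rotations (everything after the first six values) are left untouched, which is why the captured motion itself survives the alignment.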

Another thing I have added is a --dontbend flag:

./MocapNET2CSV --from frames/sven.mp4-data/2dJoints_v1.4.csv --show 3 --dontbend --mt

This basically limits the pitch of the skeleton to ±10 degrees; this is just to showcase how you can control the relative change.

I also think that an example from a static camera is less complex and ultimately a better example:

mkdir frames
cd frames
wget http://ammar.gr/datasets/shuffle.webm-data.tar.gz
tar -xf shuffle.webm-data.tar.gz
cd ..
./MocapNET2CSV --from frames/shuffle.webm-data/2dJoints_v1.4.csv --show 3 --mt
Crackpack commented 3 years ago

Sir, I tried all the options you mentioned. Somehow I still feel the visualization is way smoother than the BVH file. I also noticed that all the transformations are Euler angles; do you think converting to quaternion transformations instead would give better results? (just a guess). https://drive.google.com/file/d/1E_g8VLqQaNrwNKNrekvp93IgUI0NfREf/view?usp=sharing This is the result of shuffle.webm from: ./MocapNET2CSV --from frames/shuffle.webm-data/2dJoints_v1.4.csv --show 3 --mt

Or maybe something in my configuration is messing up the conversion, I don't know.

AmmarkoV commented 3 years ago

It appears you are right..! There is a problem: for some reason, in the Blender renderer the hip rotation of the skeleton is flipped.

I think the problem is that the Hip joint is CHANNELS 6 Xposition Yposition Zposition Zrotation Yrotation Xrotation (i.e. a ZYX rotation order) instead of Z X Y like the rest of the joints, and I am losing a conversion somewhere..!

2020-10-20-113549_3840x1080_scrot

2020-10-20-113726_3840x1080_scrot

Internally, 4x4 matrices are used (that's why the internal representation is working). You are also right that quaternions are a much better representation (and I already have them implemented); unfortunately, the BVH specification uses Euler angles, so in order to stay compatible with 3D applications etc. I have to use them. However, special care has been taken with the shoulder and hip joints by splitting them, in order to avoid gimbal locks!
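For what it's worth, the gimbal-lock problem with Euler angles is easy to demonstrate numerically. When the middle angle of a three-rotation sequence hits 90 degrees, the outer two rotation axes coincide and a degree of freedom is lost: distinct Euler triplets then produce the identical rotation matrix. A small numpy sketch (purely illustrative, not MocapNET code) for a Z-Y-X sequence:

```python
import numpy as np

def rx(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def ry(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rz(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

# With the middle (Y) angle pinned at 90 degrees, the outer Z and X
# rotations collapse onto one axis: only their difference matters.
half_pi = np.pi / 2
m1 = rz(0.3) @ ry(half_pi) @ rx(0.1)  # Z - X difference = 0.2
m2 = rz(0.5) @ ry(half_pi) @ rx(0.3)  # same difference = 0.2
print(np.allclose(m1, m2))  # True: two different triplets, one rotation
```

This ambiguity is exactly what splitting the shoulder and hip joints avoids: it keeps each sub-joint's angles away from the degenerate configuration.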

AmmarkoV commented 3 years ago

My 3D model math is also correct (as seen in the OpenGL overlay), so I am narrowing down on the problem..

2020-10-20-114956_3840x1080_scrot

AmmarkoV commented 3 years ago

Fixed it, I will now commit the fix..! 2020-10-20-120241_3840x1080_scrot

AmmarkoV commented 3 years ago

Please git pull, rebuild, and try it out! (For sure you don't need the --dontbend flag now :) ) Thank you for opening this issue @Crackpack, you have a good eye! :+1:

Crackpack commented 3 years ago

Yes sir... Now I can confirm, it's way smoother in the BVH too. Sorry for poking too much though. :D

AmmarkoV commented 3 years ago

Hello, I found that this error is actually caused by this erroneous ZYX 4x4 transform here -> https://github.com/AmmarkoV/RGBDAcquisition/blob/master/tools/AmMatrix/matrix4x4Tools.c#L446 , so I will have to re-fix this..!
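For anyone hitting a similar bug: for a BVH CHANNELS line ordered Zrotation Yrotation Xrotation, the composed matrix is R = Rz · Ry · Rx (factors in channel order), and swapping any two factors gives a genuinely different transform, which is exactly how a single wrong rotation-order function flips a joint. A quick numpy check (illustrative only, not the library's actual matrix4x4Tools code):

```python
import numpy as np

def rx(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def ry(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rz(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

# ZYX order: matches CHANNELS ... Zrotation Yrotation Xrotation
rzyx = rz(0.2) @ ry(0.4) @ rx(0.6)
# ZXY order: the ordering used by the other joints in this thread
rzxy = rz(0.2) @ rx(0.6) @ ry(0.4)
print(np.allclose(rzyx, rzxy))  # False: rotation order is not interchangeable
```

Both products are valid rotation matrices; they just rotate to different orientations, so a ZYX channel decoded with ZXY math produces a plausible-looking but wrong pose.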

AmmarkoV commented 3 years ago

Hopefully the issue is resolved; the BVH GUI now appears to correctly draw .bvh files with ZYX rotations, like the ones in https://github.com/ubisoft/ubisoft-laforge-animation-dataset ..

screen-2021-02-05-17-08-22