FORTH-ModelBasedTracker / MocapNET

We present MocapNET, a real-time method that estimates the 3D human pose directly in the popular Bio Vision Hierarchy (BVH) format, given estimations of the 2D body joints originating from monocular color images. Our contributions include: (a) A novel and compact 2D pose NSRM representation. (b) A human body orientation classifier and an ensemble of orientation-tuned neural networks that regress the 3D human pose by also allowing for the decomposition of the body to an upper and lower kinematic hierarchy. This permits the recovery of the human pose even in the case of significant occlusions. (c) An efficient Inverse Kinematics solver that refines the neural-network-based solution providing 3D human pose estimations that are consistent with the limb sizes of a target person (if known). All the above yield a 33% accuracy improvement on the Human 3.6 Million (H3.6M) dataset compared to the baseline method (MocapNET) while maintaining real-time performance.
https://www.youtube.com/watch?v=Jgz1MRq-I-k

HCD module regression / Broken, Symmetric Incorrect arm placement behind torso on MNET3 #116

Open AmmarkoV opened 8 months ago

AmmarkoV commented 8 months ago

It works now, but the output joints from the BVH file are messed up, as you can see in the image below.

Screenshot from 2023-12-10 12-06-47

Originally posted by @justinjohn0306 and @ArEnSc in https://github.com/FORTH-ModelBasedTracker/MocapNET/issues/115#issuecomment-1848873900

AmmarkoV commented 8 months ago

The issue of inverted or broken arms bending the wrong way stems, at its core, from the mathematical ambiguity of regressing a 3D structure from 2D keypoints.

MocapNET receives 2D points as input and regresses a 3D BVH skeleton that corresponds to the 2D observations.

Mathematically, however, it is almost impossible for the method to discern, e.g., the following 4 poses:

symmetries

This is because, although the 3D orientations of the arms are different, the 2D input that leads to these 3D outputs is essentially the same. This is an issue with any 2D-to-3D pose estimation method, MNET3 included. To combat this issue, a number of steps have been taken:
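The ambiguity can be illustrated with a minimal orthographic-projection sketch (toy coordinates, not MocapNET code): two arms that bend forward versus backward differ only in the sign of their depth, so their 2D projections coincide.

```python
# Shoulder, elbow, wrist of an arm bent "forward" (positive depth Z)
# versus the same arm bent "backward" (negative depth Z).
arm_forward  = [(0.0, 0.0, 0.0), (0.0, -0.3, 0.2), (0.0, -0.6, 0.4)]
arm_backward = [(x, y, -z) for (x, y, z) in arm_forward]

def project_2d(points3d):
    """Orthographic projection onto the image plane: drop the depth."""
    return [(x, y) for (x, y, _z) in points3d]

# Both 3D poses collapse to identical 2D keypoints, so a 2D-only input
# cannot tell them apart.
print(project_2d(arm_forward) == project_2d(arm_backward))  # True
```

Any regressor fed only the 2D points must rely on learned priors to pick one of the two depth signs, which is exactly where the wrong-way arm bending creeps in.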

1) Since the days of MNET2, I have isolated a large number of corrupt files from the CMU/BVH Mocap dataset ( https://drive.google.com/file/d/1Zt-MycqhMylfBUqgmW9sLBclNNxoNGqV/view?usp=drive_link , see the needfix subfolder ) after manually going through all the files. These samples were corrupting training by teaching the network to bend arms backwards.
2) I introduced a penalization term in the HCD fine-tuning ( https://github.com/AmmarkoV/RGBDAcquisition/blob/master/opengl_acquisition_shared_library/opengl_depth_and_color_renderer/src/Library/MotionCaptureLoader/ik/bvh_inverseKinematics.c#L456 ) that is enabled by supplying --penalizeSymmetriesHeuristic as a command-line parameter (this implementation is admittedly a little shoddy and slow).
3) While creating the training data, supplying the --filterOccludedJoints flag to the GroundTruthGenerator utility filters out poses with the hands behind the torso, which mitigates this issue:
https://github.com/AmmarkoV/RGBDAcquisition/blob/master/opengl_acquisition_shared_library/opengl_depth_and_color_renderer/src/Library/MotionCaptureLoader/export/bvh_export.c#L123
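The idea behind filtering occluded poses can be pictured with a toy heuristic (joint indices, the margin, and the function name are illustrative, not what GroundTruthGenerator actually does): discard any training pose whose wrists end up behind the hip plane along the camera depth axis.

```python
def hands_behind_torso(pose3d, wrist_ids=(4, 7), hip_id=0, margin=0.05):
    """Toy occlusion heuristic: flag a pose whose wrists lie behind the
    hip plane along the camera depth axis (larger Z = further away)."""
    hip_depth = pose3d[hip_id][2]
    return any(pose3d[w][2] > hip_depth + margin for w in wrist_ids)

poses = [[(0.0, 0.0, 0.0)] * 8 for _ in range(2)]
poses[1][4] = (0.0, 0.0, 0.3)          # one wrist pushed behind the hips
kept = [p for p in poses if not hands_behind_torso(p)]
print(len(kept))  # 1
```

Dropping such ambiguous samples keeps the network from ever being rewarded for placing hands behind the torso when the 2D evidence is compatible with both interpretations.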

The particular dataset used to work ok, but it seems some new optimizations done in the HCD module (for MNET4) now adversely affect MNET3, which returns solutions with incorrect arm orientations.

I know that the cause is the HCD module and not the neural network, since you can disable the IK module with the --noik flag:

```
./MocapNET2CSV --from con0014/2dJoints_v1.4.csv --noik --mt --show 3 --hands --label s10-p01-c0014-f --seriallength 4 --gdpr --penalizeSymmetriesHeuristic
```

As you will see, taking just the raw output of the neural network, the hands bend forward as they should.

TL;DR: Due to changes in the fine-tuning module for MNET4, I seem to have broken it in MNET3. Please consider using this version instead: https://colab.research.google.com/github/FORTH-ModelBasedTracker/MocapNET/blob/mnet4/mocapnet4.ipynb
Clicking the first, third, fifth and sixth play buttons will let you run the same dataset and recover the BVH file, which looks like this:

https://github.com/FORTH-ModelBasedTracker/MocapNET/assets/97630/78916d63-fa43-4591-ac57-c14ec3e6b7e2 screen175

out.bvh.zip

Sorry about the inconvenience; maintaining the same HCD module across 4 different branches (of which MNET4 is written in Python and has yet another wrapper) is a nightmare. However, I will do my best to resolve this.

Looking forward to comments or "better" workarounds, in case you can help with this. The "offending" code is in MocapNET/dependencies/RGBDAcquisition/opengl_acquisition_shared_library/opengl_depth_and_color_renderer/src/Library/MotionCaptureLoader/ik/

justinjohn0306 commented 8 months ago

As you can see with MNET4, the output joints are still messed up. I did a test with the shuffle.webm video, and here's what I've got (check the attachments). Also, it seems like the hip/torso area is locked, so it's having a hard time trying to animate the whole thing freely.

test1

https://github.com/FORTH-ModelBasedTracker/MocapNET/assets/34035011/6feddba8-43cd-4aa7-93ea-c8c717abfa3a

justinjohn0306 commented 8 months ago

What am I doing wrong here? Also, does it support OpenPose's 25 model outputs with the face and the whole body?

AmmarkoV commented 8 months ago

Sorry about this; you are not doing anything wrong. I think this is caused by programmatically setting the abdomen and chest BVH joints to zero to nail the armature ( done here: https://github.com/FORTH-ModelBasedTracker/MocapNET/blob/mnet4/src/python/mnet4/MocapNET.py#L600 ). The mnet4 code on GitHub was automatically copied from my dev branch, and this got carried over.
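For readers following along, "nailing" a joint in a BVH motion stream amounts to overwriting its rotation channels with zeros on every frame. A minimal sketch (the joint and channel names are illustrative, not MocapNET's actual labels):

```python
def nail_joints(frames, channel_labels, joints_to_nail):
    """Zero every rotation channel belonging to the given joints.
    `frames` is a list of per-frame channel value lists and
    `channel_labels` names each column, e.g. 'chest_Xrotation'."""
    nailed = {i for i, label in enumerate(channel_labels)
              if label.rsplit("_", 1)[0] in joints_to_nail
              and label.endswith("rotation")}
    return [[0.0 if i in nailed else v for i, v in enumerate(frame)]
            for frame in frames]

labels = ["hip_Xposition", "abdomen_Zrotation", "chest_Xrotation"]
frames = [[1.0, 15.0, -8.0]]
print(nail_joints(frames, labels, {"abdomen", "chest"}))
# [[1.0, 0.0, 0.0]]
```

If such zeroing leaks into an export path where it is not intended, the torso looks "locked" exactly as reported above.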

Unfortunately due to preparing my PhD thesis and being on back-to-back trips for the last 3 months I am having trouble maintaining the code, resulting in the sub-par quality and problems you encountered.

That being said I will try to fix this..

AmmarkoV commented 8 months ago

Another issue I remembered, though I am not sure whether it is related, is this: https://projects.blender.org/blender/blender-addons/issues/104549 The Blender BVH importer uses a different internal rotation representation (quaternions?), and the nailed bones combined with this secondary issue could lead to what you observe.
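Why the internal rotation representation matters can be shown with a small sketch: the same three Euler angles produce different orientations depending on the rotation order, so an importer that silently assumes a different order (or converts through quaternions with another convention) will pose the skeleton differently.

```python
import math

def rot(axis, deg):
    """3x3 rotation matrix about a coordinate axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return {"X": [[1, 0, 0], [0, c, -s], [0, s, c]],
            "Y": [[c, 0, s], [0, 1, 0], [-s, 0, c]],
            "Z": [[c, -s, 0], [s, c, 0], [0, 0, 1]]}[axis]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# BVH files commonly declare a "Zrotation Xrotation Yrotation" order per
# joint; applying the same angles in a different order gives another pose.
zxy = matmul(matmul(rot("Z", 60), rot("X", 30)), rot("Y", 45))
xyz = matmul(matmul(rot("X", 30), rot("Y", 45)), rot("Z", 60))
print(zxy == xyz)  # False
```

So even a structurally valid BVH file can look wrong after import if the consumer's rotation-order assumptions differ from the exporter's.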

AmmarkoV commented 8 months ago

> Also, does it support OpenPose's 25 model outputs with the face and the whole body?

Yes. The code uses:
- OpenPose BODY25 for the body ( https://github.com/FORTH-ModelBasedTracker/MocapNET/blob/mnet4/src/python/mnet4/holisticPartNames.py#L477 )
- the standard IBUG 69 point face ( https://github.com/FORTH-ModelBasedTracker/MocapNET/blob/mnet4/src/python/mnet4/holisticPartNames.py#L572 )
- the OpenPose hand landmarks ( https://github.com/FORTH-ModelBasedTracker/MocapNET/blob/mnet4/src/python/mnet4/holisticPartNames.py#L513 )

That being said, mediapipe is used as the pose estimation source in the Google Colab since it is faster and more portable, and the mediapipe joints are "cast" to the OpenPose ones. In principle, any 2D joint source can be used, as long as you maintain the order/labels in the links above.
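Such a "cast" boils down to an index remap from the mediapipe landmark order to the BODY25 slot order. The partial mapping below is purely illustrative (the real one lives in holisticPartNames.py and covers far more landmarks):

```python
# Partial, illustrative remap: mediapipe pose-landmark index -> BODY25 slot.
MP_TO_BODY25 = {
    0: 0,    # nose
    12: 2,   # right shoulder
    14: 3,   # right elbow
    16: 4,   # right wrist
    11: 5,   # left shoulder
    13: 6,   # left elbow
    15: 7,   # left wrist
}

def cast_to_body25(mp_points, n_body25=25):
    """Place mediapipe 2D points into BODY25 order; unmapped slots
    stay at (0, 0), which downstream code treats as 'not detected'."""
    out = [(0.0, 0.0)] * n_body25
    for mp_idx, op_idx in MP_TO_BODY25.items():
        if mp_idx < len(mp_points):
            out[op_idx] = mp_points[mp_idx]
    return out
```

Any alternative 2D detector can be plugged in the same way, as long as its joints land in the expected BODY25 positions.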

AmmarkoV commented 8 months ago

> As you can see with MNET4, the output joints are still messed up. I did a test with the shuffle.webm video, and here's what I've got (check the attachments). Also, it seems like the hip/torso area is locked, so it's having a hard time trying to animate the whole thing freely.

Just an update: this seems to be an issue with the Blender script and not the main MocapNET BVH file, and it is caused by some joints not being set to mirror, so they remain set to 0.

https://github.com/FORTH-ModelBasedTracker/MocapNET/blob/mnet4/src/python/blender/blender_mocapnet.py
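The mirroring in question can be sketched in isolation: under a common convention, reflecting a joint rotation across the left/right (sagittal) plane keeps the X Euler component and negates Y and Z. This is an illustrative sketch, not the actual logic in blender_mocapnet.py:

```python
def mirror_rotation(euler_xyz_deg):
    """Mirror a joint rotation across the sagittal plane: keep X,
    negate Y and Z (a common convention; rigs may differ)."""
    x, y, z = euler_xyz_deg
    return (x, -y, -z)

print(mirror_rotation((10.0, 20.0, -5.0)))  # (10.0, -20.0, 5.0)
```

A joint that is skipped by such a pass keeps its default (0, 0, 0) rotation, which matches the "remaining set to 0" symptom described above.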

justinjohn0306 commented 8 months ago

> As you can see with MNET4, the output joints are still messed up. I did a test with the shuffle.webm video, and here's what I've got (check the attachments). Also, it seems like the hip/torso area is locked, so it's having a hard time trying to animate the whole thing freely.
>
> Just an update: this seems to be an issue with the Blender script and not the main MocapNET BVH file, and it is caused by some joints not being set to mirror, so they remain set to 0.
>
> https://github.com/FORTH-ModelBasedTracker/MocapNET/blob/mnet4/src/python/blender/blender_mocapnet.py

Just to make sure this is the case, I opened the generated BVH file using a third-party BVH utility, BVHacker.

https://github.com/FORTH-ModelBasedTracker/MocapNET/assets/34035011/2e64a0bc-ce97-48e2-9ac5-e29e3a2e34ba

And here's the debug output video generated:

https://github.com/FORTH-ModelBasedTracker/MocapNET/assets/34035011/d713f421-b8f5-4291-900a-4d349ef0ff66
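For anyone who wants to verify a generated BVH file without a GUI tool, a minimal structural sanity check can be sketched as follows (a toy check, not a full BVH parser): each MOTION frame line must carry exactly as many values as the hierarchy declares CHANNELS in total.

```python
BVH_SAMPLE = """HIERARCHY
ROOT hip
{
  OFFSET 0 0 0
  CHANNELS 6 Xposition Yposition Zposition Zrotation Xrotation Yrotation
  End Site
  {
    OFFSET 0 1 0
  }
}
MOTION
Frames: 1
Frame Time: 0.04
0 0 0 0 0 0
"""

def check_bvh_motion(text):
    """Return True when every MOTION frame line has as many values as
    the total CHANNELS count declared in the hierarchy."""
    lines = text.splitlines()
    n_channels = sum(int(l.split()[1]) for l in lines
                     if l.strip().startswith("CHANNELS"))
    motion_at = next(i for i, l in enumerate(lines) if l.strip() == "MOTION")
    # Skip the "Frames:" and "Frame Time:" lines after MOTION.
    frame_lines = [l for l in lines[motion_at + 3:] if l.strip()]
    return all(len(l.split()) == n_channels for l in frame_lines)

print(check_bvh_motion(BVH_SAMPLE))  # True
```

A file that passes this check but still looks wrong in a viewer points at rotation-convention or import-script issues rather than a malformed export.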

ArEnSc commented 7 months ago

Hey, yeah, I am experiencing the same problem! I thought this was stable!?

notbugnotwork commented 2 months ago

> As you can see with MNET4, the output joints are still messed up. I did a test with the shuffle.webm video, and here's what I've got (check the attachments). Also, it seems like the hip/torso area is locked, so it's having a hard time animating the whole thing freely.
>
> Just an update: this seems to be an issue with the Blender script and not the main MocapNET BVH file, and it is caused by some joints not being set to mirror, so they remain set to 0.
>
> https://github.com/FORTH-ModelBasedTracker/MocapNET/blob/mnet4/src/python/blender/blender_mocapnet.py

So what can I do in Blender? Should I write a script to solve the problem? Can you tell me how to finish the code? I have no idea where to start.