menyifang / MIMO

Official implementation of "MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling"
Apache License 2.0

Which pretrained human repose model was used? #22

Open luopeixiang opened 1 month ago

luopeixiang commented 1 month ago

In the "Canonical identity" section, you mention using "a pretrained human repose model" to transform the posed human image to the canonical A-pose result. However, I couldn't find which specific model was used.

Could you share which pretrained model you used for this? It'd be super helpful for understanding and potentially reproducing the work.

johndpope commented 1 month ago

SMPL-X: https://github.com/johndpope/MIMO-hack https://github.com/johndpope/MIMO-hack/blob/main/smplx_pose_frame_0.png https://github.com/johndpope/MIMO-hack/blob/main/test_motion.py

I got stuck on the En3D stuff.

luopeixiang commented 1 month ago

I believe the pretrained human repose model mentioned in the paper doesn't utilize any 3D-related algorithms. Its input is a human image in any pose, and the output is the same person in a standard A-pose.

It's likely one of the models listed in this repository:

https://github.com/Zhangjinso/Awesome-pose-transfer?tab=readme-ov-file

I plan to try out the CFLD model from this collection to see its performance.

https://github.com/YanzuoLu/CFLD
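
For context, 2D pose-transfer models of this kind generally take a source image plus a target pose, so reposing to an A-pose would amount to supplying a fixed A-pose skeleton as the target. Below is a minimal sketch of building such a target, assuming a 14-joint 2D layout. The joint names, the coordinates, and the `a_pose_keypoints` helper are all illustrative assumptions; none of them come from MIMO or CFLD, and a real model would need the keypoints in its own format.

```python
from typing import Dict, Tuple

# Hypothetical canonical A-pose in normalized image coordinates (x, y),
# with (0, 0) at the top-left corner. The values are illustrative only.
# "right"/"left" refer to the person's own side, facing the camera.
A_POSE_NORMALIZED: Dict[str, Tuple[float, float]] = {
    "nose": (0.50, 0.10),
    "neck": (0.50, 0.20),
    "right_shoulder": (0.40, 0.22),
    "right_elbow": (0.33, 0.38),
    "right_wrist": (0.27, 0.53),
    "left_shoulder": (0.60, 0.22),
    "left_elbow": (0.67, 0.38),
    "left_wrist": (0.73, 0.53),
    "right_hip": (0.45, 0.52),
    "right_knee": (0.44, 0.72),
    "right_ankle": (0.43, 0.92),
    "left_hip": (0.55, 0.52),
    "left_knee": (0.56, 0.72),
    "left_ankle": (0.57, 0.92),
}


def a_pose_keypoints(width: int, height: int) -> Dict[str, Tuple[int, int]]:
    """Scale the normalized A-pose to pixel coordinates for a given image size."""
    return {
        name: (round(x * width), round(y * height))
        for name, (x, y) in A_POSE_NORMALIZED.items()
    }


if __name__ == "__main__":
    # The resulting dict would serve as the target pose for a pose-transfer model.
    target_pose = a_pose_keypoints(256, 512)
    for joint, xy in target_pose.items():
        print(joint, xy)
```

The same fixed target can be reused for every input image, which is what makes the output "canonical".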

johndpope commented 1 month ago

Maybe detectron2? https://github.com/johndpope/MIMO-hack/blob/1c8b2d8bd935dc23e969a6af533eb18c32805e1d/utils.py#L172 Though arguably the Sapiens stuff supersedes this model.

richieliuse commented 1 month ago

I've tried CFLD. The results look bad on in-the-wild data.

1zeryu commented 1 week ago

I've tried CFLD. Thx!