haofanwang / CLIFF

This repo equips the official CLIFF [ECCV 2022 Oral] with a better detector and a better tracker. It supports multi-person, motion interpolation, motion smoothing, and SMPLify fitting.
Apache License 2.0
147 stars, 16 forks

Question about the infill && smooth stage: can these work on pose and global_t? #5

Closed jihg88 closed 1 year ago

jihg88 commented 1 year ago

Thanks for your nice work. I ran a small test and got the npz results, but I don't know the meaning of pred_joints. Are these joint values 3D coordinates?

I also find that the mesh video produced after the infill && smooth stage is the same as the one without infill. Does this stage only work on pred_joints? I need to use your results to generate a bvh file, and the project I use for that (https://github.com/KosukeFukazawa/smpl2bvh) only needs the pose and global_t info. Because of jitter, the bvh file looks bad, so I want to know how I can smooth the pose and global_t.

jihg88 commented 1 year ago

I checked the infill and smooth part. demo.py line 332 reads "existed_list = copy.copy(choose_frame)", and choose_frame actually contains all the frames in the video, so no frame is infilled or smoothed and the result is the same as the original.

haofanwang commented 1 year ago
  1. pred_joints are 3D coordinates in the SMPL joint format.
  2. The infill and smooth stage currently only works on pred_joints, not on the SMPL pose and translation, but it should work for them too.
  3. choose_frame is not all the frame ids of the video; it only holds the frame indices of a specific person.
jihg88 commented 1 year ago

Thanks for your reply. I still have some questions about choose_frame. In demo.py lines 319-326, for my test video 'person' == 0 (single person), so person_id == person holds, and every frame of the video is appended to choose_frame. Should line 331 be placed after line 337?

haofanwang commented 1 year ago

If you only have one person in the video, then choose_frame will contain the same frames as detection_all. The first item of detection_all[i] is the frame index, so choose_frame should look like [0, 1, 2, ..., N]. If some detections are missing, the length of choose_frame will be less than the length of imgs (the total number of video frames); otherwise infill does nothing.
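The behaviour described above can be sketched as follows. This is a minimal illustration, not the code in demo.py: `infill_linear`, `frame_ids`, `values`, and `num_frames` are hypothetical names, and the per-frame values are assumed to be an array of shape (len(frame_ids), D), for example global_t with D = 3.

```python
import numpy as np

def infill_linear(frame_ids, values, num_frames):
    """Linearly interpolate per-frame values over frames with no detection.

    frame_ids : sorted (increasing) frame indices where the person was detected
    values    : array of shape (len(frame_ids), D)
    num_frames: total number of frames in the video
    Returns (filled, missing); filled has shape (num_frames, D).
    """
    frame_ids = np.asarray(frame_ids)
    values = np.asarray(values, dtype=float)
    all_ids = np.arange(num_frames)
    # Frames that have no detection for this person.
    missing = np.setdiff1d(all_ids, frame_ids)
    # np.interp works on 1-D data, so interpolate each channel separately.
    # Frames before the first or after the last detection keep the end values.
    filled = np.stack(
        [np.interp(all_ids, frame_ids, values[:, d]) for d in range(values.shape[1])],
        axis=1,
    )
    return filled, missing

# Frame 2 has no detection, so it is the only frame that gets a new value.
ids = [0, 1, 3, 4]
t = np.array([[0.0, 0.0, 5.0], [1.0, 0.0, 5.0], [3.0, 0.0, 5.0], [4.0, 0.0, 5.0]])
filled, missing = infill_linear(ids, t, num_frames=5)
```

When every frame has a detection, `missing` is empty and `filled` equals the input, which matches the "infill does nothing" case. Interpolating axis-angle pose values channel by channel in this way is only an approximation of rotation interpolation.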

haofanwang commented 1 year ago

By the way, I have tested infilling the SMPL poses and translations. It works fine, and the motion is much smoother. You can visualize it in Blender.

jihg88 commented 1 year ago

Do you mean that infill and smooth only take effect on missing frames? The infill is a linear interpolation, and the smooth stage then makes the infilled result smoother (correct me if I am wrong). So is there a way to reduce the pose jitter? I have found that many models have this problem, even though their performance metrics (like MPJPE) are great. Do 3D human pose models only focus on the 3D coordinates? (Forgive me for not knowing much.)
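For the jitter question above, one simple option is a centered moving average over time, applied to every frame rather than only to infilled ones. The sketch below is an assumption-laden illustration and not the smoothing used in this repo: `smooth_moving_average` and `window` are hypothetical names, and the input is assumed to be an array of shape (N, D), such as global_t with D = 3 or a pose stored as flattened per-frame axis-angle vectors.

```python
import numpy as np

def smooth_moving_average(values, window=5):
    """Smooth an (N, D) sequence along time with a centered moving average."""
    values = np.asarray(values, dtype=float)
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    pad = window // 2
    # Repeat the first and last frames so the output keeps length N.
    padded = np.pad(values, ((pad, pad), (0, 0)), mode="edge")
    kernel = np.ones(window) / window
    # Filter each channel independently; "valid" returns exactly N samples.
    return np.stack(
        [np.convolve(padded[:, d], kernel, mode="valid") for d in range(values.shape[1])],
        axis=1,
    )
```

A larger `window` removes more jitter but also flattens fast motion. Averaging axis-angle values channel by channel is only reasonable when consecutive frames are close; converting to quaternions or rotation matrices before filtering is a more principled alternative.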

jihg88 commented 1 year ago

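Thank you for your work. I tested smoothing only the pose, and the jitter did improve. There is one more problem, though: the front-back (depth) component of global_t seems unstable. In the bvh file I generated, the front view looks good, but the side view shows shaking, and the amplitude is quite large. Is this normal?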

haofanwang commented 1 year ago

It is normal. I also find that the global translation is not stable.
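To check whether the instability is concentrated in one axis, a rough diagnostic is the mean absolute second difference of global_t per axis. This is a minimal sketch under stated assumptions: `per_axis_jitter` is a hypothetical helper, and global_t is assumed to be an (N, 3) array whose last component is depth, which should be verified against the convention actually used in the npz output.

```python
import numpy as np

def per_axis_jitter(global_t):
    """Mean absolute second difference of an (N, 3) translation sequence, per axis."""
    global_t = np.asarray(global_t, dtype=float)
    # The second difference approximates per-frame acceleration; smooth motion keeps it small.
    accel = np.diff(global_t, n=2, axis=0)
    return np.abs(accel).mean(axis=0)
```

If the value for the depth axis is much larger than for the other two, smoothing that axis more strongly than the others is one option to try.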