JiahuiLei / GART

GART: Gaussian Articulated Template Models
https://www.cis.upenn.edu/~leijh/projects/gart/
MIT License

Results on UBC data is noisy #2

Open weihaosky opened 9 months ago

weihaosky commented 9 months ago

Hi, thanks for this excellent work. The results on people_snapshot_public are quite good, but when I run on UBC data the results are quite noisy, as shown here: [image: cano-pose]. May I ask where the problem is? Thanks.

JiahuiLei commented 9 months ago

Yes, we do observe noisy T-pose results, but the training poses, and poses close to those in the input video, look better. Our guess is that highly dynamic clothing is hard to generalize to novel poses given the very limited observations in UBC. Also note that in-the-wild UBC sequences with dynamic clothing are far more challenging than People-Snapshot and ZJU-MoCap: the estimated poses are inaccurate and quite noisy, the pose distribution is very narrow, and the video provides only a few side-view frames. That said, the baseline does even worse. We hope our first small step reveals some new challenges for this in-the-wild problem.

Another thing you may notice in our code is that I intentionally left the SD guidance and the real video-fitting steps in one file. Our very early results suggest that a hybrid of SD guidance and real-video fitting helps a lot with this issue, but I haven't had time to implement it in the code release.
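
For readers wondering what such a hybrid could look like, here is a minimal sketch. It is an illustration only and is not taken from the GART code release. It assumes the hybrid is a weighted sum of a video-fitting loss and an SD guidance loss, with the guidance weight ramped up linearly over training. The names `hybrid_loss` and `guidance_weight_at`, the default weights, and the linear schedule are all hypothetical.

```python
def guidance_weight_at(step: int, total_steps: int,
                       w_start: float = 0.0, w_end: float = 0.1) -> float:
    """Linearly ramp the guidance weight from w_start to w_end.

    Assumed schedule for illustration; the thread does not specify one.
    """
    if total_steps <= 1:
        return w_end
    # Clamp training progress to [0, 1] so out-of-range steps stay bounded.
    t = min(max(step / (total_steps - 1), 0.0), 1.0)
    return w_start + t * (w_end - w_start)


def hybrid_loss(fit_loss: float, guidance_loss: float,
                guidance_weight: float) -> float:
    """Combine the real-video fitting loss with a weighted guidance loss.

    Both inputs are plain scalars here; in a real training loop they would
    be differentiable tensors produced by the two existing steps.
    """
    if guidance_weight < 0:
        raise ValueError("guidance_weight must be non-negative")
    return fit_loss + guidance_weight * guidance_loss


if __name__ == "__main__":
    total_steps = 5
    for step in range(total_steps):
        w = guidance_weight_at(step, total_steps)
        # Placeholder loss values stand in for the two real loss terms.
        print(step, round(w, 4), round(hybrid_loss(1.0, 2.0, w), 4))
```

With a weight of zero this reduces to plain video fitting, so the schedule lets the observed frames dominate early and brings in the guidance term later for poses the video does not cover. Whether that ordering is the right one is an open choice, not something the thread states.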