SiTH-Diffusion / SiTH

[CVPR 2024] SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion
https://ait.ethz.ch/sith
MIT License

Result variation, is it expected? #3

Closed: hangg7 closed this issue 2 months ago

hangg7 commented 2 months ago

Hi @azuxmioy @SiTH-Diffusion,

Congrats on your great work and thanks for releasing your code and providing a nice demo for us to play with. It is super helpful to understand and use your method!

I managed to get SiTH running on a bunch of my own test images, but the results seem to vary a lot with pose. E.g., for the same person in different poses from a video, SiTH works on some "simple" poses (almost standing, similar to your teaser figure) but fails on others. Here is what I mean:

A simple pose works alright: [image]

A harder pose, like crossed arms, breaks: [image]

But not all the time. Here is basically a crossed-arm pose in the opposite direction, and it works to some degree: [image]

My question: is this behavior expected? Have you seen anything like this before?

hangg7 commented 2 months ago

Also FYI, I was trying to use your hohs/SiTH-diffusion-2000 model but have run into this issue:

OSError: hohs/SiTH_diffusion-2000 is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login`.

Logging in doesn't solve this.

azuxmioy commented 2 months ago

Hi,

> My question: is this behavior expected? Have you seen anything like this before?

This is expected, since the model is fine-tuned on only 500 3D scans. The bottleneck lies in unseen and challenging poses.

The sweet spot of our pipeline is front-facing human bodies. Reconstructing humans in challenging poses is not the focus because (1) pose fitting may be challenging and ambiguous, and (2) textures in self-occluded regions cannot be generated.

We will soon release our training code and you are welcome to analyze the robustness to unseen poses in more detail.

> Also FYI, I was trying to use your hohs/SiTH-diffusion-2000 model but have run into this issue

The models are located at https://huggingface.co/hohs/SiTH-diffusion-2000. Perhaps you can download the models and load them locally.

Thanks!
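For readers who hit the same error, here is a minimal sketch of the "download and load locally" suggestion. It assumes the `huggingface_hub` package is installed. The helper name `resolve_model_dir` and the `checkpoints` folder are hypothetical and not part of the SiTH code. Note that the error above names `hohs/SiTH_diffusion-2000` (underscore) while the URL uses a hyphen. The thread does not confirm whether that mismatch caused the error.

```python
from pathlib import Path

# Identifier taken from the Hugging Face URL given in this thread.
REPO_ID = "hohs/SiTH-diffusion-2000"


def resolve_model_dir(repo_id: str, local_root: str = "checkpoints") -> str:
    """Return a local folder with the model files, downloading them if needed.

    Hypothetical helper for illustration; `local_root` is an assumed location.
    """
    target = Path(local_root) / repo_id.split("/")[-1]
    # Reuse an existing, non-empty folder instead of downloading again.
    if target.is_dir() and any(target.iterdir()):
        return str(target)
    # Imported lazily so the cached path works without huggingface_hub.
    from huggingface_hub import snapshot_download

    # Downloads every file in the repo into `target` and returns its path.
    return snapshot_download(repo_id=repo_id, local_dir=str(target))
```

Calling `resolve_model_dir(REPO_ID)` returns a folder path. Since the error text says the loader accepts "a local folder", passing this path in place of the model identifier should work, though how the SiTH scripts take that argument is not shown in this thread.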

hangg7 commented 2 months ago

Thanks for your clarification! That makes sense 😄