NVlabs / affordance_diffusion

Codes for "Affordance Diffusion: Synthesizing Hand-Object Interactions"
https://github.com/NVlabs/affordance_diffusion/blob/master
101 stars 5 forks

predict hand poses #19

Open yangfan293 opened 4 weeks ago

yangfan293 commented 4 weeks ago

Thank you for your wonderful work! I would like to ask about this sentence: "Our method outperforms generic image generation baselines, and the extracted hand poses from our HOI synthesis are favored in user studies against baselines that are trained to directly predict hand poses." What does it mean? I ask because using your method to "predict hand poses" seems like a reasonable thing to do. What do you see as the difference between the two tasks, and why did you come to this conclusion? Or what is difficult about using your method to "predict hand poses"?