yohanshin / WHAM


Bone representation in real time #78

Open nvog1 opened 7 months ago

nvog1 commented 7 months ago

Dear authors,

First of all, congratulations on your great work; I think it will lead to great advancements.

Is there a way of getting the results frame by frame instead of all at once at the end? What I want to do is update the pose of a SMPL model in Unity in real time while a camera records live, by sending the joint positions and rotations as the video is captured. Could that be done with WHAM? I think it could, but I'm very new to neural networks and I was wondering if you could give me some guidance.

Thank you very much for your attention.

yohanshin commented 7 months ago

Hi @nvog1 ,

I believe this should be feasible. Since WHAM only uses information from the current frame and the history, you can slightly modify the inference code to run it as online estimation.

However, depending on the inference speed of ViTPose, SLAM, and the image encoder, the overall pipeline may not run faster than 30 fps. So I would suggest allowing a few hundred milliseconds of latency and running it with a small batch size, such as 4.
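In case it helps, here is a rough sketch of what that buffering could look like. `run_wham_on_chunk`, the frame source, and the result keys (`"body_pose"`, `"transl"`) are placeholders, not the actual API of this repo; the idea is just to accumulate a few frames, run inference on the chunk, and stream the per-frame SMPL parameters to Unity (e.g. over UDP):

```python
import json
import socket

# Hypothetical sketch: buffer incoming frames into small chunks (e.g. 4 frames),
# run inference per chunk, and stream the resulting SMPL poses to Unity over UDP.
# `run_wham_on_chunk` stands in for a wrapper around the existing inference code
# (ViTPose + SLAM + image encoder + WHAM); it is not the repo's actual API.

CHUNK_SIZE = 4                         # latency is roughly CHUNK_SIZE / camera fps
UNITY_ADDR = ("127.0.0.1", 9000)       # wherever the Unity listener is running

def stream_poses(frame_source, run_wham_on_chunk):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    chunk = []
    for frame in frame_source:              # frames arriving from the live camera
        chunk.append(frame)
        if len(chunk) < CHUNK_SIZE:
            continue                         # wait until a full chunk is buffered
        results = run_wham_on_chunk(chunk)   # assumed: list of per-frame SMPL params
        for res in results:
            msg = json.dumps({
                "body_pose": res["body_pose"],   # joint rotations (e.g. axis-angle)
                "transl": res["transl"],         # root translation
            })
            sock.sendto(msg.encode("utf-8"), UNITY_ADDR)
        chunk.clear()
```

On the Unity side you would listen on the same port and apply the received rotations and translation to the SMPL rig each time a packet arrives.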