Open nvog1 opened 7 months ago
Hi @nvog1 ,
I believe this should be feasible. Since every operation in WHAM uses only information from the current frame and the history, you can slightly modify the inference code to run online estimation.
However, depending on the inference speed of ViTPose, SLAM, and the image encoder, the actual runtime may not reach 30 fps. So I would suggest allowing a few hundred milliseconds of latency and running it with a small batch size, such as 4.
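To illustrate the small-batch idea, here is a minimal Python sketch of a buffering loop. It is not WHAM's actual inference code: `stream_in_micro_batches`, `infer_batch`, and `dummy_infer` are hypothetical names, and `infer_batch` stands in for whatever function would wrap the detector, image encoder, and motion network while passing the recurrent history along as `state`. Note that buffering 4 frames from a 30 fps camera already adds roughly 4/30 ≈ 0.13 s of delay before any inference time is counted.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple, TypeVar

Frame = TypeVar("Frame")
State = TypeVar("State")
Result = TypeVar("Result")


def stream_in_micro_batches(
    frames: Iterable[Frame],
    infer_batch: Callable[[List[Frame], Optional[State]], Tuple[List[Result], State]],
    batch_size: int = 4,
) -> Iterator[Result]:
    """Yield one result per frame, processing frames in small batches.

    `infer_batch` is a hypothetical callable (an assumption, not a WHAM API):
    it receives a list of frames plus the state from the previous batch and
    returns the per-frame results together with the updated state.
    """
    state: Optional[State] = None  # no history before the first batch
    batch: List[Frame] = []
    for frame in frames:
        batch.append(frame)
        if len(batch) == batch_size:
            results, state = infer_batch(batch, state)
            yield from results  # e.g. send each result to the consumer here
            batch = []
    if batch:
        # Flush the frames left over when the stream ends.
        results, state = infer_batch(batch, state)
        yield from results


def dummy_infer(batch: List[int], state: Optional[int]) -> Tuple[List[int], int]:
    """Stand-in model: a running sum, so the carried state is easy to check."""
    total = 0 if state is None else state
    outputs = []
    for value in batch:
        total += value
        outputs.append(total)
    return outputs, total


if __name__ == "__main__":
    for result in stream_in_micro_batches(range(1, 7), dummy_infer, batch_size=4):
        print(result)
```

In a real setup, `frames` would be a generator reading from the camera, and each yielded result would be serialized and sent to the Unity client. A smaller `batch_size` lowers the buffering delay but gives the model less to process per call.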
Dear authors,
First of all, congratulations on your great work. I think it will lead to great advancements.
Is there a way to get the results frame by frame instead of all at once at the end? What I want to do is update the pose of an SMPL model in Unity in real time from a live camera, by sending the joint positions and rotations as the video is recorded. Could that be done with WHAM? I think it could, but I'm very new to neural networks and I was wondering if you could give me some guidance.
Thank you very much for your attention.