cleong110 / semantic-sign-language-search

Search a folder of sign language videos semantically
MIT License

Try different Pose Estimators? #14

Open cleong110 opened 1 month ago

cleong110 commented 1 month ago

Sapiens

Amit says:

If you wanna run Sapiens, I already made the effort to figure it out, but not to convert the output to .pose: https://github.com/facebookresearch/sapiens/issues/125

If you care specifically about hands, I made a benchmark: https://github.com/sign-language-processing/3d-hands-benchmark
If you run 2D models, you can calculate CCE, and if you run 3D models, you can calculate MACE.

cleong110 commented 1 month ago

Saul on Discord:

10/13/2024 9:56 PM
Thanks @Colin Leong for the presentation. I noticed @jnemecek mentioned [DWPose](https://github.com/IDEA-Research/DWPose) which is nice and easy to use, and outputs in COCO-WholeBody format. 
I tried it out in the Chameleon pipeline, but the DWPose hand tracking performs worse than the model I trained years ago, and DWPose is over ~10x the model size.

Also noticed that the [SOTA rankings for COCO-WholeBody](https://paperswithcode.com/sota/2d-human-pose-estimation-on-coco-wholebody-1) say DWPose is effectively 2nd place behind a [2024 "Sapiens" model](https://github.com/facebookresearch/sapiens), which is a very large transformer model but looks like it produces great output.
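
Side note for anyone comparing hand tracking across estimators that output COCO-WholeBody: the format has 133 keypoints per person (17 body, 6 feet, 68 face, 21 per hand), so the hands can be sliced out by index before scoring. Below is a minimal sketch of that split. The function name `split_wholebody` and the dummy `(x, y, confidence)` input are hypothetical and only for illustration; this is not DWPose's or Sapiens's actual output API, and it assumes the keypoints are already in the standard COCO-WholeBody order.

```python
# COCO-WholeBody index layout: 133 keypoints per person.
COCO_WHOLEBODY_SLICES = {
    "body": slice(0, 17),          # 17 COCO body keypoints
    "feet": slice(17, 23),         # 6 foot keypoints
    "face": slice(23, 91),         # 68 face keypoints
    "left_hand": slice(91, 112),   # 21 left-hand keypoints
    "right_hand": slice(112, 133), # 21 right-hand keypoints
}


def split_wholebody(keypoints):
    """Split one person's 133 keypoints into named groups.

    `keypoints` is any sequence of length 133, assumed to be in
    standard COCO-WholeBody order (hypothetical helper, not a library API).
    """
    if len(keypoints) != 133:
        raise ValueError(f"expected 133 keypoints, got {len(keypoints)}")
    return {name: keypoints[s] for name, s in COCO_WHOLEBODY_SLICES.items()}


# Dummy input: (x, y, confidence) triples with made-up values.
dummy = [(float(i), float(i), 1.0) for i in range(133)]
parts = split_wholebody(dummy)
print({name: len(points) for name, points in parts.items()})
```

With the hands isolated like this, the same per-hand comparison can be run on each estimator's output regardless of which model produced it.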