Hi @AK391, thanks for setting up the demo! A couple of suggestions, if you're interested:
I see you used a Totoro image as the example; the pose estimator actually works better with larger full-body images. I'd recommend picking some from danbooru, or using the megumin example in ./_samples/megumin.png (sauce).
The pose estimation itself is the focus of the paper, but in my opinion the tagging and pose-based retrieval are cooler. I (humbly) think the tagger is the best one out there, and the retrieval is a neat application. (The segmentation is kinda meh.) If you're up for it, demos for those two would be great.
Thanks again for the demo; I think it makes the work a lot more visible to people!
Web Demo on Huggingface Spaces using Gradio