remyxai / VQASynth

Compose multimodal datasets 🎹
https://twitter.com/smellslikeml/status/1756723056675094726
217 stars, 13 forks

Add Human Pose Estimation pipeline #31

Open smellslikeml opened 1 week ago

smellslikeml commented 1 week ago

Expand on https://huggingface.co/datasets/salma-remyx/PoseText to generate QA pairs for fine-tuning Molmo on body keypoint estimation.
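A minimal sketch of what the QA-pair generation step might look like, not the actual VQASynth pipeline. It assumes each sample provides named keypoints in pixel coordinates plus the image size; the PoseText schema is not described here, so the input layout and the `keypoints_to_qa` helper are hypothetical. The answer string imitates a Molmo-style point tag with coordinates as percentages of the image size, which is also an assumption to check against the model's documentation.

```python
from typing import Dict, List, Tuple


def keypoints_to_qa(
    keypoints: Dict[str, Tuple[float, float]],
    width: int,
    height: int,
) -> List[Dict[str, str]]:
    """Turn pixel-space keypoints into question/answer pairs.

    Hypothetical helper: the input layout (name -> (x, y) in pixels)
    and the answer format are assumptions, not the PoseText schema.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image width and height must be positive")

    pairs = []
    for name, (x, y) in keypoints.items():
        label = name.replace("_", " ")
        # Express coordinates as percentages of the image size (0-100).
        x_pct = round(100.0 * x / width, 1)
        y_pct = round(100.0 * y / height, 1)
        pairs.append(
            {
                "question": f"Point to the person's {label}.",
                "answer": f'<point x="{x_pct}" y="{y_pct}" alt="{label}">{label}</point>',
            }
        )
    return pairs


# Example with two COCO-style keypoint names on a 640x480 image.
example = keypoints_to_qa(
    {"left_wrist": (320, 120), "nose": (160, 240)}, width=640, height=480
)
```

Normalizing to percentages keeps the targets independent of image resolution, so the same pairs remain valid if images are resized before fine-tuning.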