remyxai / VQASynth

Compose multimodal datasets 🎹
https://twitter.com/smellslikeml/status/1756723056675094726

Improving distance scaling and processing #23

Closed salma-remyx closed 1 month ago

salma-remyx commented 1 month ago

This PR skews the spatialvqa.yaml pipeline towards producing Q&A pairs that feature a numerical distance measurement. For samples where the measurements are too small, we fall back to spatial relation Q&A samples.
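For reviewers, the fallback behaviour described above can be pictured with a minimal sketch. This is not the code in this PR: the `ObjectPair` type, the `make_qa` helper, the question templates, and the `MIN_DISTANCE_M` cutoff are all hypothetical names and values chosen for illustration, since the description does not state the actual threshold.

```python
from dataclasses import dataclass

# Hypothetical cutoff in meters; the PR description does not give the real value.
MIN_DISTANCE_M = 0.1


@dataclass
class ObjectPair:
    """Hypothetical container for two detected objects in a scene."""
    a: str
    b: str
    distance_m: float  # estimated distance between the two objects, in meters
    relation: str      # qualitative relation, e.g. "to the left of"


def make_qa(pair: ObjectPair, min_distance_m: float = MIN_DISTANCE_M) -> tuple[str, str]:
    """Return a (question, answer) pair, preferring a numeric distance."""
    if pair.distance_m >= min_distance_m:
        # Measurement is large enough to report: emit a distance Q&A.
        question = f"How far is the {pair.a} from the {pair.b}?"
        answer = f"The {pair.a} is about {pair.distance_m:.2f} meters from the {pair.b}."
    else:
        # Measurement is too small: fall back to a spatial relation Q&A.
        question = f"Where is the {pair.a} relative to the {pair.b}?"
        answer = f"The {pair.a} is {pair.relation} the {pair.b}."
    return question, answer
```

Under these assumptions, `make_qa(ObjectPair("cup", "plate", 0.02, "to the left of"))` would take the fallback branch, while a pair 1.5 meters apart would produce a distance answer.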

To test these changes, check out this branch and run:

bash run.sh