SiyuanHuang95 / ManipVQA

ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models
40 stars, 1 fork