SiyuanHuang95 / ManipVQA

[IROS24 Oral] ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models

Datasets #4

Open RussRobin opened 3 months ago

RussRobin commented 3 months ago

Hi, thanks a lot for this wonderful work.

I wonder if you plan to release the training and test sets used in your work (images and QA files). Also, what is the relationship between handal_process.py and handal_grounding_tasks.py?

Many thanks!

SiyuanHuang95 commented 3 months ago

Hi, thanks for your interest in our project!

  1. The dataset issue: Yes, we will release all the dataset materials, probably in 1-2 weeks. I am busy with another deadline right now, so I don't have time to clean the dataset at the moment. I will let you know when the dataset is uploaded to Hugging Face.

  2. handal_grounding_tasks.py handles the generation results from ChatGPT, where we add some language descriptions to the original annotations; a rough sketch of what such an entry might look like is given below.
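
For context only, here is a minimal sketch of how a ChatGPT-generated language description might be merged with an original bounding-box annotation into a VQA-style grounding sample. Field names and the helper function are hypothetical illustrations, not the actual schema used by handal_process.py or handal_grounding_tasks.py.

```python
import json

def build_grounding_sample(annotation, gpt_description):
    """Combine an original annotation with a ChatGPT-generated description
    into one VQA-style grounding sample.

    The field names here are illustrative placeholders; the real schema is
    defined by the repo's dataset scripts.
    """
    return {
        "image": annotation["image_path"],
        "question": f"Where is the {gpt_description['object_phrase']} in the image?",
        # Grounding answer: the object's bounding box, normalized to [0, 1].
        "answer": annotation["bbox_normalized"],
    }

if __name__ == "__main__":
    annotation = {"image_path": "imgs/hammer_001.png",
                  "bbox_normalized": [0.12, 0.30, 0.45, 0.78]}
    gpt_description = {"object_phrase": "hammer with a wooden handle"}
    print(json.dumps(build_grounding_sample(annotation, gpt_description), indent=2))
```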

RussRobin commented 3 months ago

Thanks a lot. Really looking forward to it!

SiyuanHuang95 commented 3 months ago

Hi, I double-checked the dataset scripts provided in this repo; they should be enough for the VQA dataset generation. If any problem arises, just let me know.

We also open-sourced another project recently; check it out if you need capabilities for 3D articulated objects: https://github.com/changhaonan/A3VLM/tree/main/model

RussRobin commented 3 months ago

Thanks! Will you open-source the whole VQA dataset, or will only the dataset generation prompts be provided?

SiyuanHuang95 commented 3 months ago

The prompts used in both projects are only for the grounding tasks, and their amount is limited.

For the other tasks, you can use the scripts we provide to generate the data locally; a rough sketch of such a local pipeline follows.
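
A minimal sketch of chaining the two processing steps locally. The command-line arguments, paths, and output file names below are assumptions for illustration, not the scripts' actual interface; please consult handal_process.py and handal_grounding_tasks.py in the repo for the real entry points.

```python
import subprocess
from pathlib import Path

# Hypothetical local pipeline: all paths and CLI arguments are placeholders.
RAW_DIR = Path("data/handal_raw")
OUT_DIR = Path("data/manipvqa")
OUT_DIR.mkdir(parents=True, exist_ok=True)

# Step 1: convert the raw annotations into an intermediate format.
subprocess.run(
    ["python", "handal_process.py",
     "--input", str(RAW_DIR), "--output", str(OUT_DIR / "processed")],
    check=True,
)

# Step 2: attach the ChatGPT-generated language descriptions for grounding tasks.
subprocess.run(
    ["python", "handal_grounding_tasks.py",
     "--input", str(OUT_DIR / "processed"),
     "--output", str(OUT_DIR / "grounding_qa.json")],
    check=True,
)
```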