changhaonan / A3VLM

[CoRL2024] Official repo of `A3VLM: Actionable Articulation-Aware Vision Language Model`

model inference #3

Closed: 3202336152 closed this issue 2 months ago

3202336152 commented 3 months ago

Thank you for your help. I have successfully solved the data_gen problem, but I still have some issues with model training and evaluation. I read the README in the model folder and still have a few questions.

  1. In the a3vlm.yaml file settings:

```yaml
META:
  - path: 'ManipVQA2/vqa_tasks_v17_525_3d_full_size_sd/all_parts_3d_det_tasks_train_53087.json'
    type: 'image_depth_text'
    ratio: 0.2
  - path: 'ManipVQA2/vqa_tasks_v17_525_3d_full_size/all_parts_3d_det_tasks_train_53341.json'
    type: 'image_depth_text'
    ratio: 1
```

Apart from the final number (53087.json vs 53341.json), the two entries look the same to me, but the only file I generated with the all_parts_3d_det_tasks_train prefix is 78671.json. How should I configure this? Do I only need to configure the first four entries of a3vlm.yaml? The files I generated with the all prefix are: `all_parts_3d_det_tasks_train_78671.json`, `all_parts_3d_det_tasks_val_11051.json`, `all_parts_det_tasks_78671.json`, `all_parts_det_tasks_11051.json`.

  2. Regarding the file paths in eval_affordance_v2.py: which of my generated files do the paths below correspond to? I hope you can clean up the code and provide file names that better match the current A3VLM data generation.

```python
partnet_dataset_root = '/mnt/petrelfs/XXXXX/data/ManipVQA2/jsons_vqa_tasks_fix_angle/'
partnet_dataset_root_depth = "/mnt/petrelfs/XXXXX/data/ManipVQA2/vqa_tasks_v11_0508"
partnet_dataset_root_8points = "/mnt/petrelfs/XXXXX/data/ManipVQA2/vqa_tasks_v15_521_3d"

ds_collections = {
    "demo": {
        "train": "/mnt/petrelfs/XXXXX/data/ManipVQA2/eval_demo/demo_det_all.json",
        "test": "/mnt/petrelfs/XXXXX/data/ManipVQA2/eval_demo/demo_det_all.json",
        "max_new_tokens": 2048,
        "use_answer_extractor": True,
    },
    "demo2": {
        "train": "/mnt/petrelfs/XXXXX/data/ManipVQA2/eval_demo/demo_joint_rec.json",
        "test": "/mnt/petrelfs/XXXXX/data/ManipVQA2/eval_demo/demo_joint_rec.json",
        "max_new_tokens": 1024,
        "use_answer_extractor": True,
    },
}
```

Thank you for your excellent work, and sorry for the trouble. I look forward to your reply.

SiyuanHuang95 commented 3 months ago

Thanks for your interest in our project.

  1. They are actually not the same file. If you check the file paths, you can see that the images come from different folders, e.g. `vqa_tasks_v17_525_3d_full_size_sd` vs `vqa_tasks_v17_525_3d_full_size`; the `sd` suffix means those images were generated with Stable Diffusion.

And yes, after generation you can put your actual file paths into the `path` fields of the yaml.
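For instance, a minimal META entry pointing at the file you generated could look roughly like this (just a sketch; the folder prefix and ratio here are examples, so adjust them to your setup):

```yaml
META:
  # hypothetical entry for the locally generated detection task file
  - path: 'ManipVQA2/vqa_tasks_v17_525_3d_full_size/all_parts_3d_det_tasks_train_78671.json'
    type: 'image_depth_text'
    ratio: 1
```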

  2. The demo files are similar in format to the generated JSON. I will add some examples later.

3202336152 commented 3 months ago

Thank you for your reply.

  1. I don't have the JSON generated from the `_sd` (Stable Diffusion) images. Do I need to regenerate the dataset with the data_gen folder? Is that necessary?

  2. If I just want to run inference with the model, can the existing code be used as-is? I don't know what to fill in for the JSON file paths.

SiyuanHuang95 commented 3 months ago

  1. SD helps, but it is not necessary.
  2. If you just want to run inference with the model, you can ignore most parts of the code.

You can create the demo.json very easily by copying a few lines from one of your generated JSON files, e.g.:

`all_parts_3d_det_tasks_train_78671.json`, `all_parts_3d_det_tasks_val_11051.json`, `all_parts_det_tasks_78671.json`, `all_parts_det_tasks_11051.json`
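As a rough, unofficial sketch (assuming those task files are JSON lists of samples, and using the name demo_det_all.json only because it appears in the eval_affordance_v2.py excerpt above):

```python
import json

# Sample a handful of entries from a generated task file to build a tiny demo set.
# The source file name comes from the list above; adjust the paths to your setup.
src_path = "all_parts_3d_det_tasks_val_11051.json"
with open(src_path, "r") as f:
    samples = json.load(f)

# Keep only the first 20 samples for a quick demo/inference run.
with open("demo_det_all.json", "w") as f:
    json.dump(samples[:20], f, indent=2)
```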

then add the path to your demo.json in eval_affordance_v2.py.
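For example, the "demo" entry of ds_collections could then point at that file, roughly like this (the paths are placeholders; the keys simply mirror the entry quoted earlier in this issue):

```python
ds_collections = {
    "demo": {
        # point both splits at the small demo file created from your generated JSON
        "train": "/path/to/your/demo_det_all.json",
        "test": "/path/to/your/demo_det_all.json",
        "max_new_tokens": 2048,
        "use_answer_extractor": True,
    },
}
```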

SiyuanHuang95 commented 2 months ago

@3202336152 Please check the demo usage in: script