changhaonan / A3VLM

[CoRL2024] Official repo of `A3VLM: Actionable Articulation-Aware Vision Language Model`

Questions about inference. #4

Closed: iamjoong9 closed this issue 4 months ago

iamjoong9 commented 4 months ago

Hello,

Thank you for your excellent research!

I would like to run a real-world demo using your project, similar to the ones shown in the Real-World Demo section.

  1. To do this, can I just use the checkpoints from A3VLM-Eval and A3VLM as mentioned in the Inference Instruction? Or do I have to follow all the training procedures?

  2. For the llama_path in model/accessory/scripts/a3vlm_infer.sh, should I use the path to the files downloaded from llama2-13b?

  3. I see that the eval_affordance_v2.py script is used for inference. Do I also need to provide values for "partnet_dataset_root", "partnet_dataset_root_depth", "partnet_dataset_root_8points", and "ds_collections"? If so, could you please explain in detail what values should be used for these parameters?

Thank you again for your wonderful research. I am relatively new to this field, so I would greatly appreciate it if you could provide a detailed explanation.

changhaonan commented 4 months ago

We will provide a new inference example this week.

SiyuanHuang95 commented 4 months ago

@JoongKu-Lee Please check our newly updated version. The demo data can be found in model/accessory/demo_data/xxx.json, and the usage is shown in the script a3vlm_7B_infer_test.sh.
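For readers landing on this thread later, the workflow described above might look roughly like the following. This is a hedged sketch, not the repo's exact interface: the placeholder paths, the environment-variable name, and the demo file name are assumptions, so consult a3vlm_7B_infer_test.sh in the updated repo for the real flags and paths.

```shell
# Sketch of the inference steps discussed in this thread (paths are placeholders).

# 1. Point the script's llama_path at the downloaded LLaMA-2 13B weights
#    (the question above asks whether this is the llama2-13b download; it is).
LLAMA_PATH=/path/to/llama2-13b   # assumption: adjust to your local checkpoint dir

# 2. Inspect one of the provided demo samples to see the expected input format.
cat model/accessory/demo_data/xxx.json   # replace xxx with an actual demo file name

# 3. Run the provided inference test script with the configured paths.
bash model/accessory/scripts/a3vlm_7B_infer_test.sh
```

Because the demo JSON ships with the repo, inspecting it first is the quickest way to answer the dataset-path questions above: for a demo run you point the script at the demo data rather than the full PartNet-Mobility dataset roots used during training.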

iamjoong9 commented 4 months ago

Thanks a lot!