JasonQSY / 3DOI

[ICCV 2023] Understanding 3D Object Interaction from a Single Image

Regarding AffordanceLLM #4

Open wj-on-un opened 3 weeks ago

wj-on-un commented 3 weeks ago

Apologies for asking here about another of your significant works. Since I have no way to contact you separately, I am posting here after seeing the related issue.

I am interested in another paper of yours, AffordanceLLM: Grounding Affordance from Vision Language Models, and am currently working on implementing it.

Thankfully, I was able to download the hard split of the benchmark, but I am wondering how to generate the easy and hard split data.

The following part of the paper: Easy split

Hard split

Could you please explain the weakly supervised part in detail (which images you are using, and so on)? And if you use data from a weakly supervised method, how did you get the GT data needed for affordance prediction and learning?

JasonQSY commented 3 weeks ago

For this data, you will need to re-process the dataset. To make that possible, I have released some data processing and baseline code: https://github.com/JasonQSY/AffordanceLLM

I'm sorry that I cannot debug the code to make sure it is easy to run. I graduated recently and lost access to a lot of specific machines. If you find any issues, I would appreciate a PR. If you plan to release your implementation of AffordanceLLM in the future, I'm happy to put it on the project website and acknowledge your contribution.

Fully-supervised setting:

Weakly-supervised setting: