SilongYong / SQA3D

[ICLR 2023] SQA3D for embodied scene understanding and reasoning
Apache License 2.0

Question about the dataset #8

Open FUIGUIMURONG opened 11 months ago

FUIGUIMURONG commented 11 months ago

Thanks for your great dataset for 3D visual-language understanding. I am having some problems and would appreciate a reply.

  1. I want to do some work on embodied question answering based on your released dataset. I need the first-person view from the situation that the green arrow points to. I am wondering whether your released dataset includes the situation description and its corresponding first-person view.

  2. As you describe under societal impact in E LIMITATION AND POTENTIAL IMPACT: "The QA tasks also examine a wide spectrum of capabilities of embodied agents in household domains, making it a great benchmark for testing these household assistant robots." The robot might need the ability to navigate. How does the dataset support agent navigation work?

jeasinema commented 9 months ago

Hi,

I am wondering whether your released dataset includes the situation description and its corresponding first-person view.

We didn't. But it should be possible by leveraging the location and orientation annotations together with the scene id: just load the 3D scene with open3d or Blender, then render the view accordingly. Let me know if you need any help with this.
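The step above (turning a situation's location and orientation annotation into something a renderer can use) can be sketched roughly as follows. This is a minimal sketch, not code from the SQA3D repo: it assumes the annotation provides a position `(x, y, z)` and an orientation quaternion `(x, y, z, w)` in scene coordinates, and that the renderer (e.g. open3d) wants a world-to-camera extrinsic matrix. The eye-height offset and axis conventions are assumptions you would need to adapt to the actual scene data.

```python
import numpy as np

def quat_to_rot(x, y, z, w):
    """Convert a unit quaternion (x, y, z, w) to a 3x3 rotation matrix."""
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - z*w),     2*(x*z + y*w)],
        [2*(x*y + z*w),     1 - 2*(x*x + z*z), 2*(y*z - x*w)],
        [2*(x*z - y*w),     2*(y*z + x*w),     1 - 2*(x*x + y*y)],
    ])

def situation_to_extrinsic(position, rotation_quat, eye_height=1.6):
    """Build a world-to-camera extrinsic from a situation annotation.

    position: (x, y, z) in scene coordinates.
    rotation_quat: (x, y, z, w) orientation of the agent (the green arrow).
    eye_height: assumed offset along z, since the annotated position sits
    on the floor plane (adjust or drop for your scene's conventions).
    """
    cam_to_world = np.eye(4)
    cam_to_world[:3, :3] = quat_to_rot(*rotation_quat)
    cam_to_world[:3, 3] = np.asarray(position, dtype=float) + np.array([0.0, 0.0, eye_height])
    # Renderers such as open3d expect world-to-camera, i.e. the inverse.
    return np.linalg.inv(cam_to_world)
```

The resulting matrix could then be passed to an offscreen renderer along with pinhole intrinsics; if the render comes out blank, a common cause is a mismatch between this camera convention and the renderer's (e.g. which axis is "forward").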

How does the dataset support agent navigation work?

As indicated in Fig. 2 of the paper, we have questions like *How many chairs will I pass by to open the window from the other side of the desk?* That is, the agent can only provide the right answer if it can plan the correct path -- we evaluate the capacity for planning in an implicit fashion.

-XM

mengfeidu commented 7 months ago

Hi, @jeasinema. I have the same need for the first-person view of the situation, but I am not familiar with open3d or Blender. Could you give me some help on how to convert the location and orientation annotation into the camera pose matrix needed to render the corresponding view? I have tried the approach in utils/visualize_data.py, where the point cloud is centered, but the rendering result is blank...