OpenRobotLab / EmbodiedScan

[CVPR 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
https://tai-wang.github.io/embodiedscan/
Apache License 2.0

How to run demo.ipynb on custom data? #36

Closed zxyhhhappy closed 3 months ago

zxyhhhappy commented 3 months ago

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

Thank you for your work. I have successfully run demo.ipynb. Now I would like to know how to run it on custom data for inference. In particular, how do I obtain "poses.txt" and "axis_align_matrix.txt" for my own data?

mxh1999 commented 3 months ago

The camera poses are obtained by a SLAM algorithm, which is usually built into the sensor. The axis-align matrix is currently adjusted manually.
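
As a rough illustration of the pose part: many visual SLAM systems can export a trajectory in the TUM format, one line per frame as `timestamp tx ty tz qx qy qz qw`. The sketch below converts such a line into a 4x4 pose matrix. The helper names `quat_to_rot` and `parse_tum_line` are hypothetical, and the layout that demo.ipynb actually expects in poses.txt is not stated in this thread, so check the notebook's loader before writing your own file.

```python
import math


def quat_to_rot(qx, qy, qz, qw):
    """Convert a quaternion (x, y, z, w order) into a 3x3 rotation matrix."""
    n = math.sqrt(qx * qx + qy * qy + qz * qz + qw * qw)
    qx, qy, qz, qw = qx / n, qy / n, qz / n, qw / n  # normalize defensively
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw), 2 * (qx * qz + qy * qw)],
        [2 * (qx * qy + qz * qw), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw)],
        [2 * (qx * qz - qy * qw), 2 * (qy * qz + qx * qw), 1 - 2 * (qx * qx + qy * qy)],
    ]


def parse_tum_line(line):
    """Parse one TUM trajectory line into (timestamp, 4x4 pose as nested lists)."""
    t, tx, ty, tz, qx, qy, qz, qw = (float(v) for v in line.split())
    rot = quat_to_rot(qx, qy, qz, qw)
    pose = [
        rot[0] + [tx],
        rot[1] + [ty],
        rot[2] + [tz],
        [0.0, 0.0, 0.0, 1.0],
    ]
    return t, pose


if __name__ == "__main__":
    # Made-up sample line: identity rotation, camera 1 m along x.
    stamp, pose = parse_tum_line("0.0 1.0 0.0 0.0 0.0 0.0 0.0 1.0")
    for row in pose:
        print(" ".join(f"{v:.6f}" for v in row))
```

Whether the resulting matrix is camera-to-world or world-to-camera depends on the SLAM system's convention, so that is worth verifying on a frame or two.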

Tai-Wang commented 3 months ago
  1. For poses: We use the camera poses recorded by the internal odometry of Kinect. If you only have an RGB-D sensor without IMU/odometry, you can try classical visual SLAM algorithms to obtain the pose information.
  2. For axis-aligned matrix: For now we keep the heuristic axis-aligned matrix because it gives better performance during inference. You can also derive one yourself: put the origin at the center of the scene and make the z-axis point up, preferably with x and y aligned with the walls. To simply test the algorithm, you can use the coordinate system of the first frame, i.e., the identity matrix as the axis-aligned matrix. In our experience, performance degrades less as long as the z-axis of the chosen coordinate system is vertical.
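
A minimal sketch of deriving such a matrix yourself, assuming you have scene points in the first-frame coordinate system and an estimate of the up direction in that frame (for example from an IMU gravity reading or a fitted floor normal). It rotates the up direction onto +z and moves the bounding-box center to the origin. It does not align x and y with the walls. The function names are hypothetical, and the whitespace-separated 4x4 text layout produced by `to_text` is an assumption about axis_align_matrix.txt that should be checked against the demo's loader.

```python
import math


def axis_align_matrix(points, up):
    """Return a 4x4 matrix that maps `up` onto +z and centers the scene at the origin."""
    norm = math.sqrt(sum(c * c for c in up))
    ax, ay, az = (c / norm for c in up)
    if az < -1.0 + 1e-8:
        # Up points along -z: rotate 180 degrees about the x-axis.
        rot = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
    else:
        # Rodrigues formula specialized to the target direction (0, 0, 1).
        k = 1.0 / (1.0 + az)
        rot = [
            [1.0 - k * ax * ax, -k * ax * ay, -ax],
            [-k * ax * ay, 1.0 - k * ay * ay, -ay],
            [ax, ay, az],
        ]
    rotated = [[sum(r[j] * p[j] for j in range(3)) for r in rot] for p in points]
    # Scene center taken as the midpoint of the axis-aligned bounding box.
    center = [
        (min(p[i] for p in rotated) + max(p[i] for p in rotated)) / 2.0
        for i in range(3)
    ]
    return [rot[i] + [-center[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def transform_point(matrix, point):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    x, y, z = point
    return [row[0] * x + row[1] * y + row[2] * z + row[3] for row in matrix[:3]]


def to_text(matrix):
    """Format the matrix as four whitespace-separated rows (assumed file layout)."""
    return "\n".join(" ".join(f"{v:.6f}" for v in row) for row in matrix)


if __name__ == "__main__":
    # Made-up example: two corner points of a scene in a y-up coordinate system.
    scene = [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)]
    print(to_text(axis_align_matrix(scene, up=(0.0, 1.0, 0.0))))
```

For the quick test mentioned above, the identity matrix can be written in the same layout instead of a derived one.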

zxyhhhappy commented 3 months ago

Got it! Thank you again for your excellent work.