OpenRobotLab / EmbodiedScan

[CVPR 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
https://tai-wang.github.io/embodiedscan/
Apache License 2.0

[Docs] Code to process data into a format usable by the demo #14

Closed LZ-CH closed 4 months ago

LZ-CH commented 4 months ago

Branch

main branch (https://mmdetection3d.readthedocs.io/en/latest/)

📚 The doc issue

Hi, can you please post the code that processes the data into a usable format for the demo?

Suggest a potential alternative/fix

No response

LZ-CH commented 4 months ago

For example, the code that generates camera.json.

mxh1999 commented 4 months ago

I apologize for omitting the raw data of our open-world demo from the release; we'll add it as soon as possible. Please refer to demo.ipynb for instructions on how to use it.

Thanks again for pointing this out!

LZ-CH commented 4 months ago

> I apologize for omitting the raw data of our open-world demo from the release; we'll add it as soon as possible. Please refer to demo.ipynb for instructions on how to use it.
>
> Thanks again for pointing this out!

Okay, thank you very much. I also want to ask: in the 3D detection task, the data is scaled before it is input to the model. During inference, are the predicted coordinates scaled and affine-transformed back to the original coordinate system? I couldn't find the code for this step, which is why I'm asking.

Thank you again for this great work.

mxh1999 commented 4 months ago

> Okay, thank you very much. I also want to ask: in the 3D detection task, the data is scaled before it is input to the model. During inference, are the predicted coordinates scaled and affine-transformed back to the original coordinate system? I couldn't find the code for this step, which is why I'm asking.
>
> Thank you again for this great work.

I'm a bit unclear about the specific steps you're referring to. If you're asking about the image processing steps before the backbone, you can refer to the Det3DDataPreprocessor module. It handles the necessary pre-processing of the images.
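
For reference, that pre-processing is usually declared in the model config. Here is a minimal sketch using typical mmdet3d-style keys; the ImageNet mean/std values are an assumption, so check the actual EmbodiedScan config for the values it uses:

```python
# A minimal sketch of a Det3DDataPreprocessor config, assuming typical
# mmdet3d defaults; the actual EmbodiedScan config may use other values.
data_preprocessor = dict(
    type='Det3DDataPreprocessor',
    mean=[123.675, 116.28, 103.53],  # per-channel image mean (ImageNet, assumed)
    std=[58.395, 57.12, 57.375],     # per-channel image std (ImageNet, assumed)
    bgr_to_rgb=True,                 # convert BGR inputs to RGB
    pad_size_divisor=32)             # pad H/W to a multiple of 32
```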

On the other hand, if you're inquiring about the transformation of predicted boxes from ego-centric coordinates to global coordinates, you can look into the single_scene_multiclass_nms function within the bbox_head module. It handles the post-processing and transformation of the predicted bounding boxes.
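
To make the coordinate part concrete, here is a standalone sketch of the ego-to-global mapping that this kind of post-processing applies to predicted box centers. `ego_boxes_to_global` is a hypothetical helper for illustration, not the actual function in the codebase:

```python
import numpy as np

def ego_boxes_to_global(centers, ego_to_global):
    """Hypothetical helper: map (N, 3) box centers from the ego frame
    to the global frame, given a (4, 4) homogeneous pose matrix."""
    ones = np.ones((centers.shape[0], 1))
    homo = np.concatenate([centers, ones], axis=1)  # (N, 4) homogeneous coords
    # Note: box orientations would also need the rotation part of the pose.
    return (homo @ ego_to_global.T)[:, :3]          # back to (N, 3)
```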

LZ-CH commented 4 months ago

Thank you very much. With your help, I found the post-processing code. Since I want to build a 3D detector for real scenes on top of your work, I'm paying particular attention to the demo's input and output data formats. Looking forward to your additions on demo data preparation.

Thank you again for your help!

mxh1999 commented 4 months ago

@LZ-CH We have uploaded the raw data of our open-world demo, download it from Google Drive or BaiduYun.

LZ-CH commented 4 months ago

@mxh1999 Thank you for your assistance. I'm thinking about scanning a room, like you did for OpenScan, using an RGB-D camera (Depth Camera D435i). Could you recommend any projects or academic papers that might guide me in generating my own environment?

mxh1999 commented 4 months ago

> @mxh1999 Thank you for your assistance. I'm thinking about scanning a room, like you did for OpenScan, using an RGB-D camera (Depth Camera D435i). Could you recommend any projects or academic papers that might guide me in generating my own environment?

We use Kinect to generate the mesh of the environment. If you want to work with your own data, this may be helpful.
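
If it helps as a starting point, here is a minimal sketch of grabbing an aligned RGB-D frame from a D435i with pyrealsense2 (the stream resolutions and frame rate are assumptions); the mesh reconstruction itself would still come from a tool like the one linked above:

```python
import numpy as np
import pyrealsense2 as rs

# Configure depth + color streams; 640x480 @ 30 FPS is an assumed setting.
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
profile = pipeline.start(config)

# Depth scale converts raw uint16 depth units to metres.
depth_scale = profile.get_device().first_depth_sensor().get_depth_scale()

# Align depth to the color stream so pixels correspond across modalities.
align = rs.align(rs.stream.color)
try:
    frames = align.process(pipeline.wait_for_frames())
    depth_m = np.asanyarray(frames.get_depth_frame().get_data()) * depth_scale
    color = np.asanyarray(frames.get_color_frame().get_data())  # uint8 BGR image
finally:
    pipeline.stop()
```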

LZ-CH commented 4 months ago

@mxh1999 Thank you for solving my problem and wish you a happy life!