OpenMask3D / openmask3d

ScanNet200 data structure #12

Closed YilmazKadir closed 9 months ago

YilmazKadir commented 9 months ago

I am trying to evaluate OpenMask3D on the ScanNet200 dataset, but I do not quite understand how to make sure the data is in the format provided here:

[image]

The raw dataset has the following format: [image]

When I run the preprocessing script, I only get the preprocessed numpy files: [image]

aycatakmaz commented 9 months ago

Hi Kadir,

The ScanNet preprocessing script we provided in the repository is there so that you can run the mask-proposal network; it is the same preprocessing script used in Mask3D. That is why it only produces the preprocessed numpy files you mentioned.

The RGB-D images for the ScanNet dataset can be obtained using the ScanNet toolbox to extract RGB-D frames from the .sens files for each scene. You can find further details here and here.

Alternatively, the OpenScene release includes an easily downloadable version of the RGB-D images and camera poses for the ScanNet dataset. Please note that if you would like to run OpenMask3D on those frames, you might need to change the frequency parameter in our config to match the frame rate used in OpenScene. I think those frames are already sampled and saved at that chosen rate (10 if I remember correctly). In that case, the frequency parameter in our config should be set to 1. Nevertheless, I would recommend double-checking the frame rate in OpenScene to make sure it corresponds to our setting.
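To make the frequency point concrete, here is a minimal sketch. It is not the OpenMask3D code: `select_frames` is a hypothetical helper, and it assumes the frequency parameter means "use every N-th frame of whatever is on disk". The example frame counts and the stride of 10 are illustrative only.

```python
def select_frames(frame_ids, frequency):
    """Keep every `frequency`-th frame, starting from the first one.

    Hypothetical helper for illustration; assumes `frequency` is a stride
    applied to the frames that are present on disk.
    """
    if frequency < 1:
        raise ValueError("frequency must be a positive integer")
    return list(frame_ids)[::frequency]


# Raw export: every frame is on disk, so a stride of 10 picks every 10th one.
raw_ids = range(0, 50)
from_raw = select_frames(raw_ids, 10)

# Already-subsampled export (assumed here to hold every 10th frame):
# a stride of 1 keeps everything and ends up with the same frames.
presampled_ids = range(0, 50, 10)
from_presampled = select_frames(presampled_ids, 1)

print(from_raw)
print(from_presampled)
```

If the stride were left at 10 on the already-subsampled frames, only every 100th original frame would be used, which is why the two settings need to be checked against each other.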

I hope this answers your question!

aycatakmaz commented 9 months ago

I am closing the issue, feel free to re-open it if any other questions related to this topic emerge!

YilmazKadir commented 8 months ago

Thank you for the answer. OpenScene keeps only every 20th frame, so I had to extract the frames myself, as the SensReader in the ScanNet toolbox does. One important thing to mention is that the RGB images need to be resized to 640x480. After this, I got the same results as in the ScanNet table in the paper.
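For anyone following the same route, the resize step could look roughly like the sketch below. It uses Pillow, which is an assumption (neither this thread nor the toolbox prescribes a library), `resize_color_frame` is a hypothetical helper name, and bilinear resampling is an arbitrary choice. The 1296x968 input size in the usage lines is only an example.

```python
from PIL import Image

# Target (width, height) for the RGB frames, as mentioned above.
TARGET_SIZE = (640, 480)


def resize_color_frame(image):
    """Return `image` resized to 640x480; frames already at that size pass through."""
    if image.size == TARGET_SIZE:
        return image
    # Bilinear resampling is an assumption; other filters may work equally well.
    return image.resize(TARGET_SIZE, Image.BILINEAR)


# Usage on a synthetic frame (the input size is an example, not a requirement).
frame = Image.new("RGB", (1296, 968))
resized = resize_color_frame(frame)
print(resized.size)
```

In a real pipeline the frame would come from `Image.open(path)` and be written back with `resized.save(path)`.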