NVIDIA / MinkowskiEngine

Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors
https://nvidia.github.io/MinkowskiEngine

scannet training and inference code #4

Closed mingminzhen closed 5 years ago

mingminzhen commented 5 years ago

Is it possible to make the training and inference code for the ScanNet dataset public? It would be very helpful.

chrischoy commented 5 years ago

Hi, we are working on the release. We have a separate repo for this and will update the issue when we release it!

BestSonny commented 5 years ago

> Hi, we are working on the release. We have a separate repo for this and will update the issue when we release it!

@chrischoy Is there any estimated date for the release? Thank you very much.

chrischoy commented 5 years ago

Hi everyone,

I am very sorry for the delay. I couldn't find time to sit down and refactor the code. I will try to release it soon. In the meantime, training itself is straightforward, but make sure to incorporate the following components to reproduce the numbers in the paper:

  1. Chromatic jitter (Gaussian on color)
  2. Chromatic translation (Add a constant to all voxels in a scene)
  3. Random rotation (all 3D rotation, but mostly along the gravity direction)
  4. Random scaling of a scene
  5. Spatial translation (adding a constant vector to all coordinates)
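The five augmentations above can be sketched in a few lines of numpy. This is only an illustrative sketch, not the released training code; function names, noise magnitudes, and ranges are assumptions:

```python
import numpy as np

def chromatic_jitter(colors, std=0.01, rng=np.random):
    # 1. Chromatic jitter: per-point Gaussian noise on RGB values in [0, 1].
    return np.clip(colors + rng.normal(0.0, std, colors.shape), 0.0, 1.0)

def chromatic_translation(colors, ratio=0.05, rng=np.random):
    # 2. Chromatic translation: one random constant added to every point's color.
    return np.clip(colors + rng.uniform(-ratio, ratio, (1, 3)), 0.0, 1.0)

def random_rotation(coords, rng=np.random):
    # 3. Random rotation; shown here only about the gravity (z) axis for brevity.
    theta = rng.uniform(0.0, 2 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return coords @ rot.T

def random_scale(coords, low=0.9, high=1.1, rng=np.random):
    # 4. Random scaling of the whole scene by one factor.
    return coords * rng.uniform(low, high)

def spatial_translation(coords, max_shift=0.2, rng=np.random):
    # 5. Spatial translation: one constant vector added to all coordinates.
    return coords + rng.uniform(-max_shift, max_shift, (1, 3))
```

In practice these are composed per training sample before voxelization.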

Also, use rotation averaging (extract features from several rotated copies of the scene and average the final logit scores) when you evaluate the final semantic segmentation mIoU.
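Rotation averaging at evaluation time might look like the following sketch. Here `model` is a stand-in for any callable mapping per-point coordinates to per-point logits; the rotation count and the z-axis-only rotations are assumptions:

```python
import numpy as np

def rotate_z(coords, theta):
    # Rotate an (N, 3) point cloud about the gravity (z) axis by theta radians.
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return coords @ rot.T

def rotation_average_predict(model, coords, n_rot=8):
    # Run the model on n_rot rotated copies of the scene, average the
    # per-point logits, then take the argmax as the final label.
    # Rotation does not reorder points, so logits align point-by-point.
    logits = np.zeros_like(model(coords))
    for k in range(n_rot):
        theta = 2.0 * np.pi * k / n_rot
        logits += model(rotate_z(coords, theta))
    return (logits / n_rot).argmax(axis=1)
```

Averaging logits (rather than hard labels) lets confident views outvote uncertain ones.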

Happy research everyone!

chrischoy commented 5 years ago

Here's a simple tutorial for working with Pytorch dataset and dataloader: https://stanfordvl.github.io/MinkowskiEngine/demo/training.html#
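The key batching step in that tutorial is concatenating variable-size point clouds with a leading batch-index column on the coordinates. A minimal numpy sketch of that collation pattern (illustrative only; MinkowskiEngine ships its own collation utilities):

```python
import numpy as np

def sparse_collate(coords_list, feats_list):
    # Concatenate variable-size point clouds into one batch. Each (N_i, D)
    # coordinate array gets a leading batch-index column so the network can
    # tell which points belong to which scene.
    batched_coords = np.concatenate([
        np.hstack([np.full((len(c), 1), i), c])
        for i, c in enumerate(coords_list)
    ])
    batched_feats = np.concatenate(feats_list)
    return batched_coords, batched_feats
```

A DataLoader would call this as its `collate_fn` over a list of per-scene samples.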

wbhu commented 5 years ago

Thanks for the excellent work. But what is the exact MinkowskiNet42 configuration? There are only variants like MinkUNet34, MinkUNet50, ...

chrischoy commented 5 years ago

That is the MinkUNet34, which has 43 layers. Sorry for the confusion; I implemented these residual networks based on the default ResNet18, ResNet34, etc. block counts, which is why I kept the names like this. examples/indoor.py uses the same network.


wbhu commented 5 years ago


Thanks a lot!

chrischoy commented 5 years ago

Hi, I made the ScanNet training code public: https://github.com/chrischoy/SpatioTemporalSegmentation/ It might take a few more days to clean up, but the ScanNet training works.

filaPro commented 2 years ago

> Hi everyone,
>
> I am very sorry for the delay. I couldn't seem to find time to sit down and refactor the code. I will try to release it soon, but training is very simple. However, try to incorporate the following components to reproduce the numbers on the paper:
>
> 1. Chromatic jitter (Gaussian on color)
> 2. Chromatic translation (Add a constant to all voxels in a scene)
> 3. Random rotation (all 3D rotation, but mostly along the gravity direction)
> 4. Random scaling of a scene
> 5. Spatial translation (adding a constant vector to all coordinates)
>
> Also, use the rotation average (extract features from various rotated scenes and averaging the final logit scores) when you evaluate the final semantic segmentation mIoU.
>
> Happy research everyone!

Hi @chrischoy ,

Does this mean that the metrics from the paper can only be obtained with test-time augmentation?