isl-org / Open3D-ML

An extension of Open3D to address 3D Machine Learning tasks

Fix bug of multiple pre-processing when segmentation (PyTorch) #645

Open Lionelsy opened 8 months ago

Lionelsy commented 8 months ago

Segmentation inference is very slow.


This is because the dataloader applies the data preprocessing repeatedly if self.cache_convert is None. https://github.com/isl-org/Open3D-ML/blob/fcf97c07bf7a113a47d0fcf63760b245c2a2784e/ml3d/torch/dataloaders/torch_dataloader.py#L77-L83

When running the run_inference method, the dataloader's cache_convert is None. https://github.com/isl-org/Open3D-ML/blob/fcf97c07bf7a113a47d0fcf63760b245c2a2784e/ml3d/torch/pipelines/semantic_segmentation.py#L143-L147

This makes inference extremely slow.
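For reference, here is a simplified sketch of the branch being discussed (paraphrased, not the verbatim Open3D-ML source): when cache_convert is set, the preprocessed result is just looked up; otherwise the expensive preprocess step runs on every __getitem__ call.

```python
# Simplified sketch of the logic in torch_dataloader.py (paraphrased;
# names follow the file, but this is not the exact Open3D-ML source).
class TorchDataloaderSketch:

    def __init__(self, dataset, preprocess=None, cache_convert=None):
        self.dataset = dataset
        self.preprocess = preprocess        # expensive, e.g. subsampling + search-tree building
        self.cache_convert = cache_convert  # lookup of already-preprocessed data, or None

    def __getitem__(self, index):
        attr = self.dataset.get_attr(index)
        if self.cache_convert:
            # Cached path: fetch the preprocessed result by name -- cheap.
            data = self.cache_convert(attr['name'])
        elif self.preprocess:
            # No cache: preprocess runs again on *every* __getitem__ call.
            data = self.preprocess(self.dataset.get_data(index), attr)
        else:
            data = self.dataset.get_data(index)
        return {'data': data, 'attr': attr}
```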

I've added a get_cache method that provides a cache, avoiding the slowdown caused by repeated preprocessing during inference.
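The actual change is in the PR diff; as a rough illustration of the idea (a hypothetical helper, not the PR code), it boils down to running preprocess once per scene up front and handing the dataloader a lookup to use as cache_convert:

```python
# Hypothetical illustration of the caching idea (not the exact PR code):
# run the expensive preprocess once per scene, then let the dataloader
# look results up by name instead of recomputing them.
def build_preprocess_cache(dataset_split, preprocess):
    cache = {}
    for i in range(len(dataset_split)):
        attr = dataset_split.get_attr(i)
        cache[attr['name']] = preprocess(dataset_split.get_data(i), attr)
    # Return a callable with the same interface as cache_convert.
    return lambda name: cache[name]
```

With a lookup like this passed as cache_convert, the expensive branch in __getitem__ is skipped on every subsequent draw.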

I tested it with RandLA-Net on the Toronto3D dataset on a GV100 GPU. Inference time for a single scene is now only 2 minutes and 37 seconds, considerably faster than before.

After: test 0/1: 100%|██████████████████████████████████████████████████████| 4990714/4990714 [02:37<00:00, 31769.86it/s]

Before: test 0/1:   4%|██                                                     | 187127/4990714 [05:12<2:19:39, 573.27it/s]
rejexx commented 3 months ago

I applied this fix on a local fork with a custom dataset and saw RandLA-NET inference go from 27 hours to 6 minutes. I can't thank you enough for sharing.

ssheorey commented 3 months ago

Hi @Lionelsy thanks for debugging this and submitting a PR. I have a question:

  • if cache_convert is None, it looks like preprocess is applied only once in line 81 of torch_dataloader.py. Can you point out the multiple pre-processing?

Thank you for your continued contributions to Open3D!

Lionelsy commented 3 months ago

In fact, the __getitem__ method in torch_dataloader.py is called multiple times during inference, and the preprocess method (a very time-consuming step) runs again on each call.

If we use cache_convert to store the preprocessed data, it saves a lot of time.
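One way to see this concretely (hypothetical instrumentation, not part of the PR) is to wrap the model's preprocess with a call counter before building the dataloader and watch how often it fires during a single inference pass:

```python
# Hypothetical instrumentation to count preprocess calls during inference.
import functools

def count_calls(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return fn(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

# e.g. model.preprocess = count_calls(model.preprocess) before inference.
# Without a cache, model.preprocess.calls grows with every batch sampled
# from the same scene; with cache_convert set, it stays at one per scene.
```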