SamsungLabs / fcaf3d

[ECCV2022] FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection
MIT License

Training with Custom Dataset #49

Closed Shromm-Gaind closed 1 year ago

Shromm-Gaind commented 1 year ago

Prerequisites

Task: I am trying to train FCAF3D with a custom dataset that emulates the SUN RGB-D dataset.

Environment: Docker image built from the repo's Dockerfile. GPU: 2080 Ti.

Steps and Issues

My data is prepared to emulate the SUN RGB-D dataset.

Reproduces the problem - Command or script:

```
python tools/train.py configs/fcaf3d/fcaf3d_sunrgbd-3d-10class.py
```

Reproduces the problem - issue on the 2080 Ti. Full log attached: 2080ti_train_issue.txt. The relevant part of the traceback:

  File "/mmdetection3d/mmdet3d/models/backbones/me_resnet.py", line 89, in forward
    x = self.layer2(x)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/MinkowskiEngine/modules/resnet_block.py", line 59, in forward
    out = self.conv2(out)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/MinkowskiEngine/MinkowskiConvolution.py", line 321, in forward
    input._manager,
  File "/opt/conda/lib/python3.7/site-packages/MinkowskiEngine/MinkowskiConvolution.py", line 84, in forward
    coordinate_manager._manager,
RuntimeError: CUDA out of memory. Tried to allocate 392.00 MiB (GPU 0; 10.75 GiB total capacity; 8.93 GiB already allocated; 201.06 MiB free; 9.09 GiB reserved in total by PyTorch)
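
As an aside, the usual first mitigation for CUDA OOM in mmdetection3d is lowering the per-GPU batch size in the config. A minimal sketch, assuming the standard mmdetection3d `data` config dict used by this repo; the concrete value 4 is an assumption to tune, not the repo default:

```python
# In configs/fcaf3d/fcaf3d_sunrgbd-3d-10class.py (or a config that inherits it).
# Assumption: 4 is smaller than the repo default; pick the largest value that fits 11 GB.
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=4,
)
```

With sparse convolutions, however, memory also scales with the number of occupied voxels per scene, so if the input scale is wrong (see the reply below), shrinking the batch only masks the problem.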

Reproduces the problem - issue on an A100. I also launched an instance with an A100 to verify whether training genuinely needed more memory. [screenshots attached]

Thoughts: I am quite confused as to why my dataset is so much more memory-hungry, given that it contains only 733 points. Training on the original SUN RGB-D dataset also worked fine on the 2080 Ti.
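
A quick way to check whether the scale (rather than the point count) is the culprit is to print the spatial extent of one custom scene and compare it with a SUN RGB-D scene. A minimal sketch, assuming points are stored as flat float32 `.bin` files with 6 values per point (x, y, z, r, g, b) as in the SUN RGB-D pipeline; the path is hypothetical:

```python
import numpy as np

# Hypothetical sample path; adjust to the custom dataset layout.
points = np.fromfile("data/custom/points/000001.bin", dtype=np.float32).reshape(-1, 6)

xyz = points[:, :3]
print("num points:", xyz.shape[0])
print("min xyz:  ", xyz.min(axis=0))
print("max xyz:  ", xyz.max(axis=0))
print("extent:   ", xyz.max(axis=0) - xyz.min(axis=0))
# SUN RGB-D scenes span a few meters; an extent in the hundreds suggests
# the data is in centimeters (or another unit) rather than meters.
```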

filaPro commented 1 year ago

Hi @Shromm-Gaind, it looks like the problem is the scale of your data. I briefly looked at it, and the sizes of your objects are ~100. We operate in meters, so are your objects actually 100 meters across? FCAF3D operates with 4 levels of 0.08, 0.16, 0.32, and 0.64 meters, which are much smaller than your objects, if I understand correctly.
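
If the custom data is indeed in centimeters, a one-time rescale of both the points and the annotation boxes to meters should bring the objects into the range these levels expect. A minimal sketch, assuming centimeter units (factor 0.01) and SUN RGB-D-style 7-value boxes (x, y, z, dx, dy, dz, yaw); the paths and the factor are assumptions to verify against the actual data:

```python
import numpy as np

SCALE = 0.01  # assumption: source units are centimeters -> meters

# Rescale one point cloud (x y z r g b per point, float32 .bin as in SUN RGB-D).
points = np.fromfile("data/custom/points/000001.bin", dtype=np.float32).reshape(-1, 6)
points[:, :3] *= SCALE  # scale only the coordinates; colors stay unchanged
points.astype(np.float32).tofile("data/custom/points/000001.bin")

# Rescale a box: center and size scale by the same factor, the yaw angle does not.
box = np.array([120.0, 80.0, 50.0, 100.0, 60.0, 90.0, 0.3])
box[:6] *= SCALE
print(box)  # -> [1.2 0.8 0.5 1.  0.6 0.9 0.3]
```

The same factor has to be applied consistently everywhere the dataset stores metric values, including any generated annotation/info files.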