Input feature size and kernel size mismatch in torchsparse_cuda function

ruomingzhai commented 2 years ago

Dear Dr Robert:

Thanks again for your excellent project.

I ran into an error during training. When I run scripts/train_scannet.sh with the model name 'Res16UNet34-PointPyramid-early-ade20k-interpolate', I get the following error:

Traceback (most recent call last):
  File "/root/.local/conda/envs/py38cu102/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/.local/conda/envs/py38cu102/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/.local/share/code-server/extensions/ms-python.python-2021.8.1159798656/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module>
    cli.main()
  File "/root/.local/share/code-server/extensions/ms-python.python-2021.8.1159798656/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main
    run()
  File "/root/.local/share/code-server/extensions/ms-python.python-2021.8.1159798656/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file
    runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
  File "/root/.local/conda/envs/py38cu102/lib/python3.8/runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/root/.local/conda/envs/py38cu102/lib/python3.8/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/root/.local/conda/envs/py38cu102/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/share/code/DeepViewAgg/scannet_preprocess.py", line 163, in <module>
    initial_trainer()
  File "/root/share/code/DeepViewAgg/scannet_preprocess.py", line 90, in initial_trainer
    trainer.train()
  File "/root/share/code/DeepViewAgg/torch_points3d/trainer.py", line 147, in train
    self._train_epoch(epoch)
  File "/root/share/code/DeepViewAgg/torch_points3d/trainer.py", line 202, in _train_epoch
    self._model.optimize_parameters(epoch, self._dataset.batch_size)
  File "/root/share/code/DeepViewAgg/torch_points3d/models/base_model.py", line 245, in optimize_parameters
    self.forward(epoch=epoch) # first call forward to calculate intermediate results
  File "/root/share/code/DeepViewAgg/torch_points3d/models/segmentation/sparseconv3d.py", line 44, in forward
    features = self.backbone(self.input).x
  File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/share/code/DeepViewAgg/torch_points3d/applications/sparseconv3d.py", line 228, in forward
    data = self.down_modules[i](data)
  File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/share/code/DeepViewAgg/torch_points3d/modules/multimodal/modules.py", line 84, in forward
    mm_data_dict = self.forward_3d_block_down(
  File "/root/share/code/DeepViewAgg/torch_points3d/modules/multimodal/modules.py", line 171, in forward_3d_block_down
    x_3d = block(x_3d)
  File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/share/code/DeepViewAgg/torch_points3d/modules/SparseConv3d/modules.py", line 165, in forward
    out = self.conv_in(x)
  File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/.local/lib/python3.8/site-packages/torchsparse/nn/modules/conv.py", line 58, in forward
    return conv3d(inputs,
  File "/root/.local/lib/python3.8/site-packages/torchsparse/nn/functional/sparseconv.py", line 183, in conv3d
    output_features = sparseconv_op(features, kernel, idx_query[0],
  File "/root/.local/lib/python3.8/site-packages/torchsparse/nn/functional/sparseconv.py", line 57, in forward
    torchsparse_cuda.sparseconv_forward(features, out, kernel,
ValueError: Input feature size and kernel size mismatch

I checked the feature size of the input x, which is torch.Size([53419, 513]). It seems to follow the configuration files. I don't know why this happened.

drprojects commented 2 years ago

Hi @ruomingzhai, thanks for weeding out code errors and sharing them! Sorry I could not help earlier, I am out of office until next Monday. I assume you have found the solution to your problem by modifying the model configuration? Shall I update the released code when I come back? If so, could you please share the solution you have found? Thanks in advance!

ruomingzhai commented 2 years ago

So sorry to bother you during your vacation. I closed the issue because the error came from my own mistake: I had manually changed num_features to a constant value (4), which caused the mismatch. I don't know how to delete this issue. Just ignore my silly behavior! LOL! 🤣
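For anyone landing here with the same ValueError, here is a minimal sketch of the kind of shape check that fails. It is not torchsparse's actual code: the function name check_feature_kernel_match is hypothetical, the kernel layout (kernel_volume, in_channels, out_channels) is an assumption, and the 27 and 32 in the kernel shapes are made-up values for illustration. Only 53419, 513 and 4 come from this thread.

```python
def check_feature_kernel_match(feature_shape, kernel_shape):
    """Hypothetical stand-in for the check behind the ValueError above.

    feature_shape: (n_points, feat_channels)
    kernel_shape:  (kernel_volume, in_channels, out_channels)  # assumed layout
    Returns the output feature shape if the channels agree.
    """
    n_points, feat_channels = feature_shape
    _kernel_volume, in_channels, out_channels = kernel_shape
    if feat_channels != in_channels:
        raise ValueError(
            "Input feature size and kernel size mismatch: "
            f"features carry {feat_channels} channels, "
            f"kernel expects {in_channels}"
        )
    return (n_points, out_channels)


features = (53419, 513)          # shape reported in this issue
kernel_hardcoded = (27, 4, 32)   # num_features forced to 4 (27, 32 are illustrative)
kernel_matching = (27, 513, 32)  # first conv built for 513 input channels

try:
    check_feature_kernel_match(features, kernel_hardcoded)
except ValueError as err:
    print(err)

print(check_feature_kernel_match(features, kernel_matching))
```

The point is only that the first convolution's input channel count has to equal the width of the feature tensor that reaches it, so a hard-coded num_features that disagrees with the data fails at the first sparse conv.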

drprojects commented 2 years ago

Alright, no worries! FYI, if you need to adapt the number of input features, you may want to check the dataset's train, val and test transforms. These decide how many features the input points will carry.
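As a rough illustration of that advice, one option is to read the feature width off a transformed sample instead of hard-coding it. This is a plain-Python sketch, not the project's API: infer_num_features and the sample values are hypothetical, and in practice the features would be a tensor produced by the dataset transforms.

```python
def infer_num_features(sample_features):
    """Return the per-point feature width of a 2D list of features.

    Raises ValueError if the sample is empty or rows differ in width.
    """
    widths = {len(row) for row in sample_features}
    if len(widths) != 1:
        raise ValueError(f"inconsistent feature widths: {sorted(widths)}")
    return widths.pop()


# Hypothetical sample after the transforms: 3 points, 4 features each
sample = [
    [0.1, 0.2, 0.3, 1.5],
    [0.4, 0.5, 0.6, 0.9],
    [0.7, 0.8, 0.9, 2.1],
]
num_features = infer_num_features(sample)
print(num_features)
```

If the value obtained this way differs from the one in the model configuration, the transforms and the configuration are out of sync.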

drprojects / DeepViewAgg

Input feature size and kernel size mismatch in torchsparse_cuda function #6