dvlab-research / LargeKernel3D

LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs (CVPR 2023)
https://arxiv.org/abs/2206.10555
Apache License 2.0

stride_valid assert faild. non-contiguous stride can't handled #10

Open study1994 opened 1 year ago

study1994 commented 1 year ago

Here is my log showing the error. How can I solve this problem?

[Exception|indice_conv|subm]feat=torch.Size([137566, 16]),w=torch.Size([7, 7, 7, 16, 16]),pair=torch.Size([2, 343, 137566]),pairnum=tensor([ 5001,  4071,  2915,  2876,  2559,  2547,  2521,  4382,  5716,  4867,
         3664,  3167,  3121,  2835,  3526,  5018,  7021,  6005,  4319,  3926,
         3575,  3916,  4679,  6829,  9400,  7201,  5169,  4423,  4106,  4546,
         5253,  7129,  7777,  5756,  3978,  3274,  3504,  3874,  4283,  5309,
         6038,  4567,  2878,  2974,  3229,  3410,  3441,  4449,  5227,  4487,
         3883,  3429,  3430,  3017,  2970,  2785,  4539,  5350,  4998,  4443,
         3730,  3469,  3128,  4072,  5174,  6710,  6427,  4945,  4201,  3716,
         4024,  4918,  6864,  8956,  6847,  5047,  4072,  3896,  4419,  5554,
         7007,  7141,  5674,  4268,  3469,  3878,  4567,  4992,  5347,  5751,
         4694,  3185,  3406,  3888,  4043,  3898,  4522,  4774,  8162,  7692,
         7195,  6938,  6410,  6041,  5768,  8543,  8755,  8271,  7939,  7052,
         6395,  5980,  8122,  9124, 10307, 10506,  8607,  7326,  6615,  8280,
         9333, 11303, 13434, 10871,  8458,  6993,  7694,  8293,  9598, 10743,
         9695,  7938,  6372,  6528,  6814,  7356,  7580,  7401,  7089,  6270,
         5683,  5811,  6181,  6397,  5909,  5726,  5781, 43530, 43224, 40874,
        39942, 38939, 38032, 36751, 43545, 50091, 50867, 47721, 45472, 43093,
        39536, 44726, 52853, 62406, 66190, 56487, 49107, 44248, 47333, 54066,
        66982,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0], device='cuda:0', dtype=torch.int32),act=137566,algo=ConvAlgo.Native
no apex
No Tensorflow
SPCONV_DEBUG_SAVE_PATH not found, you can specify SPCONV_DEBUG_SAVE_PATH as debug data save path to save debug data which can be attached in a issue.
Traceback (most recent call last):
  File "SpMiddleResNetFHDLargeKernel_forward.py", line 13, in <module>
    ret, multi_scale_voxel_features,_ = center_net(voxel_features, batch_dict, voxel_coords, batch_size, dense_shape)
  File "/home/dl-01/HDD/anaconda3/envs/openpcdet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/dl-01/HDD/code/FocalsConv/CenterPoint/det3d/models/backbones/scn_largekernel.py", line 290, in forward
    x_conv1 = self.conv1(x)
  File "/home/dl-01/HDD/anaconda3/envs/openpcdet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/dl-01/HDD/anaconda3/envs/openpcdet/lib/python3.8/site-packages/spconv/pytorch/modules.py", line 138, in forward
    input = module(input)
  File "/home/dl-01/HDD/anaconda3/envs/openpcdet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/dl-01/HDD/code/FocalsConv/CenterPoint/det3d/models/backbones/scn_largekernel.py", line 181, in forward
    out = self.conv1(x)
  File "/home/dl-01/HDD/anaconda3/envs/openpcdet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/dl-01/HDD/code/FocalsConv/CenterPoint/det3d/models/backbones/scn_largekernel.py", line 92, in forward
    x_conv_block = self.block(x_conv)
  File "/home/dl-01/HDD/anaconda3/envs/openpcdet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/dl-01/HDD/anaconda3/envs/openpcdet/lib/python3.8/site-packages/spconv/pytorch/conv.py", line 755, in forward
    return self._conv_forward(self.training,
  File "/home/dl-01/HDD/anaconda3/envs/openpcdet/lib/python3.8/site-packages/spconv/pytorch/conv.py", line 327, in _conv_forward
    out_features = Fsp.indice_subm_conv(
  File "/home/dl-01/HDD/anaconda3/envs/openpcdet/lib/python3.8/site-packages/torch/cuda/amp/autocast_mode.py", line 118, in decorate_fwd
    return fwd(*args, **kwargs)
  File "/home/dl-01/HDD/anaconda3/envs/openpcdet/lib/python3.8/site-packages/spconv/pytorch/functional.py", line 327, in forward
    raise e
  File "/home/dl-01/HDD/anaconda3/envs/openpcdet/lib/python3.8/site-packages/spconv/pytorch/functional.py", line 308, in forward
    return ops.indice_conv(features,
  File "/home/dl-01/HDD/anaconda3/envs/openpcdet/lib/python3.8/site-packages/spconv/pytorch/ops.py", line 860, in indice_conv
    ConvGemmOps.indice_conv(alloc, ext_mm, GEMM_CPP, ALL_WEIGHT_IS_KRSC,
RuntimeError: /tmp/pip-build-env-h7b1kdx5/overlay/lib/python3.8/site-packages/cumm/include/tensorview/tensor.h(770)
stride_valid assert faild. non-contiguous stride can't handled.
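
For readers unfamiliar with the term, the sketch below shows what a "non-contiguous stride" means for a tensor shaped like the weight in the log, `(7, 7, 7, 16, 16)`. It is a conceptual illustration in plain Python, not the actual check in cumm's `tensor.h`. The helper names and the permutation are hypothetical, and the thread does not establish which tensor in the failing call is non-contiguous.

```python
from typing import Sequence, Tuple


def contiguous_strides(shape: Sequence[int]) -> Tuple[int, ...]:
    """Row-major (C-order) strides, in elements, for a dense tensor of `shape`."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def permute(values: Sequence[int], order: Sequence[int]) -> Tuple[int, ...]:
    """Reorder a shape or stride tuple the way a permuted view would."""
    return tuple(values[i] for i in order)


def is_contiguous(shape: Sequence[int], strides: Sequence[int]) -> bool:
    """True if `strides` match the dense row-major layout for `shape`."""
    expected = contiguous_strides(shape)
    # A dimension of size 1 can carry any stride without changing the layout.
    return all(d == 1 or s == e for d, s, e in zip(shape, strides, expected))


# Hypothetical case: a buffer stored densely as (16, 7, 7, 7, 16) and then
# viewed as (7, 7, 7, 16, 16) by permuting axes without copying the data.
stored_shape = (16, 7, 7, 7, 16)
stored_strides = contiguous_strides(stored_shape)
order = (1, 2, 3, 0, 4)
view_shape = permute(stored_shape, order)
view_strides = permute(stored_strides, order)

print("view shape:  ", view_shape)
print("view strides:", view_strides)
print("contiguous:  ", is_contiguous(view_shape, view_strides))
```

In PyTorch the same property can be inspected with `tensor.is_contiguous()` and `tensor.stride()`, and `tensor.contiguous()` returns a densely laid out copy. Whether adding such a call resolves this particular error is not confirmed anywhere in the thread.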

lda1049187465 commented 1 year ago

I have the same problem. Did you solve it?

yukang2017 commented 1 year ago

Sorry for the late reply. Would you please provide the version information, torch and spconv?
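
One possible way to collect the requested version information is sketched below, using only the standard library. The list of distribution names is an assumption: spconv is published under CUDA-specific names such as the `spconv-cu116` mentioned later in this thread, so the list may need adjusting for a given environment.

```python
from importlib import metadata

# Assumed distribution names; edit to match the CUDA build that was installed.
CANDIDATES = (
    "torch",
    "spconv",
    "spconv-cu113",
    "spconv-cu116",
    "cumm",
    "cumm-cu113",
    "cumm-cu116",
)


def installed_versions(names=CANDIDATES):
    """Return {distribution name: version} for the names that are installed."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed under this name; skip it.
            continue
    return found


def format_report(versions):
    """Render the versions as pip-style lines suitable for pasting in an issue."""
    if not versions:
        return "no matching packages found"
    return "\n".join(f"{name}=={ver}" for name, ver in sorted(versions.items()))


print(format_report(installed_versions()))
```

Run it inside the same environment that produced the traceback (here, the `openpcdet` conda environment) so the reported versions match the failing setup.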

Karkers commented 9 months ago

Same problem. torch 1.10, cu113

Tream733 commented 5 months ago

I have the same problem. torch 1.12.0+cu116; spconv-cu116 2.3.6

Tream733 commented 5 months ago

"Sorry for the late reply. Would you please provide the version information, torch and spconv?"

It works. See https://github.com/dvlab-research/LargeKernel3D/issues/11