OpenGVLab / DCNv4

[CVPR 2024] Deformable Convolution v4
https://arxiv.org/pdf/2401.06197.pdf
MIT License

Calling DCNv4 256 times in a loop triggers the error: error in dcnv4_im2col_cuda: an illegal memory access was encountered launch arguments: gridDim=(1600, 1, 1), blockDim=(16, 4, 8), shm_size=0 #78

Open BUG423 opened 4 days ago

BUG423 commented 4 days ago

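As the title says, I hit the following error while building a backbone with DCNv4:

    error in dcnv4_im2col_cuda: an illegal memory access was encountered launch arguments: gridDim=(1600, 1, 1), blockDim=(16, 4, 8), shm_size=0

This puzzles me, because it looks like a CUDA-level problem. My input was originally (1, 200, 16) in NLC format. To narrow the problem down, I changed the NLC dimensions extensively and ran the layer on its own in a loop:

    for i in range(1024):
        print(i)
        # Run the DCNv4 layer
        output = dcnv4_layer(input_data, (1, 50))

        # Shape and values of the output
        print(f"Input shape: {input_data.shape}")
        print(f"Output shape: {output.shape}")
        # print(f"Output: {output}")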

During these tests I found that when i=255, that is, on the 256th call, the error is always triggered: error in dcnv4_im2col_cuda: an illegal memory access was encountered launch arguments: gridDim=(1600, 1, 1), blockDim=(16, 4, 8), shm_size=0
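For anyone trying to reproduce this, here is a minimal sketch of a harness that reports the first failing iteration. It is not DCNv4 code: `first_failing_iteration` and `make_fake_layer` are hypothetical names, and the fake layer only imitates a failure on the 256th call. CUDA kernel errors are reported asynchronously, so in a real PyTorch run one would pass `torch.cuda.synchronize` as `sync` (or set the environment variable `CUDA_LAUNCH_BLOCKING=1`) so that the error is attributed to the iteration that actually caused it.

```python
from typing import Callable, Optional


def first_failing_iteration(
    step: Callable[[int], object],
    max_iters: int,
    sync: Optional[Callable[[], None]] = None,
) -> Optional[int]:
    """Call step(i) for i in range(max_iters).

    Returns the first i whose call raises RuntimeError, or None if all pass.
    """
    for i in range(max_iters):
        try:
            step(i)
            if sync is not None:
                # Wait for pending asynchronous work so that an error
                # surfaces in the iteration that launched it.
                sync()
        except RuntimeError as exc:
            print(f"iteration {i} failed: {exc}")
            return i
    return None


def make_fake_layer(fail_on_call: int) -> Callable[[int], object]:
    """Hypothetical stand-in for the real layer: raises on the given call."""
    calls = 0

    def fake_layer(i: int) -> int:
        nonlocal calls
        calls += 1
        if calls == fail_on_call:
            raise RuntimeError("an illegal memory access was encountered")
        return calls

    return fake_layer


# The fake layer fails on its 256th call, which is loop index 255.
failed_at = first_failing_iteration(make_fake_layer(256), max_iters=1024)
print(failed_at)
```

Swapping `make_fake_layer(256)` for a closure around the real layer call keeps the loop unchanged, which makes it easier to compare DCNv4 against other layers under identical conditions.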

I'm sorry, but I am not able to solve this myself. If anyone can, please contact me. I would be very grateful!


BUG423 commented 21 hours ago

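To verify the problem I ran many more tests, and it gradually became clear that it behaves like some kind of accumulation. When I reduce the product of the NCHW dimensions, for example from (1024, 64, 1, 16) to (1, 64, 1, 1), the loop can run 1021 times. I also tested DCNv3 and hit the same problem. I really hope someone can help work on this together!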
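To put the two shapes in perspective, the sketch below compares their element counts and byte sizes with the iteration counts reported above. It assumes a float32 input and assumes that the 256-iteration failure was observed with the larger shape; neither is stated explicitly in this thread. `tensor_footprint` is a hypothetical helper name.

```python
from math import prod

BYTES_PER_FLOAT32 = 4  # assumption: the input dtype is float32


def tensor_footprint(shape):
    """Return (element count, size in bytes) for a dense float32 tensor."""
    elements = prod(shape)
    return elements, elements * BYTES_PER_FLOAT32


large = tensor_footprint((1024, 64, 1, 16))
small = tensor_footprint((1, 64, 1, 1))

# How much smaller the input became versus how much longer the loop survived.
size_ratio = large[0] // small[0]
iteration_ratio = 1021 / 256

print(f"large: {large[0]} elements, {large[1]} bytes")
print(f"small: {small[0]} elements, {small[1]} bytes")
print(f"size ratio: {size_ratio}x, iteration ratio: {iteration_ratio:.2f}x")
```

Under those assumptions the input shrinks by a factor of 16384 while the number of surviving iterations grows only about fourfold, so the failure point does not appear to scale in proportion to the tensor size.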