OpenGVLab / DCNv4

[CVPR 2024] Deformable Convolution v4
https://arxiv.org/pdf/2401.06197.pdf
MIT License

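Hello authors, your work is excellent, but I hit a runtime error when using it: RuntimeError: Not implemented on the CPU. I don't know how to fix it yet. Is there anything else I need to do after a direct install? #68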

Open xpbag opened 3 months ago

xpbag commented 3 months ago

output = ext.dcnv4_forward(*args)
         ^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Not implemented on the CPU
Exception raised from dcnv4_forward at D:\learningJournal\Detection\YOLO\ultralytics-main\ultralytics\nn\DCNv4_op\src\dcnv4.h:82 (most recent call first):
00007FF85739366200007FF857393600 c10.dll!c10::Error::Error [ @ ]
00007FF85739311A00007FF8573930C0 c10.dll!c10::detail::torchCheckFail [ @ ]
00007FFF7500067B00007FFF74FF56E0 ext.cp311-win_amd64.pyd!c10::ivalue::Object::operator= [ @ ]
00007FFF7500A0DD00007FFF750007E0 ext.cp311-win_amd64.pyd!PyInit_ext [ @ ]
00007FFF7500A14400007FFF750007E0 ext.cp311-win_amd64.pyd!PyInit_ext [ @ ]
00007FFF74FFE01B00007FFF74FF56E0 ext.cp311-win_amd64.pyd!c10::ivalue::Object::operator= [ @ ]
00007FF85EBD42CC00007FF85EBD3550 python311.dll!PyCFunction_GetFlags [ @ ]
00007FF85EB8F67800007FF85EB8F5D0 python311.dll!PyObject_Call [ @ ]
00007FF85EC8D88400007FF85EC8D320 python311.dll!PyEval_GetFuncDesc [ @ ]
00007FF85EC8903F00007FF85EC83780 python311.dll!PyEval_EvalFrameDefault [ @ ]
00007FF85EC8BD0E00007FF85EC83780 python311.dll!PyEval_EvalFrameDefault [ @ ]
00007FF85EB8F76D00007FF85EB8F730 python311.dll!PyFunction_Vectorcall [ @ ]
00007FF85EB8F50E00007FF85EB8F420 python311.dll!PyVectorcall_Function [ @ ]
00007FF85EB8F61F00007FF85EB8F5D0 python311.dll!PyObject_Call [ @ ]
00007FF85EC8D7BD00007FF85EC8D320 python311.dll!PyEval_GetFuncDesc [ @ ]
00007FF85EC8903F00007FF85EC83780 python311.dll!PyEval_EvalFrameDefault [ @ ]
00007FF85EC8BD0E00007FF85EC83780 python311.dll!PyEval_EvalFrameDefault [ @ ]
00007FF85EB8F76D00007FF85EB8F730 python311.dll!PyFunction_Vectorcall [ @ ]
00007FF85EB8F50E00007FF85EB8F420 python311.dll!PyVectorcall_Function [ @ ]
00007FF85EB8F61F00007FF85EB8F5D0 python311.dll!PyObject_Call [ @ ]
00007FFF79C502C100007FFF79C3CF00 torch_python.dll!THPPointer::THPPointer [ @ ]
00007FF85EBD430600007FF85EBD3550 python311.dll!PyCFunction_GetFlags [ @ ]
00007FF85EB8F67800007FF85EB8F5D0 python311.dll!PyObject_Call [ @ ]
00007FF85EC8D88400007FF85EC8D320 python311.dll!PyEval_GetFuncDesc [ @ ]
00007FF85EC8903F00007FF85EC83780 python311.dll!PyEval_EvalFrameDefault [ @ ]
00007FF85EC8BD0E00007FF85EC83780 python311.dll!PyEval_EvalFrameDefault [ @ ]
00007FF85EB8F76D00007FF85EB8F730 python311.dll!PyFunction_Vectorcall [ @ ]
00007FF85EB9154900007FF85EB91190 python311.dll!PyCell_Set [ @ ]
00007FF85EB91B5100007FF85EB91960 python311.dll!PyMethod_Self [ @ ]
00007FF85EB8F50E00007FF85EB8F420 python311.dll!PyVectorcall_Function [ @ ]
00007FF85EB8F61F00007FF85EB8F5D0 python311.dll!PyObject_Call [ @ ]
00007FF85EC8D7BD00007FF85EC8D320 python311.dll!PyEval_GetFuncDesc [ @ ]
00007FF85EC8903F00007FF85EC83780 python311.dll!PyEval_EvalFrameDefault [ @ ]
00007FF85EC8BD0E00007FF85EC83780 python311.dll!PyEval_EvalFrameDefault [ @ ]
00007FF85EB8F76D00007FF85EB8F730 python311.dll!PyFunction_Vectorcall [ @ ]
00007FF85EB9154900007FF85EB91190 python311.dll!PyCell_Set [ @ ]
00007FF85EB91B5100007FF85EB91960 python311.dll!PyMethod_Self [ @ ]
00007FF85EB8F50E00007FF85EB8F420 python311.dll!PyVectorcall_Function [ @ ]
00007FF85EB8F61F00007FF85EB8F5D0 python311.dll!PyObject_Call [ @ ]
00007FF85EC8D7BD00007FF85EC8D320 python311.dll!PyEval_GetFuncDesc [ @ ]
00007FF85EC8903F00007FF85EC83780 python311.dll!PyEval_EvalFrameDefault [ @ ]
00007FF85EC8BD0E00007FF85EC83780 python311.dll!PyEval_EvalFrameDefault [ @ ]
00007FF85EB8F76D00007FF85EB8F730 python311.dll!PyFunction_Vectorcall [ @ ]
00007FF85EB8F25400007FF85EB8F180 python311.dll!PyObject_FastCallDictTstate [ @ ]
00007FF85EB8F9F200007FF85EB8F950 python311.dll!PyObject_Call_Prepend [ @ ]
00007FF85EBFAAF400007FF85EBF72D0 python311.dll!PyType_Ready [ @ ]
00007FF85EB8F3D100007FF85EB8F2B0 python311.dll!PyObject_MakeTpCall [ @ ]
00007FF85EB8F59100007FF85EB8F570 python311.dll!PyObject_Vectorcall [ @ ]
00007FF85EC87F1A00007FF85EC83780 python311.dll!PyEval_EvalFrameDefault [ @ ]
00007FF85EC8BD0E00007FF85EC83780 python311.dll!PyEval_EvalFrameDefault [ @ ]
00007FF85EC8346000007FF85EC833C0 python311.dll!PyEval_EvalCode [ @ ]
00007FF85EC7E24900007FF85EC73AB0 python311.dll!PyWarnings_Init [ @ ]
00007FF85EC7BCAE00007FF85EC73AB0 python311.dll!PyWarnings_Init [ @ ]
00007FF85EBD3F4800007FF85EBD3550 python311.dll!PyCFunction_GetFlags [ @ ]
00007FF85EB8F50E00007FF85EB8F420 python311.dll!PyVectorcall_Function [ @ ]
00007FF85EB8F61F00007FF85EB8F5D0 python311.dll!PyObject_Call [ @ ]
00007FF85EC8D88400007FF85EC8D320 python311.dll!PyEval_GetFuncDesc [ @ ]
00007FF85EC8903F00007FF85EC83780 python311.dll!PyEval_EvalFrameDefault [ @ ]
00007FF85EC8BD0E00007FF85EC83780 python311.dll!PyEval_EvalFrameDefault [ @ ]
00007FF85EB8F76D00007FF85EB8F730 python311.dll!PyFunction_Vectorcall [ @ ]
00007FF85EB8F01900007FF85EB8EF20 python311.dll!PyBytes_Repeat [ @ ]
00007FF85EB902EE00007FF85EB90100 python311.dll!PyObject_CallMethodId_SizeT [ @ ]
00007FF85EB9046400007FF85EB90400 python311.dll!PyObject_CallMethodObjArgs [ @ ]
00007FF85ECC13AE00007FF85ECC0B20 python311.dll!PyImport_ImportModuleNoBlock [ @ ]

BUG423 commented 1 month ago

Dude, isn't the message clear enough? RuntimeError: Not implemented on the CPU. Run it through GPT for a translation and you wouldn't even need to ask.

xpbag commented 1 month ago

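> Dude, isn't the message clear enough? RuntimeError: Not implemented on the CPU. Run it through GPT for a translation and you wouldn't even need to ask.

Bro, I may be a beginner, but I do understand what this error means. I debugged it at the time and still could not resolve it.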

BUG423 commented 1 month ago

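Just move your tensor and your model onto the same GPU device, for example:

    import torch
    # Assumption: DCNv4 is imported from wherever the compiled op is installed.

    if __name__ == '__main__':
        # Create the layer once, outside the loop, so it is not rebuilt on every iteration
        layer = DCNv4(channels=16, kernel_size=3, stride=1, pad=1, dilation=1, group=1).to("cuda:1")
        tensor = torch.rand(1024, 1, 16).to("cuda:1")  # BLC
        # Note: the input layout is now BLC
        for i in range(512):
            result = layer(tensor)
            print(i, "result.shape:", result.shape)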
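The advice above amounts to making sure nothing reaches the CUDA-only op while it is still on the CPU. A minimal sketch of a pre-flight check follows. `find_off_gpu` and `check_all_on_gpu` are hypothetical helper names, not part of DCNv4, and they rely only on the `.device.type` attribute that PyTorch tensors and parameters expose.

```python
def find_off_gpu(named_tensors):
    """Return the names of all entries that are not on a CUDA device.

    `named_tensors` is an iterable of (name, tensor-like) pairs. Each value
    only needs a `.device.type` attribute, as torch tensors have.
    """
    return [name for name, t in named_tensors if t.device.type != "cuda"]


def check_all_on_gpu(named_tensors):
    """Raise a readable error before the compiled extension raises its own."""
    offenders = find_off_gpu(named_tensors)
    if offenders:
        raise RuntimeError(
            "still on CPU (move with .to('cuda')): " + ", ".join(offenders)
        )
```

With a real model this could be called as `check_all_on_gpu(list(layer.named_parameters()) + [("input", tensor)])` just before `layer(tensor)`, which names the object that was left on the CPU.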

SWALLOWhajnal commented 5 days ago

Hello, have you ever run into an out-of-bounds memory access when using DCNv4? The error I get is odd:

RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions
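The error text above suggests `CUDA_LAUNCH_BLOCKING=1`. A minimal sketch of setting it from Python follows. The variable has to be in the environment before CUDA is initialized, so the assignment sits ahead of any `import torch`; the commented-out lines are placeholders for the actual failing script.

```python
import os

# Make kernel launches synchronous so the Python stack trace points at the
# call that actually faulted, rather than at a later, unrelated CUDA call.
# This must run before CUDA is initialized, so keep it above `import torch`.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch   # import only after the variable is set
# ...run the failing forward/backward pass here (placeholder)...
```

Synchronous launches are slow, so this setting is meant for debugging runs only.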

I traced the problem to this file: dcnv4_col2im_cuda.cuh

In the source:

    int shm_size = sizeof(opmath_t) * (G * block_multiplier * K) * 2;
    if (!check_backward_warpp(d_stride, D)) {
      shm_size = sizeof(opmath_t) * ((G * block_multiplier * K) * 2 + G * block_multiplier * blockdimX * 3);
    }
    cudaFuncSetAttribute(kernel, cudaFuncAttributeMaxDynamicSharedMemorySize, shm_size);

    kernel<<<num_blocks, num_threads, shm_size, stream>>>(
        value, p_offset, grad_output, G, D, Q, kernel_h, kernel_w, stride_h,
        stride_w, pad_h, pad_w, dilation_h, dilation_w, height_in, width_in,
        height_out, width_out, offset_scale, remove_center, block_multiplier,
        grad_im, grad_offset, padded_offset_dim);

    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
      printf("error in dcnv4_im2col_cuda: %s\n", cudaGetErrorString(err));
      printf("launch arguments: gridDim=(%d, %d, %d), blockDim=(%d, %d, %d), "
             "shm_size=%d\n\n",
             num_blocks.x, num_blocks.y, num_blocks.z, num_threads.x,
             num_threads.y, num_threads.z, shm_size);
      AT_ASSERTM(false, "kernel launch error");
    }
    }

I don't know how you all modified this; whatever I change, I still get an illegal memory access.

error in dcnv4_im2col_cuda: an illegal memory access was encountered
launch arguments: gridDim=(8192, 1, 1), blockDim=(12, 16, 1), shm_size=3456

/DCNv4_op/src/cuda/dcnv4_col2im_cuda.cuh":470, please report a bug to PyTorch. kernel launch error
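As a sanity check on the launch arguments above, the `shm_size` expression from the quoted source can be re-evaluated by hand, reading the adjacent factors as products. The sketch below assumes `K = 9` (a 3x3 kernel with `remove_center` off), a 4-byte `opmath_t`, and that `blockDim` is `(blockdimX, G * block_multiplier, 1)`. None of these is stated in the thread, and `col2im_shm_size` is a hypothetical helper name.

```python
def col2im_shm_size(g_times_mult, k, blockdim_x, sizeof_opmath=4, warp_path=False):
    """Recompute the dynamic shared-memory request (in bytes) for the backward kernel.

    g_times_mult  : G * block_multiplier (assumed equal to blockDim.y)
    k             : number of sampling points per group (assumed 9)
    blockdim_x    : blockDim.x, i.e. `blockdimX` in the quoted source
    sizeof_opmath : sizeof(opmath_t) in bytes (assumed 4, i.e. float)
    warp_path     : True when check_backward_warpp(...) holds in the source
    """
    elements = g_times_mult * k * 2
    if not warp_path:
        # Extra scratch space used when the warp-level reduction is unavailable.
        elements += g_times_mult * blockdim_x * 3
    return sizeof_opmath * elements


# Values taken from the reported launch: blockDim=(12, 16, 1)
reported = col2im_shm_size(g_times_mult=16, k=9, blockdim_x=12)
print(reported)
```

Under those assumptions the non-warp branch gives 4 * (16 * 9 * 2 + 16 * 12 * 3) = 3456 bytes, which matches the reported `shm_size`. That is far below the 48 KB of dynamic shared memory a block can request without opting in, which suggests the size of the request is not itself the cause of the illegal access.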