DNN module fails to compile against cuDNN 9.0

cudawarped commented 3 months ago

System Information

OpenCV version: 4.x (09/02/2024)
OS: Windows 11
Compiler: VS 2022
CUDA: 12.3
cuDNN: 9.0

Detailed description

Switching from cuDNN 8.9.7 to 9.0 results in the following build error

D:\repos\opencv\opencv\modules\dnn\src\cuda4dnn\csl\cudnn/recurrent.hpp(122): error C3861: 'cudnnSetRNNDescriptor_v6': identifier not found

when compiling the DNN module.

Full error trace

```
[372/491] Building CXX object modules\dnn\CMakeFiles\opencv_dnn.dir\Release\src\layers\recurrent_layers.cpp.obj
FAILED: modules/dnn/CMakeFiles/opencv_dnn.dir/Release/src/layers/recurrent_layers.cpp.obj
C:\PROGRA~1\MICROS~2\2022\COMMUN~1\VC\Tools\MSVC\1436~1.325\bin\Hostx64\x64\cl.exe /nologo /TP -DCVAPI_EXPORTS -DCV_CUDA4DNN=1 -DCV_OCL4DNN=1 -DENABLE_PLUGINS -DHAVE_FLATBUFFERS=1 -DHAVE_PROTOBUF=1 -D_CRT_SECURE_NO_WARNINGS=1 -D_USE_MATH_DEFINES -D_VARIADIC_MAX=10 -D_WIN32_WINNT=0x0601 -D__OPENCV_BUILD=1 -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -DCMAKE_INTDIR=\"Release\" -ID:\build\opencv\cuda_12_3_dnn\3rdparty\ippicv\ippicv_win\icv\include -ID:\build\opencv\cuda_12_3_dnn\3rdparty\ippicv\ippicv_win\iw\include -ID:\repos\opencv\opencv\modules\dnn\src -ID:\repos\opencv\opencv\modules\dnn\include -ID:\build\opencv\cuda_12_3_dnn\modules\dnn -ID:\repos\opencv\contrib\modules\cudev\include -ID:\repos\opencv\opencv\modules\core\include -ID:\repos\opencv\opencv\modules\imgproc\include -ID:\repos\opencv\opencv\modules\dnn\misc\caffe -ID:\repos\opencv\opencv\modules\dnn\misc\tensorflow -ID:\repos\opencv\opencv\modules\dnn\misc\onnx -ID:\repos\opencv\opencv\modules\dnn\misc\tflite -ID:\repos\opencv\opencv\3rdparty\include\opencl\1.2 -ID:\repos\opencv\opencv\modules\ts\include -ID:\repos\opencv\opencv\modules\imgcodecs\include -ID:\repos\opencv\opencv\modules\videoio\include -ID:\repos\opencv\opencv\modules\highgui\include -external:ID:\build\opencv\cuda_12_3_dnn -external:I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include" -external:ID:\repos\opencv\opencv\3rdparty\flatbuffers\include -external:ID:\repos\opencv\opencv\3rdparty\protobuf\src -external:W0 /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /FS /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /wd4244 /wd4267 /wd4018 /wd4355 /wd4800 /wd4251 /wd4996 /wd4146 /wd4305 /wd4127 /wd4100 /wd4512 /wd4125 /wd4389 /wd4510 /wd4610 /wd4702 /wd4456 /wd4457 /wd4065 /wd4310 /wd4661 /wd4506 /wd4125 /wd4267 /wd4127 /wd4244 /wd4512 /wd4702 /wd4456 /wd4510 /wd4610 /wd4800 /wd4701 /wd4703 /wd4505 /wd4458 /O2 /Ob2 /DNDEBUG /Zi -MD /showIncludes /Fomodules\dnn\CMakeFiles\opencv_dnn.dir\Release\src\layers\recurrent_layers.cpp.obj /Fdlib\Release\opencv_dnn490.pdb /FS -c D:\repos\opencv\opencv\modules\dnn\src\layers\recurrent_layers.cpp
D:\repos\opencv\opencv\modules\dnn\src\cuda4dnn\csl\cudnn/recurrent.hpp(122): error C3861: 'cudnnSetRNNDescriptor_v6': identifier not found
D:\repos\opencv\opencv\modules\dnn\src\cuda4dnn\csl\cudnn/recurrent.hpp(100): note: while compiling class template member function 'cv::dnn::cuda4dnn::csl::cudnn::RNNDescriptor::RNNDescriptor(const cv::dnn::cuda4dnn::csl::cudnn::Handle &,cv::dnn::cuda4dnn::csl::cudnn::RNNDescriptor::RNNMode,int,int,bool,const cv::dnn::cuda4dnn::csl::cudnn::DropoutDescriptor &)'
        with [ T=float ]
D:\repos\opencv\opencv\modules\dnn\src\cuda4dnn\primitives\../csl/tensor_ops.hpp(541): note: see the first reference to 'cv::dnn::cuda4dnn::csl::cudnn::RNNDescriptor::RNNDescriptor' in 'cv::dnn::cuda4dnn::csl::LSTM::LSTM'
        with [ T=float ]
D:\repos\opencv\opencv\modules\dnn\src\layers\../cuda4dnn/primitives/recurrent_cells.hpp(48): note: see the first reference to 'cv::dnn::cuda4dnn::csl::LSTM::LSTM' in 'cv::dnn::cuda4dnn::LSTMOp::LSTMOp'
        with [ T=float ]
D:\repos\opencv\opencv\modules\dnn\src\cuda4dnn\primitives\../../op_cuda.hpp(196): note: see the first reference to 'cv::dnn::cuda4dnn::LSTMOp::LSTMOp' in 'cv::dnn::make_cuda_node'
D:\repos\opencv\opencv\modules\dnn\src\cuda4dnn\primitives\../csl/tensor_ops.hpp(511): note: see reference to class template instantiation 'cv::dnn::cuda4dnn::csl::cudnn::RNNDescriptor' being compiled
        with [ T=float ]
D:\repos\opencv\opencv\modules\dnn\src\layers\../cuda4dnn/primitives/recurrent_cells.hpp(88): note: see reference to class template instantiation 'cv::dnn::cuda4dnn::csl::LSTM' being compiled
        with [ T=float ]
D:\repos\opencv\opencv\modules\dnn\src\cuda4dnn\primitives\../../op_cuda.hpp(196): note: see reference to class template instantiation 'cv::dnn::cuda4dnn::LSTMOp' being compiled
D:\repos\opencv\opencv\modules\dnn\src\layers\recurrent_layers.cpp(763): note: see reference to function template instantiation 'cv::Ptr cv::dnn::make_cuda_node(int,cv::dnn::cuda4dnn::csl::Stream &&,cv::dnn::cuda4dnn::csl::cudnn::Handle &&,cv::Mat &,cv::Mat &,cv::Mat &,cv::dnn::cuda4dnn::RNNConfiguration &)' being compiled
[393/491] Building CXX object modules\dnn\CMakeFiles\opencv_dnn.dir\Release\src\layers\split_layer.cpp.obj
ninja: build stopped: subcommand failed.
```

Steps to reproduce

cmake --build . --target opencv_dnn

Issue submission checklist

[X] I report the issue, it's not a question
[X] I checked the problem with documentation, FAQ, open issues, forum.opencv.org, Stack Overflow, etc and have not found any solution
[X] I updated to the latest OpenCV version and the issue is still there
[X] There is reproducer code and related data files (videos, images, onnx, etc)

pfmephisto commented 2 months ago

I have what I think is the same issue on Ubuntu 22.04.3 LTS:

error: ‘cudnnSetRNNDescriptor_v6’ was not declared in this scope; did you mean ‘cudnnSetRNNDescriptor_v8’?

Tested on OpenCV 4.9.0 and the 4.x dev branch.

cudawarped commented 2 months ago

> error: ‘cudnnSetRNNDescriptor_v6’ was not declared in this scope; did you mean ‘cudnnSetRNNDescriptor_v8’?

No, although I suspect it's the same issue if you're using cuDNN 9.0. You can see my error in the original issue if you expand "Full error trace".

pfmephisto commented 2 months ago

Indeed I am on cuDNN 9.0.0-1. It appears to me that it's the default version. I followed Nvidia's guide for installing using their sources, and cuDNN 9 is the only available version.

apt-cache search cudnn

```
➤ apt-cache search cudnn
nvidia-cudnn - NVIDIA CUDA Deep Neural Network library (install script)
libcudnn8 - cuDNN runtime libraries
libcudnn8-dev - cuDNN development libraries and headers
libcudnn8-samples - cuDNN samples
cudnn - NVIDIA CUDA Deep Neural Network library (cuDNN)
cudnn9 - NVIDIA CUDA Deep Neural Network library (cuDNN)
cudnn9-cuda-11-8 - NVIDIA cuDNN for CUDA 11.8
cudnn9-cuda-11 - NVIDIA cuDNN for CUDA 11
cudnn9-cuda-12-3 - NVIDIA cuDNN for CUDA 12.3
cudnn9-cuda-12 - NVIDIA cuDNN for CUDA 12
libcudnn9-cuda-11 - cuDNN runtime libraries for CUDA 11.8
libcudnn9-cuda-12 - cuDNN runtime libraries for CUDA 12.3
libcudnn9-dev-cuda-11 - cuDNN development headers and symlinks for CUDA 11.8
libcudnn9-dev-cuda-12 - cuDNN development headers and symlinks for CUDA 12.3
libcudnn9-samples - cuDNN samples
libcudnn9-static-cuda-11 - cuDNN static libraries for CUDA 11.8
libcudnn9-static-cuda-12 - cuDNN static libraries for CUDA 12.3
```

apt-cache show cudnn

```
➤ apt-cache show cudnn
Package: cudnn
Version: 9.0.0-1
Architecture: amd64
Priority: optional
Section: multiverse/devel
Maintainer: cudatools
Installed-Size: 7
Depends: cudnn9 (>= 9.0.0)
Filename: ./cudnn_9.0.0-1_amd64.deb
Size: 2414
MD5sum: f29f79064f8fd192b766dc4faabc2506
SHA1: 653caa201310e3bbaac2f647f0f441925aeaa0f0
SHA256: 1efa4db76754bb59b7c5c9f7a4b012127ec73f900d94624dea314e8e77303496
SHA512: 520f532cd525581720e34ad5f8925348412e68014de2c4fbe222810770dc5ab6252e6474bcc0b792b431e7f7d31a8ad0a64799a7b77f762eb5fc319e9d1a1220
Description: NVIDIA CUDA Deep Neural Network library (cuDNN)
 NVIDIA CUDA Deep Neural Network library (cuDNN)
Description-md5: 0fdfd21e8870349f974c27dcb69de946

Package: cudnn
Version: 9.0.0-1
Architecture: amd64
Priority: optional
Section: multiverse/devel
Maintainer: cudatools
Installed-Size: 7
Depends: cudnn9 (>= 9.0.0)
Filename: ./cudnn_9.0.0-1_amd64.deb
Size: 2414
MD5sum: aa0503956dee8137cb73ffa7401dc8ca
SHA1: c34a7b559b875b2136393b0a876eea0cd3ffe4cf
SHA256: c7f1d687e9d9222298b06c7d5eccb83862a59cc2eba1441275fe9f4fb29bdc71
SHA512: f42a571497a71f17a016dde9d7e31d96ec9a7c8f2fc329d3761b85c19d2a4e3e2dbfd4ff1704c7963b750ebb2958c3dc623b8c302b8c4d8cd939cdb47fa68d96
Description: NVIDIA CUDA Deep Neural Network library (cuDNN)
 NVIDIA CUDA Deep Neural Network library (cuDNN)
Description-md5: 0fdfd21e8870349f974c27dcb69de946
```

apt list --installed | grep cudnn

```
➤ apt list --installed | grep cudnn

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

cudnn9-cuda-12-3/unknown,unknown,now 9.0.0.312-1 amd64 [installed,automatic]
cudnn9-cuda-12/unknown,unknown,now 9.0.0.312-1 amd64 [installed,automatic]
cudnn9/unknown,now 9.0.0-1 amd64 [installed,automatic]
cudnn/unknown,now 9.0.0-1 amd64 [installed]
libcudnn9-cuda-12/unknown,unknown,now 9.0.0.312-1 amd64 [installed,automatic]
libcudnn9-dev-cuda-12/unknown,unknown,now 9.0.0.312-1 amd64 [installed,automatic]
libcudnn9-samples/unknown,unknown,now 9.0.0.312-1 all [installed,automatic]
libcudnn9-static-cuda-12/unknown,unknown,now 9.0.0.312-1 amd64 [installed,automatic]
```

cudawarped commented 2 months ago

The older versions are available under archived releases.

pfmephisto commented 2 months ago

@cudawarped Thanks; cuDNN 8.9.7 fixed the issue for me.

RaceIsIm commented 2 months ago

Having same issue here but installing cuDNN 8.9.7 did not fix it for me 😔

cudawarped commented 2 months ago

@RaceIsIm Can you check your CMake configuration output to make sure you are using 8.9.7 and not still using 9.0, i.e.

-- cuDNN: YES (ver 8.9.7)
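For anyone who wants to automate that check, here is a minimal Python sketch that pulls the cuDNN version out of saved CMake configure output. It assumes the summary line has the `cuDNN: YES (ver X.Y.Z)` shape quoted above; the function names and the "major version below 9" rule are illustrative helpers based on the reports in this thread, not part of OpenCV or CMake.

```python
import re

# Matches the summary line quoted in this thread, e.g. "-- cuDNN: YES (ver 8.9.7)".
# The patch component is treated as optional in case only "X.Y" is printed.
CUDNN_LINE = re.compile(r"cuDNN:\s+YES\s+\(ver\s+(\d+)\.(\d+)(?:\.(\d+))?\)")


def cudnn_version_from_configure_log(log_text):
    """Return (major, minor, patch) found in CMake configure output, or None."""
    match = CUDNN_LINE.search(log_text)
    if match is None:
        return None
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def builds_per_this_thread(version):
    """Hypothetical rule taken from the reports here: 8.x builds, 9.x fails."""
    return version is not None and version[0] < 9


if __name__ == "__main__":
    sample = "--   cuDNN:                         YES (ver 8.9.7)"
    version = cudnn_version_from_configure_log(sample)
    print(version, builds_per_this_thread(version))
```

To use it, redirect the configure output to a file and pass the file contents to `cudnn_version_from_configure_log`.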

RaceIsIm commented 2 months ago

Yeah, I realised that the cuDNN files are in the CUDA folder as well. I made a spare folder with the 8.9.7 files and am now compiling. If I have further errors I will post here, but consider it fixed for now. Thanks. (Edit: after changing the directories in the CMake GUI it all worked as intended.)
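When several cuDNN copies are installed side by side like this, it can help to check which version each include directory actually contains. The sketch below is one way to do that in Python. It assumes a cuDNN 8 or 9 layout in which `cudnn_version.h` defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL`; the helper names and the directories you pass in are hypothetical.

```python
import re
import sys
from pathlib import Path

# Assumption: cudnn_version.h contains lines such as "#define CUDNN_MAJOR 8".
MACRO = re.compile(
    r"^\s*#\s*define\s+CUDNN_(MAJOR|MINOR|PATCHLEVEL)\s+(\d+)", re.MULTILINE
)


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from header text, or None if incomplete."""
    found = {name: int(value) for name, value in MACRO.findall(header_text)}
    if not {"MAJOR", "MINOR", "PATCHLEVEL"} <= found.keys():
        return None
    return found["MAJOR"], found["MINOR"], found["PATCHLEVEL"]


def find_cudnn_headers(include_dirs):
    """Map each cudnn_version.h found in include_dirs to its parsed version."""
    results = {}
    for directory in include_dirs:
        header = Path(directory) / "cudnn_version.h"
        if header.is_file():
            text = header.read_text(errors="ignore")
            results[str(header)] = parse_cudnn_version(text)
    return results


if __name__ == "__main__":
    # Pass candidate include directories on the command line.
    for path, version in find_cudnn_headers(sys.argv[1:]).items():
        print(path, version)
```

If more than one directory reports a version, the one CMake picks up depends on the include path you configured, so it is worth comparing against the configure summary.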

lhlhth commented 2 months ago

Has anybody else hit LNK2019: unresolved external symbol "public: void __cdecl cv::cuda::GpuMat::upload(class cv::debug_build_guard::_InputArray const &)" when using OpenCV in a CMake project?

henryse commented 2 months ago

FYI, I have hit the same issue with FROM nvidia/cuda:12.3.1-devel-ubuntu22.04 and these flags:

-D CUDNN_VERSION=9.0 \
-D CUDA_ARCH_BIN=8.6 \
-D CUDA_ARCH_PTX=8.6 \
-D CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12 \
-D CUDNN_INCLUDE_DIR=/usr/include \
-D CUDNN_LIBRARY=/usr/lib/x86_64-linux-gnu/libcudnn.so \
-D OPENCV_DNN_CUDA=ON \

I'll try to switch to 8.9.7 and see what happens.

kramamurthi commented 2 months ago

I hit the same issue on Windows with CUDA 12.3 and cuDNN 9.0, using the Visual Studio 16 2019 generator.

jiapei100 commented 2 months ago

Same issue here: https://github.com/opencv/opencv/issues/25192

Ambarish-Ombrulla commented 2 months ago

> I hit the same issue on Windows with CUDA 12.3 and cuDNN 9.0, using the Visual Studio 16 2019 generator.

Have you solved the issue? If you have, please mention what you did.

kramamurthi commented 2 months ago

> > I hit the same issue on Windows with CUDA 12.3 and cuDNN 9.0, using the Visual Studio 16 2019 generator.
>
> Have you solved the issue? If you have, please mention what you did.

I went down to cuDNN 8.9.7 and that fixed this issue for me.

Ambarish-Ombrulla commented 2 months ago

> > > I hit the same issue on Windows with CUDA 12.3 and cuDNN 9.0, using the Visual Studio 16 2019 generator.
> >
> > Have you solved the issue? If you have, please mention what you did.
>
> I went down to cuDNN 8.9.7 and that fixed this issue for me.

Which OpenCV version did you use, and which flags did you enable?

cudawarped commented 2 months ago

> Which OpenCV version did you use, and which flags did you enable?

Use the latest version of OpenCV with CUDA 12.3 and cuDNN 8.9.7 (currently neither CUDA 12.4 nor cuDNN 9.0 is supported by OpenCV).

For flags see https://cudawarped.github.io/opencv-experiments/qmd/opencv_cuda_python_windows.html#building-opencv-with-cmake

mitchmahan commented 1 month ago

Seeing a similar issue following the directions on http://jamesbowley.co.uk/qmd/opencv_cuda_python_windows.html#building-opencv-with-cmake.

CUDA 12.3 and CUDNN 8.9.7

The errors below all refer to "recurrent_layers.cpp.obj".

My build gets 90% done and then fails.

[3138/3698] Linking CXX shared library bin\Release\opencv_world490.dll
FAILED: bin/Release/opencv_world490.dll lib/Release/opencv_world490.lib
C:\WINDOWS\system32\cmd.exe /C "cd . && "C:\Program Files\CMake\bin\cmake.exe" -E vs_link_dll --intdir=modules\world\CMakeFiles\opencv_world.dir\Release --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100226~1.0\x64\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100226~1.0\x64\mt.exe --manifests -- C:\PROGRA~1\MICROS~2\2022\COMMUN~1\VC\Tools\MSVC\1439~1.335\bin\Hostx64\x64\link.exe /nologo @CMakeFiles\opencv_world.Release.rsp /out:bin\Release\opencv_world490.dll /implib:lib\Release\opencv_world490.lib /pdb:bin\Release\opencv_world490.pdb /dll /version:4.9 /machine:x64 /INCREMENTAL:NO /NODEFAULTLIB:libc /DEBUG && cd ."
LINK: command "C:\PROGRA~1\MICROS~2\2022\COMMUN~1\VC\Tools\MSVC\1439~1.335\bin\Hostx64\x64\link.exe /nologo @CMakeFiles\opencv_world.Release.rsp /out:bin\Release\opencv_world490.dll /implib:lib\Release\opencv_world490.lib /pdb:bin\Release\opencv_world490.pdb /dll /version:4.9 /machine:x64 /INCREMENTAL:NO /NODEFAULTLIB:libc /DEBUG /MANIFEST:EMBED,ID=2" failed (exit code 1120) with the following output:
   Creating library lib\Release\opencv_world490.lib and object lib\Release\opencv_world490.exp
LINK : warning LNK4098: defaultlib 'LIBCMT' conflicts with use of other libs; use /NODEFAULTLIB:library
recurrent_layers.cpp.obj : error LNK2019: unresolved external symbol cudnnSetRNNDescriptor_v6 referenced in function "public: __cdecl cv::dnn::cuda4dnn::csl::cudnn::RNNDescriptor<float>::RNNDescriptor<float>(class cv::dnn::cuda4dnn::csl::cudnn::Handle const &,enum cv::dnn::cuda4dnn::csl::cudnn::RNNDescriptor<float>::RNNMode,int,int,bool,class cv::dnn::cuda4dnn::csl::cudnn::DropoutDescriptor const &)" (??0?$RNNDescriptor@M@cudnn@csl@cuda4dnn@dnn@cv@@QEAA@AEBVHandle@12345@W4RNNMode@012345@HH_NAEBVDropoutDescriptor@12345@@Z)
recurrent_layers.cpp.obj : error LNK2019: unresolved external symbol cudnnGetRNNWorkspaceSize referenced in function "unsigned __int64 __cdecl cv::dnn::cuda4dnn::csl::cudnn::getRNNWorkspaceSize<float>(class cv::dnn::cuda4dnn::csl::cudnn::Handle const &,class cv::dnn::cuda4dnn::csl::cudnn::RNNDescriptor<float> const &,int,class cv::dnn::cuda4dnn::csl::cudnn::TensorDescriptorsArray<float> const &)" (??$getRNNWorkspaceSize@M@cudnn@csl@cuda4dnn@dnn@cv@@YA_KAEBVHandle@01234@AEBV?$RNNDescriptor@M@01234@HAEBV?$TensorDescriptorsArray@M@01234@@Z)
recurrent_layers.cpp.obj : error LNK2019: unresolved external symbol cudnnRNNForwardInference referenced in function "void __cdecl cv::dnn::cuda4dnn::csl::cudnn::LSTMForward<float>(class cv::dnn::cuda4dnn::csl::cudnn::Handle const &,class cv::dnn::cuda4dnn::csl::cudnn::RNNDescriptor<float> const &,class cv::dnn::cuda4dnn::csl::cudnn::FilterDescriptor<float> const &,class cv::dnn::cuda4dnn::csl::DevicePtr<float const >,class cv::dnn::cuda4dnn::csl::cudnn::TensorDescriptorsArray<float> const &,class cv::dnn::cuda4dnn::csl::DevicePtr<float const >,class cv::dnn::cuda4dnn::csl::cudnn::TensorDescriptor<float> const &,class cv::dnn::cuda4dnn::csl::DevicePtr<float const >,class cv::dnn::cuda4dnn::csl::cudnn::TensorDescriptor<float> const &,class cv::dnn::cuda4dnn::csl::DevicePtr<float const >,int,class cv::dnn::cuda4dnn::csl::cudnn::TensorDescriptorsArray<float> const &,class cv::dnn::cuda4dnn::csl::DevicePtr<float>,class cv::dnn::cuda4dnn::csl::DevicePtr<float>,class cv::dnn::cuda4dnn::csl::WorkspaceInstance)" (??$LSTMForward@M@cudnn@csl@cuda4dnn@dnn@cv@@YAXAEBVHandle@01234@AEBV?$RNNDescriptor@M@01234@AEBV?$FilterDescriptor@M@01234@V?$DevicePtr@$$CBM@1234@AEBV?$TensorDescriptorsArray@M@01234@3AEBV?$TensorDescriptor@M@01234@353H4V?$DevicePtr@M@1234@6VWorkspaceInstance@1234@@Z)
bin\Release\opencv_world490.dll : fatal error LNK1120: 3 unresolved externals
ninja: build stopped: subcommand failed.

cudawarped commented 1 month ago

> and object lib\Release\opencv_world490.exp LINK : warning LNK4098: defaultlib 'LIBCMT' conflicts with use of other libs; use /NODEFAULTLIB:library recurrent_layers.cpp.obj : error LNK2019: unresolved external symbol cudnnSetRNNDescriptor_v6

The error looks to be the same. I suspect you need to clean your build directory after attempting to build with cuDNN 9.0. Check the NVIDIA section of the CMake configure output to confirm you are really using cuDNN 8.9.7, e.g.

> --   NVIDIA CUDA:                   YES (ver 12.3, CUFFT CUBLAS NVCUVID NVCUVENC)
> --     NVIDIA GPU arch:             50 52 60 61 70 75 80 86 89 90
> --     NVIDIA PTX archs:            90
> --
> --   cuDNN:                         YES (ver 8.9.7)
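If the summary still reports the wrong version after switching, stale entries in the build directory's `CMakeCache.txt` are one possible cause. The Python sketch below lists every cache entry whose name mentions cuDNN so you can see which paths are stored. It relies on the `NAME:TYPE=VALUE` layout of CMake cache files; `CUDNN_INCLUDE_DIR` and `CUDNN_LIBRARY` are the variable names used earlier in this thread, while the helper name and the default cache path are hypothetical.

```python
import sys
from pathlib import Path


def cudnn_cache_entries(cache_text):
    """Return {name: value} for CMake cache entries whose name mentions CUDNN."""
    entries = {}
    for raw_line in cache_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and the "#" / "//" comment lines CMake writes.
        if not line or line.startswith(("#", "//")):
            continue
        name_and_type, separator, value = line.partition("=")
        if not separator:
            continue
        # Entries look like NAME:TYPE=VALUE; keep only the NAME part.
        name = name_and_type.split(":", 1)[0]
        if "CUDNN" in name.upper():
            entries[name] = value
    return entries


if __name__ == "__main__":
    # Hypothetical default: run from inside the build directory.
    cache_path = Path(sys.argv[1] if len(sys.argv) > 1 else "CMakeCache.txt")
    if cache_path.is_file():
        found = cudnn_cache_entries(cache_path.read_text(errors="ignore"))
        for name, value in sorted(found.items()):
            print(f"{name} = {value}")
    else:
        print(f"{cache_path} not found")
```

If any listed path still points at a cuDNN 9.0 directory, deleting the build directory (or at least the cache) and configuring again is the safer option.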

ZelboK commented 1 month ago

Out of boredom and curiosity I tried to update the code on my own to accommodate the breaking changes. For now it seems to build with cuDNN 9; I just need to update it to be compatible with CUDA 12.4 as well.

I will push a PR when I'm done. I'm guessing this will need to be for OpenCV 5? @cudawarped This would be my first contribution to OpenCV.

cudawarped commented 1 month ago

@ZelboK That's great. If you submit your PR on top of the 4.x branch, the 5.x branch will get manually updated in due time.

josyulavt commented 3 weeks ago

Can confirm downgrading to cuDNN 8.9.7 worked for me (CUDA 11.8, OpenCV 4.9).

johnnynunez commented 3 weeks ago

Yes, in my case it is compatible with CUDA 12.2 and cuDNN 8.9. cuDNN 9 is the problem.

jiapei100 commented 2 weeks ago

Hi, @johnnynunez @josyulavt

Didn't you run into the following issue? https://github.com/opencv/opencv_contrib/issues/3728

johnnynunez commented 2 weeks ago

> Hi, @johnnynunez @josyulavt
>
> Didn't you run into the following issue? opencv/opencv_contrib#3728

Yes, I had this problem.

jiapei100 commented 2 weeks ago

@johnnynunez

I just tried the solution provided in https://github.com/opencv/opencv_contrib/issues/3690, that is:

template <int N, typename... P, typename... R, class... Op>
__device__ __forceinline__ void blockReduce(const tuple<P...>& smem,
                                            const tuple<R...>& val,
                                            uint tid,
                                            const tuple<Op...>& op)
{
    block_reduce_detail::Dispatcher<N>::reductor::template reduce<
        const tuple<P...>&,
        const tuple<R...>&,
        const tuple<Op...>&>(smem, val, tid, op);
}

But the problem persists. Did you solve it?

ZelboK commented 2 weeks ago

I have a PR up for this but it seems like one of the builds broke. I don't have time to fix this for a while, but I am pretty sure the Thrust version shipped with CUDA 12.4 has some problems when it comes to tuples.

If you're getting errors like

: error: incomplete type is not allowed
      static_assert((VecTraits<DstType>::cn == tuple_size<SrcPtrTuple>::value), "" " " "VecTraits<DstType>::cn == tuple_size<SrcPtrTuple>::value");

then be aware that there was a bug in calculating the fake tuple_size. That is fixed on CCCL's main branch, however.

ZelboK commented 2 weeks ago

@miscco Could you comment please?

jiapei100 commented 2 weeks ago

@ZelboK Where is your PR? Let me try...

josyulavt commented 2 weeks ago

> Hi, @johnnynunez @josyulavt
>
> Didn't you run into the following issue? opencv/opencv_contrib#3728

Nope. I was using a lower version of OpenCV (4.9.0) and you are using OpenCV 5.x; that could have changed a few things.

jiapei100 commented 2 weeks ago

@josyulavt I actually tried both 4.9 and 5.0 ... Both have similar issues...

josyulavt commented 2 weeks ago

> @josyulavt I actually tried both 4.9 and 5.0 ... Both have similar issues...

What are your CUDA and cuDNN versions?

jiapei100 commented 2 weeks ago

@josyulavt

CUDA: 12.4
cuDNN: 8.9.7 (I was using 9.1.0, but have now downgraded to 8.9.7)

josyulavt commented 2 weeks ago

> cuDNN 8.9.7 worked for me (CUDA 11.8, OpenCV 4.9)

From the comments here, it looks like CUDA 12.2 would work; however, I used cuDNN 8.9.7, CUDA 11.8 and OpenCV 4.9 for my setup, which is compatible with ONNX too. Good luck!

johnnynunez commented 1 week ago

Is anyone working on CUDA 12.4 Update 1 and cuDNN 9.1.1?
