mdegans / nano_build_opencv

Build OpenCV on Nvidia Jetson Nano
MIT License

/tmp/build_opencv/opencv_contrib/ is not available in docker image #60

Open kukirokuk opened 3 years ago

kukirokuk commented 3 years ago

I am trying to use the CUDA cv2.cuda_CascadeClassifier. My application is based on your docker image here: https://hub.docker.com/r/mdegans/tegra-opencv, but I receive this error:

cv2.error: OpenCV(4.5.1) /tmp/build_opencv/opencv_contrib/modules/cudaobjdetect/src/cascadeclassifier.cpp:155: error: (-217:Gpu API call) NCV Assertion Failed: NcvStat=4, file=/tmp/build_opencv/opencv_contrib/modules/cudalegacy/src/cuda/NCVHaarObjectDetection.cu, line=2363 in function 'NCVDebugOutputHandler'

When I check the /tmp/ directory inside this docker image, it's empty. Can that be the reason why the app is failing?
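For context, the call that triggers this is roughly the following (a minimal sketch; the cascade XML and image paths are placeholders, not the actual files):

```python
import cv2

# Placeholder cascade and image paths, for illustration only.
cascade = cv2.cuda_CascadeClassifier.create("haarcascade_frontalface_default.xml")

gray = cv2.imread("frame.jpg", cv2.IMREAD_GRAYSCALE)
gpu_gray = cv2.cuda_GpuMat()
gpu_gray.upload(gray)

# The NCV assertion above is raised from inside detectMultiScale, which runs on the GPU.
objects_gpu = cascade.detectMultiScale(gpu_gray)
rects = cascade.convert(objects_gpu)
print(rects)
```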

kukirokuk commented 3 years ago

I have installed OpenCV via the build_opencv.sh script, but the tests fail. What could be wrong?

mdegans commented 3 years ago

I have installed OpenCV via the build_opencv.sh script, but the tests fail. What could be wrong?

It's related to #43

Test data is missing. I will address it in the next release so that the test functionality works.

mdegans commented 3 years ago

I am trying to use the CUDA cv2.cuda_CascadeClassifier. My application is based on your docker image here: https://hub.docker.com/r/mdegans/tegra-opencv, but I receive this error:

cv2.error: OpenCV(4.5.1) /tmp/build_opencv/opencv_contrib/modules/cudaobjdetect/src/cascadeclassifier.cpp:155: error: (-217:Gpu API call) NCV Assertion Failed: NcvStat=4, file=/tmp/build_opencv/opencv_contrib/modules/cudalegacy/src/cuda/NCVHaarObjectDetection.cu, line=2363 in function 'NCVDebugOutputHandler'

When I check the /tmp/ directory inside this docker image, it's empty.

Anything not strictly needed at runtime is deleted, including any build files. It's by design. When I fix #43 I will address this as well so tests can be run.

Can that be the reason why the app is failing?

Nope. The path just says where the error occurred in the source; if the build tree were still around you could check the .cpp there, but you can also check that file on GitHub.

Possibly the container does not have permission to access the GPU. What's your docker run command?

kukirokuk commented 3 years ago

Test data is missing. I will address it in the next release so that the test functionality works.

100% of my tests failed. Is that OK, or should only some of them fail?

Possibly the container does not have permission to access the GPU. What's your docker run command?

I run my app with docker-compose, and there I have runtime: nvidia.

When I try to run ./deviceQuery (copied into the container from the host directory /usr/local/cuda/samples/), the result is:

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL
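A quick way to double-check GPU visibility from OpenCV itself, without relying on the CUDA samples, is something like this (minimal sketch):

```python
import cv2

# If this prints 0, the OpenCV build inside the container cannot see any CUDA device.
print(cv2.cuda.getCudaEnabledDeviceCount())
```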

kukirokuk commented 3 years ago

Should there be cuda 10_2 instead of 10_0 in this path? https://github.com/mdegans/nano_build_opencv/blob/docker/build_opencv.sh#L49

mdegans commented 3 years ago

Should there be cuda 10_2 instead of 10_0 in this path? https://github.com/mdegans/nano_build_opencv/blob/docker/build_opencv.sh#L49

Yup, that's a bug and could prevent building something like pycuda. Thanks for spotting that.

mdegans commented 3 years ago

Ack, looks like I edited your comment instead of hitting quote reply. Sorry, not much sleep last night.

I run my app with docker-compose, and there I have runtime: nvidia.

I haven't tried compose, but if the options are the same as with docker run it should work fine. Try the example launch like this and let me know if it works:

$ sudo docker run --user $(id -u):$(cut -d: -f3 < <(getent group video)) --runtime nvidia -it --rm mdegans/tegra-opencv:latest
[sudo] password for anzu:
I have no name!@3aa6941ec9c5:/usr/local/src/build_opencv$ python3
Python 3.6.9 (default, Jan 26 2021, 15:33:00)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2.cuda.printCudaDeviceInfo(0)
*** CUDA Device Query (Runtime API) version (CUDART static linking) ***

Device count: 1

Device 0: "NVIDIA Tegra X1"
  CUDA Driver Version / Runtime Version          10.20 / 10.20
  CUDA Capability Major/Minor version number:    5.3
  Total amount of global memory:                 3956 MBytes (4148314112 bytes)
  GPU Clock Speed:                               0.92 GHz
  Max Texture Dimension Size (x,y,z)             1D=(65536), 2D=(65536,65536), 3D=(4096,4096,4096)
  Max Layered Texture Size (dim) x layers        1D=(16384) x 2048, 2D=(16384,16384) x 2048
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per block:           1024
  Maximum sizes of each dimension of a block:    1024 x 1024 x 64
  Maximum sizes of each dimension of a grid:     2147483647 x 65535 x 65535
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and execution:                 Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            Yes
  Support host page-locked memory mapping:       Yes
  Concurrent kernel execution:                   Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support enabled:                No
  Device is using TCC driver mode:               No
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Bus ID / PCI location ID:           0 / 0
  Compute Mode:
      Default (multiple host threads can use ::cudaSetDevice() with device simultaneously)

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version  = 10.20, CUDA Runtime Version = 10.20, NumDevs = 1

--user should not be required; it's just a suggestion that you not run any containers as root. Root inside the container is the same as root outside the container. In this case, the uid could be some user dedicated to storing files for this container if you are using volumes/bind mounts.

kukirokuk commented 3 years ago

I haven't tried compose, but if the options are the same as with docker run it should work fine. Try the example launch like this and let me know if it works:

I have the same result as you.

Device count: 1

Device 0: "NVIDIA Tegra X1"
  CUDA Driver Version / Runtime Version          10.20 / 10.20
  CUDA Capability Major/Minor version number:    5.3
  Total amount of global memory:                 3964 MBytes (4156694528 bytes)
  GPU Clock Speed:                               0.92 GHz
  Max Texture Dimension Size (x,y,z)             1D=(65536), 2D=(65536,65536), 3D=(4096,4096,4096)
  Max Layered Texture Size (dim) x layers        1D=(16384) x 2048, 2D=(16384,16384) x 2048
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per block:           1024
  Maximum sizes of each dimension of a block:    1024 x 1024 x 64
  Maximum sizes of each dimension of a grid:     2147483647 x 65535 x 65535
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and execution:                 Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            Yes
  Support host page-locked memory mapping:       Yes
  Concurrent kernel execution:                   Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support enabled:                No
  Device is using TCC driver mode:               No
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Bus ID / PCI location ID:           0 / 0
  Compute Mode:
      Default (multiple host threads can use ::cudaSetDevice() with device simultaneously)

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.20, CUDA Runtime Version = 10.20, NumDevs = 1

The most interesting part is that the CUDA haarcascade smile detector works, but others like the face and eye detectors don't:

cv2.error: OpenCV(4.5.1) /tmp/build_opencv/opencv_contrib/modules/cudaobjdetect/src/cascadeclassifier.cpp:155: error: (-217:Gpu API call) NCV Assertion Failed: cudaError_t=702, file=/tmp/build_opencv/opencv_contrib/modules/cudalegacy/src/cuda/NCVHaarObjectDetection.cu, line=1157 in function 'NCVDebugOutputHandler'

Looks like the error is in this line:

https://github.com/opencv/opencv_contrib/blob/master/modules/cudaobjdetect/src/cascadeclassifier.cpp#L155

I don't know much C++; maybe you can tell me what the problem is?
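To illustrate, this is roughly how the cascades compare (a minimal sketch; the cascade file names and test image are placeholders):

```python
import cv2

# Placeholder cascade files and test image, for illustration only.
cascades = {
    "smile": "haarcascade_smile.xml",
    "face": "haarcascade_frontalface_default.xml",
    "eye": "haarcascade_eye.xml",
}

gray = cv2.imread("frame.jpg", cv2.IMREAD_GRAYSCALE)
gpu_gray = cv2.cuda_GpuMat()
gpu_gray.upload(gray)

# Only the "smile" cascade completes; the others raise the NCV assertion above.
for name, path in cascades.items():
    try:
        clf = cv2.cuda_CascadeClassifier.create(path)
        objects_gpu = clf.detectMultiScale(gpu_gray)
        rects = clf.convert(objects_gpu)
        print(name, "ok:", None if rects is None else rects.shape)
    except cv2.error as err:
        print(name, "failed:", err)
```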

mdegans commented 3 years ago

@kukirokuk

cudaError_t 702 means it timed out. Have you tried the same version outside docker?

kukirokuk commented 3 years ago

@kukirokuk

cudaError_t 702 means it timed out. Have you tried the same version outside docker?

Yep, same error. Only the smile classifier works properly.