opencv / opencv

Open Source Computer Vision Library
https://opencv.org
Apache License 2.0
78.56k stars 55.77k forks source link

OpenCV CUDA gives low FPS #21548

Closed kadirtunc closed 2 years ago

kadirtunc commented 2 years ago

I am using very basic yolov3 object detection algorithm. Here are my files. It is about traffic signs. You can test it with stop sign etc. (Not all signs are included) https://www.mediafire.com/file/uqj6mw5tbauagq2/yolov3_object_detection.zip/file Here are my files.

Here are my opencv getBuildInformation output `General configuration for OpenCV 4.5.2 ===================================== Version control: unknown

Extra modules: Location (extra): /home/rota/Downloads/opencv_contrib-4.5.2/modules Version control (extra): unknown

Platform: Timestamp: 2022-01-31T20:33:31Z Host: Linux 5.13.0-27-generic x86_64 CMake: 3.16.3 CMake generator: Unix Makefiles CMake build tool: /usr/bin/make Configuration: RELEASE

CPU/HW features: Baseline: SSE SSE2 SSE3 requested: SSE3 Dispatched code generation: SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX requested: SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX SSE4_1 (17 files): + SSSE3 SSE4_1 SSE4_2 (2 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 (1 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX AVX (5 files): + SSSE3 SSE4_1 POPCNT SSE4_2 AVX AVX2 (31 files): + SSSE3 Sis:pr is:open SE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX512_SKX (7 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX

C/C++: Built as dynamic libs?: YES C++ standard: 11 C++ Compiler: /usr/bin/c++ (ver 9.3.0) C++ flags (Release): -fsigned-char -ffast-math -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG -DNDEBUG C++ flags (Debug): -fsigned-char -ffast-math -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -fvisibility-inlines-hidden -g -O0 -DDEBUG -D_DEBUG C Compiler: /usr/bin/cc C flags (Release): -fsigned-char -ffast-math -W -Wall -Werror=return-type -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -DNDEBUG -DNDEBUG C flags (Debug): -fsigned-char -ffast-math -W -Wall -Werror=return-type -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -g -O0 -DDEBUG -D_DEBUG Linker flags (Release): -Wl,--exclude-libs,libippicv.a -Wl,--exclude-libs,libippiw.a -Wl,--gc-sections -Wl,--as-needed
Linker flags (Debug): -Wl,--exclude-libs,libippicv.a -Wl,--exclude-libs,libippiw.a -Wl,--gc-sections -Wl,--as-needed
ccache: NO Precompiled headers: NO Extra dependencies: m pthread cudart_static dl rt nppc nppial nppicc nppidei nppif nppig nppim nppist nppisu nppitc npps cublas cudnn cufft -L/usr/local/cuda-11.2/lib64 -L/usr/lib/x86_64-linux-gnu 3rdparty dependencies:

OpenCV modules: To be built: aruco bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev datasets dnn dnn_objdetect dnn_superres dpm face features2d flann freetype fuzzy gapi hfs highgui img_hash imgcodecs imgproc intensity_transform line_descriptor mcc ml objdetect optflow phase_unwrapping photo plot python3 quality rapid reg rgbd saliency shape stereo stitching structured_light superres surface_matching text tracking ts video videoio videostab wechat_qrcode xfeatures2d ximgproc xobjdetect xphoto Disabled: cudacodec world Disabled by dependency: - Unavailable: alphamat cnn_3dobj cvv hdf java julia matlab ovis python2 sfm viz Applications: tests perf_tests apps Documentation: NO Non-free algorithms: YES

GUI: GTK+: YES (ver 3.24.20) GThread : YES (ver 2.64.6) GtkGlExt: NO OpenGL support: NO VTK support: NO

Media I/O: ZLib: /usr/lib/x86_64-linux-gnu/libz.so (ver 1.2.11) JPEG: /usr/lib/x86_64-linux-gnu/libjpeg.so (ver 80) WEBP: build (ver encoder: 0x020f) PNG: /usr/lib/x86_64-linux-gnu/libpng.so (ver 1.6.37) TIFF: /usr/lib/x86_64-linux-gnu/libtiff.so (ver 42 / 4.1.0) JPEG 2000: build (ver 2.4.0) OpenEXR: build (ver 2.3.0) HDR: YES SUNRASTER: YES PXM: YES PFM: YES

Video I/O: DC1394: NO FFMPEG: NO avcodec: NO avformat: NO avutil: NO swscale: NO avresample: NO GStreamer: YES (1.16.2) v4l/v4l2: YES (linux/videodev2.h)

Parallel framework: pthreads

Trace: YES (with Intel ITT)

Other third-party libraries: Intel IPP: 2020.0.0 Gold [2020.0.0] at: /home/rota/Downloads/opencv-4.5.2/build/3rdparty/ippicv/ippicv_lnx/icv Intel IPP IW: sources (2020.0.0) at: /home/rota/Downloads/opencv-4.5.2/build/3rdparty/ippicv/ippicv_lnx/iw VA: NO Lapack: NO Eigen: NO Custom HAL: NO Protobuf: build (3.5.1)

NVIDIA CUDA: YES (ver 11.2, CUFFT CUBLAS FAST_MATH) NVIDIA GPU arch: 86 NVIDIA PTX archs:

cuDNN: YES (ver 8.1.1)

OpenCL: YES (no extra features) Include path: /home/rota/Downloads/opencv-4.5.2/3rdparty/include/opencl/1.2 Link libraries: Dynamic load

Python 3: Interpreter: /usr/bin/python3 (ver 3.8.10) Libraries: /usr/lib/x86_64-linux-gnu/libpython3.8.so (ver 3.8.10) numpy: /usr/local/lib/python3.8/dist-packages/numpy/core/include (ver 1.22.1) install path: /usr/lib/python3/dist-packages/cv2/python-3.8

Python (for build): /usr/bin/python2.7

Java:
ant: NO JNI: NO Java wrappers: NO Java tests: NO

Install to: /usr/local -----------------------------------------------------------------`

kadirtunc commented 2 years ago

I solved the problem. This was very basic problem sorry for that. In the objectDetect function I ret the weight file every single time. I moved readNet part at the outside of the class, My FPS increases up to 80~ FPS.