opencv / opencv

Open Source Computer Vision Library

OpenCV CUDA gives low FPS #21548

Closed kadirtunc closed 2 years ago

kadirtunc commented 2 years ago

I am using a very basic YOLOv3 object detection algorithm to detect traffic signs. Here are my files. You can test it with a stop sign, etc. (not all signs are included).

Here is my OpenCV getBuildInformation output: `General configuration for OpenCV 4.5.2 ===================================== Version control: unknown

Extra modules: Location (extra): /home/rota/Downloads/opencv_contrib-4.5.2/modules Version control (extra): unknown

Platform: Timestamp: 2022-01-31T20:33:31Z Host: Linux 5.13.0-27-generic x86_64 CMake: 3.16.3 CMake generator: Unix Makefiles CMake build tool: /usr/bin/make Configuration: RELEASE

CPU/HW features: Baseline: SSE SSE2 SSE3 requested: SSE3 Dispatched code generation: SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX requested: SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX SSE4_1 (17 files): + SSSE3 SSE4_1 SSE4_2 (2 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 (1 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX AVX (5 files): + SSSE3 SSE4_1 POPCNT SSE4_2 AVX AVX2 (31 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX512_SKX (7 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX

C/C++: Built as dynamic libs?: YES C++ standard: 11 C++ Compiler: /usr/bin/c++ (ver 9.3.0) C++ flags (Release): -fsigned-char -ffast-math -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG -DNDEBUG C++ flags (Debug): -fsigned-char -ffast-math -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -fvisibility-inlines-hidden -g -O0 -DDEBUG -D_DEBUG C Compiler: /usr/bin/cc C flags (Release): -fsigned-char -ffast-math -W -Wall -Werror=return-type -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -DNDEBUG -DNDEBUG C flags (Debug): -fsigned-char -ffast-math -W -Wall -Werror=return-type -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self 
-Wpointer-arith -Wshadow -Wuninitialized -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -g -O0 -DDEBUG -D_DEBUG Linker flags (Release): -Wl,--exclude-libs,libippicv.a -Wl,--exclude-libs,libippiw.a -Wl,--gc-sections -Wl,--as-needed
Linker flags (Debug): -Wl,--exclude-libs,libippicv.a -Wl,--exclude-libs,libippiw.a -Wl,--gc-sections -Wl,--as-needed
ccache: NO Precompiled headers: NO Extra dependencies: m pthread cudart_static dl rt nppc nppial nppicc nppidei nppif nppig nppim nppist nppisu nppitc npps cublas cudnn cufft -L/usr/local/cuda-11.2/lib64 -L/usr/lib/x86_64-linux-gnu 3rdparty dependencies:

OpenCV modules: To be built: aruco bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev datasets dnn dnn_objdetect dnn_superres dpm face features2d flann freetype fuzzy gapi hfs highgui img_hash imgcodecs imgproc intensity_transform line_descriptor mcc ml objdetect optflow phase_unwrapping photo plot python3 quality rapid reg rgbd saliency shape stereo stitching structured_light superres surface_matching text tracking ts video videoio videostab wechat_qrcode xfeatures2d ximgproc xobjdetect xphoto Disabled: cudacodec world Disabled by dependency: - Unavailable: alphamat cnn_3dobj cvv hdf java julia matlab ovis python2 sfm viz Applications: tests perf_tests apps Documentation: NO Non-free algorithms: YES

GUI: GTK+: YES (ver 3.24.20) GThread : YES (ver 2.64.6) GtkGlExt: NO OpenGL support: NO VTK support: NO

Media I/O: ZLib: /usr/lib/x86_64-linux-gnu/ (ver 1.2.11) JPEG: /usr/lib/x86_64-linux-gnu/ (ver 80) WEBP: build (ver encoder: 0x020f) PNG: /usr/lib/x86_64-linux-gnu/ (ver 1.6.37) TIFF: /usr/lib/x86_64-linux-gnu/ (ver 42 / 4.1.0) JPEG 2000: build (ver 2.4.0) OpenEXR: build (ver 2.3.0) HDR: YES SUNRASTER: YES PXM: YES PFM: YES

Video I/O: DC1394: NO FFMPEG: NO avcodec: NO avformat: NO avutil: NO swscale: NO avresample: NO GStreamer: YES (1.16.2) v4l/v4l2: YES (linux/videodev2.h)

Parallel framework: pthreads

Trace: YES (with Intel ITT)

Other third-party libraries: Intel IPP: 2020.0.0 Gold [2020.0.0] at: /home/rota/Downloads/opencv-4.5.2/build/3rdparty/ippicv/ippicv_lnx/icv Intel IPP IW: sources (2020.0.0) at: /home/rota/Downloads/opencv-4.5.2/build/3rdparty/ippicv/ippicv_lnx/iw VA: NO Lapack: NO Eigen: NO Custom HAL: NO Protobuf: build (3.5.1)


cuDNN: YES (ver 8.1.1)

OpenCL: YES (no extra features) Include path: /home/rota/Downloads/opencv-4.5.2/3rdparty/include/opencl/1.2 Link libraries: Dynamic load

Python 3: Interpreter: /usr/bin/python3 (ver 3.8.10) Libraries: /usr/lib/x86_64-linux-gnu/ (ver 3.8.10) numpy: /usr/local/lib/python3.8/dist-packages/numpy/core/include (ver 1.22.1) install path: /usr/lib/python3/dist-packages/cv2/python-3.8

Python (for build): /usr/bin/python2.7

ant: NO JNI: NO Java wrappers: NO Java tests: NO

Install to: /usr/local -----------------------------------------------------------------`
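A dump this long is hard to scan for the GPU-related entries. As a rough sketch (not part of the original post), the helper below pulls a few named entries out of the text returned by `cv2.getBuildInformation()`. The function name `build_info_entries` and the default key list are hypothetical; in particular, the `NVIDIA CUDA` key is an assumption about how the CUDA line is labelled, since that line does not appear in the dump above.

```python
def build_info_entries(info, wanted=("NVIDIA CUDA", "cuDNN", "OpenCL")):
    """Return {key: value} for the requested entries of a build-info dump.

    `info` is the multi-line string produced by cv2.getBuildInformation().
    The key names in `wanted` are assumptions and may differ between versions.
    """
    found = {}
    for line in info.splitlines():
        # Each entry looks like "  cuDNN:        YES (ver 8.1.1)".
        key, sep, value = line.partition(":")
        if sep and key.strip() in wanted:
            found[key.strip()] = value.strip()
    return found


# Typical use, assuming an OpenCV build with Python bindings is installed:
#   import cv2
#   print(build_info_entries(cv2.getBuildInformation()))
```

If the CUDA and cuDNN entries report YES, as cuDNN does here, the build itself is probably not the cause of a low frame rate, and the application code is the next place to look.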

kadirtunc commented 2 years ago

I solved the problem. It was a very basic mistake, sorry for that. In the objectDetect function I was reading the weight file on every single call. I moved the readNet part outside of the class so that the network is loaded only once, and my FPS increased to about 80.
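For readers hitting the same symptom, here is a minimal sketch of the fix described above: load the network once, then reuse it for every frame. It is not the original poster's code. The names `load_yolo_net` and `Detector`, the injectable `load_net` parameter, the 416x416 input size, and the file names in the usage note are all assumptions made for illustration.

```python
def load_yolo_net(cfg_path, weights_path):
    """Read the network from disk and select the CUDA backend (slow, do it once)."""
    import cv2  # imported here so the module can be loaded without OpenCV installed

    net = cv2.dnn.readNet(weights_path, cfg_path)
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
    return net


class Detector:
    def __init__(self, cfg_path, weights_path, load_net=load_yolo_net):
        # The expensive readNet call happens here, once per Detector,
        # instead of inside the per-frame detection function.
        self.net = load_net(cfg_path, weights_path)

    def detect(self, frame, size=(416, 416)):
        """Run one forward pass on a BGR frame and return the raw layer outputs."""
        import cv2

        blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, size, swapRB=True, crop=False)
        self.net.setInput(blob)
        return self.net.forward(self.net.getUnconnectedOutLayersNames())
```

Usage would look roughly like `detector = Detector("yolov3.cfg", "yolov3.weights")` before the capture loop, followed by `detector.detect(frame)` for each frame. Those file names are placeholders. Passing a different `load_net` callable makes it easy to confirm that the loader runs only once.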