[cv2.dnn] Inconsistent results on ONNX model

apatsekin commented 1 year ago

System Information

OpenCV version: 4.6.0 Operating System / Platform: Ubuntu 20.04 Python 3.8.10 (default, Jun 22 2022, 20:18:18)

Detailed description

cv2.readNet().forward() produces wrong results using manually built OpenCV compared to the wheel published on PyPi - opencv-contrib-python-headless.

The output of .forward() is off by few pixels (5-10), confidence is off, sometimes significantly.

Concerns YOLOv5 (default model) converted to ONNX using their converting script.

When it produces correct results:

CPU inference with opencv-contrib-python-headless from PyPi
CUDA inference with manually built opencv-contrib-python-headless

When results are wrong:

CPU inference with manually built opencv-contrib-python-headless

So apparently my manual built is not using some BLAS/CPU library that official build is using? Still looks like a bug, given that inference does work and it does produce results that look sane (but in reality are off).

Steps to reproduce

Clone yolov5 repo
Convert model to ONNX using export.py
Run resulting yolov5s.onnx using cv2.dnn on sample image converted to png.
Inference should be on CPU using provided build (below)
Raw output from net.forward() will be wrong, while it's right with opencv binaries from pypi.

Build details:

General configuration for OpenCV 4.6.0 =====================================
  Version control:               4.6.0

  Extra modules:
    Location (extra):            /usr/src/opencv-python/opencv_contrib/modules
    Version control (extra):     4.6.0

  Platform:
    Timestamp:                   2022-12-02T17:57:18Z
    Host:                        Linux 5.15.0-1023-aws x86_64
    CMake:                       3.25.0
    CMake generator:             Ninja
    CMake build tool:            /usr/bin/ninja
    Configuration:               Release

  CPU/HW features:
    Baseline:                    SSE SSE2 SSE3
      requested:                 SSE3
    Dispatched code generation:  SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX
      requested:                 SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
      SSE4_1 (15 files):         + SSSE3 SSE4_1
      SSE4_2 (1 files):          + SSSE3 SSE4_1 POPCNT SSE4_2
      FP16 (0 files):            + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
      AVX (4 files):             + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
      AVX2 (28 files):           + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
      AVX512_SKX (4 files):      + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX

  C/C++:
    Built as dynamic libs?:      NO
    C++ standard:                11
    C++ Compiler:                /usr/bin/c++  (ver 9.4.0)
    C++ flags (Release):         -fsigned-char -ffast-math -W -Wall -Wreturn-type -Wnon-virtual-dtor -Waddress -Wsequence-point -Wformat -Wformat-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -msse3 -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG  -DNDEBUG
    C++ flags (Debug):           -fsigned-char -ffast-math -W -Wall -Wreturn-type -Wnon-virtual-dtor -Waddress -Wsequence-point -Wformat -Wformat-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -msse3 -fvisibility=hidden -fvisibility-inlines-hidden -g  -O0 -DDEBUG -D_DEBUG
    C Compiler:                  /usr/bin/cc
    C flags (Release):           -fsigned-char -ffast-math -W -Wall -Wreturn-type -Waddress -Wsequence-point -Wformat -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -msse3 -fvisibility=hidden -O3 -DNDEBUG  -DNDEBUG
    C flags (Debug):             -fsigned-char -ffast-math -W -Wall -Wreturn-type -Waddress -Wsequence-point -Wformat -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -msse3 -fvisibility=hidden -g  -O0 -DDEBUG -D_DEBUG
    Linker flags (Release):      -Wl,--exclude-libs,libippicv.a -Wl,--exclude-libs,libippiw.a   -Wl,--gc-sections -Wl,--as-needed -Wl,--no-undefined  
    Linker flags (Debug):        -Wl,--exclude-libs,libippicv.a -Wl,--exclude-libs,libippiw.a   -Wl,--gc-sections -Wl,--as-needed -Wl,--no-undefined  
    ccache:                      NO
    Precompiled headers:         NO
    Extra dependencies:          /lib/openblas-base/libopenblas.so /usr/lib/x86_64-linux-gnu/libpng.so /usr/lib/x86_64-linux-gnu/libz.so Iconv::Iconv m pthread cudart_static dl rt nppc nppial nppicc nppidei nppif nppig nppim nppist nppisu nppitc npps cublas cudnn cufft -L/usr/local/cuda/lib64 -L/usr/lib/x86_64-linux-gnu
    3rdparty dependencies:       libprotobuf ade ittnotify libjpeg-turbo libwebp libtiff libopenjp2 IlmImf ippiw ippicv

  OpenCV modules:
    To be built:                 barcode bioinspired core cudaarithm cudabgsegm cudacodec cudafilters cudaimgproc cudalegacy cudawarping cudev dnn dnn_objdetect dnn_superres fuzzy gapi hfs img_hash imgcodecs imgproc intensity_transform line_descriptor ml phase_unwrapping photo plot python3 quality reg tracking video videoio wechat_qrcode xphoto
    Disabled:                    calib3d features2d flann highgui objdetect world
    Disabled by dependency:      aruco bgsegm ccalib cudafeatures2d cudaobjdetect cudaoptflow cudastereo datasets dpm face mcc optflow rapid rgbd saliency shape stereo stitching structured_light superres surface_matching text videostab xfeatures2d ximgproc xobjdetect
    Unavailable:                 alphamat cvv freetype hdf java julia matlab ovis python2 sfm ts viz
    Applications:                -
    Documentation:               NO
    Non-free algorithms:         NO

  GUI: 
    VTK support:                 NO

  Media I/O: 
    ZLib:                        /usr/lib/x86_64-linux-gnu/libz.so (ver 1.2.11)
    JPEG:                        libjpeg-turbo (ver 2.1.2-62)
    WEBP:                        build (ver encoder: 0x020f)
    PNG:                         /usr/lib/x86_64-linux-gnu/libpng.so (ver 1.6.37)
    TIFF:                        build (ver 42 - 4.2.0)
    JPEG 2000:                   build (ver 2.4.0)
    OpenEXR:                     build (ver 2.3.0)
    HDR:                         YES
    SUNRASTER:                   YES
    PXM:                         YES
    PFM:                         YES

  Video I/O:
    DC1394:                      NO
    FFMPEG:                      NO
      avcodec:                   NO
      avformat:                  NO
      avutil:                    NO
      swscale:                   NO
      avresample:                NO
    GStreamer:                   NO
    v4l/v4l2:                    YES (linux/videodev2.h)

  Parallel framework:            pthreads

  Trace:                         YES (with Intel ITT)

  Other third-party libraries:
    Intel IPP:                   2020.0.0 Gold [2020.0.0]
           at:                   /usr/src/opencv-python/_skbuild/linux-x86_64-3.8/cmake-build/3rdparty/ippicv/ippicv_lnx/icv
    Intel IPP IW:                sources (2020.0.0)
              at:                /usr/src/opencv-python/_skbuild/linux-x86_64-3.8/cmake-build/3rdparty/ippicv/ippicv_lnx/iw
    VA:                          NO
    Lapack:                      YES (/lib/openblas-base/libopenblas.so)
    Eigen:                       NO
    Custom HAL:                  NO
    Protobuf:                    build (3.19.1)

  NVIDIA CUDA:                   YES (ver 11.6, CUFFT CUBLAS FAST_MATH)
    NVIDIA GPU arch:             35 37 50 52 60 61 70 75 80 86
    NVIDIA PTX archs:

  cuDNN:                         YES (ver 8.4.0)

  OpenCL:                        YES (no extra features)
    Include path:                /usr/src/opencv-python/opencv/3rdparty/include/opencl/1.2
    Link libraries:              Dynamic load

  Python 3:
    Interpreter:                 /usr/bin/python3 (ver 3.8.10)
    Libraries:                   /usr/lib/x86_64-linux-gnu/libpython3.8.so (ver 3.8.10)
    numpy:                       /tmp/pip-build-env-i7jl326l/overlay/lib/python3.8/site-packages/numpy/core/include (ver 1.17.3)
    install path:                python/cv2/python-3

  Python (for build):            /usr/bin/python3

  Install to:                    /usr/src/opencv-python/_skbuild/linux-x86_64-3.8/cmake-install
-----------------------------------------------------------------

Issue submission checklist

[X] I report the issue, it's not a question
[X] I checked the problem with documentation, FAQ, open issues, forum.opencv.org, Stack Overflow, etc and have not found any solution
[X] I updated to the latest OpenCV version and the issue is still there
[X] There is reproducer code and related data files (videos, images, onnx, etc)

apatsekin commented 1 year ago

As I figured, PyPi's wheel has all 3rd party libs (including openblas) stuffed in, they are unpacked to opencv_contrib_python_headless.libs during pip install. I'm not getting the same result when following this build instruction. The resulting wheel doesn't unpack opencv_contrib_python_headless.libs when installed. I guess that's where this issue comes from - discrepancy in usage of BLAS libs or sth like that.

skvark commented 1 year ago

Yes, the PyPI wheels are self-contained and include all the dependencies. This is achieved with auditwheel tool which modifies the wheel after the build. It will bundle the dependencies into the wheel package.

As you wrote, your local build is most likely using different versions of 3rd party libraries. To replicate the CI environment that is used to build the PyPI wheels, you can use the Dockerfiles provided here: https://github.com/opencv/opencv-python/tree/4.x/docker/manylinux2014

opencv / opencv-python