opencv / opencv

Open Source Computer Vision Library
https://opencv.org
Apache License 2.0

inference with onnx and opencv gives different results #24044

Closed LaurentBerger closed 1 year ago

LaurentBerger commented 1 year ago

System Information

General configuration for OpenCV 4.8.0-dev =====================================
  Version control:               4.8.0-64-g1f7025f028-dirty

  Extra modules:
    Location (extra):            C:/lib/opencv_contrib/modules
    Version control (extra):     4.8.0-4-gd89b2b9b

  Platform:
    Timestamp:                   2023-06-29T16:39:13Z
    Host:                        Windows 10.0.22621 AMD64
    CMake:                       3.26.1
    CMake generator:             Visual Studio 17 2022
    CMake build tool:            C:/Program Files/Microsoft Visual Studio/2022/Community/MSBuild/Current/Bin/amd64/MSBuild.exe
    MSVC:                        1935
    Configuration:               Debug Release

  CPU/HW features:
    Baseline:                    SSE SSE2 SSE3
      requested:                 SSE3
    Dispatched code generation:  SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX
      requested:                 SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
      SSE4_1 (18 files):         + SSSE3 SSE4_1
      SSE4_2 (2 files):          + SSSE3 SSE4_1 POPCNT SSE4_2
      FP16 (1 files):            + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
      AVX (8 files):             + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
      AVX2 (37 files):           + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
      AVX512_SKX (8 files):      + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX

  C/C++:
    Built as dynamic libs?:      YES
    C++ standard:                11
    C++ Compiler:                C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe  (ver 19.35.32215.0)
    C++ flags (Release):         /DWIN32 /D_WINDOWS /W4 /GR  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise     /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /MP  /MD /O2 /Ob2 /DNDEBUG
    C++ flags (Debug):           /DWIN32 /D_WINDOWS /W4 /GR  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise     /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /MP  /MDd /Zi /Ob0 /Od /RTC1
    C Compiler:                  C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe
    C flags (Release):           /DWIN32 /D_WINDOWS /W3  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise     /MP   /MD /O2 /Ob2 /DNDEBUG
    C flags (Debug):             /DWIN32 /D_WINDOWS /W3  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise     /MP /MDd /Zi /Ob0 /Od /RTC1
    Linker flags (Release):      /machine:x64  /INCREMENTAL:NO
    Linker flags (Debug):        /machine:x64  /debug /INCREMENTAL
    ccache:                      NO
    Precompiled headers:         YES
    Extra dependencies:          cudart_static.lib nppc.lib nppial.lib nppicc.lib nppidei.lib nppif.lib nppig.lib nppim.lib nppist.lib nppisu.lib nppitc.lib npps.lib cublas.lib cudnn.lib cufft.lib -LIBPATH:C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.1/lib/x64
    3rdparty dependencies:

  OpenCV modules:
    To be built:                 alphamat aruco bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev datasets dnn dnn_objdetect dnn_superres dpm face features2d flann fuzzy gapi hfs highgui img_hash imgcodecs imgproc intensity_transform java line_descriptor mcc ml objdetect optflow phase_unwrapping photo plot python3 quality rapid reg rgbd saliency sfm shape stereo stitching structured_light superres surface_matching text tracking ts video videoio videostab viz wechat_qrcode xfeatures2d ximgproc xobjdetect xphoto
    Disabled:                    world
    Disabled by dependency:      -
    Unavailable:                 cvv freetype hdf julia matlab ovis python2
    Applications:                tests perf_tests examples apps
    Documentation:               doxygen python javadoc
    Non-free algorithms:         YES

  Windows RT support:            NO

  GUI:                           WIN32UI
    Win32 UI:                    YES
    OpenGL support:              YES (opengl32 glu32)
    VTK support:                 YES (ver 9.2.5)

  Media I/O:
    ZLib:                        optimized C:/install/zlib/lib/zlib.lib debug C:/install/zlib/lib/zlibd.lib (ver 1.2.13)
    JPEG:                        build-libjpeg-turbo (ver 2.1.3-62)
      SIMD Support Request:      YES
      SIMD Support:              NO
    WEBP:                        build (ver encoder: 0x020f)
    PNG:                         optimized C:/install/libpng/lib/libpng16.lib debug C:/install/libpng/lib/libpng16d.lib (ver 1.6.40)
    TIFF:                        build (ver 42 - 4.2.0)
    JPEG 2000:                   build (ver 2.5.0)
    OpenEXR:                     build (ver 2.3.0)
    HDR:                         YES
    SUNRASTER:                   YES
    PXM:                         YES
    PFM:                         YES

  Video I/O:
    DC1394:                      NO
    FFMPEG:                      YES (prebuilt binaries)
      avcodec:                   YES (58.134.100)
      avformat:                  YES (58.76.100)
      avutil:                    YES (56.70.100)
      swscale:                   YES (5.9.100)
      avresample:                YES (4.0.0)
    GStreamer:                   NO
    DirectShow:                  YES
    Media Foundation:            YES
      DXVA:                      YES

  Parallel framework:            Concurrency

  Other third-party libraries:
    Intel IPP:                   2021.8 [2021.8.0]
           at:                   C:/lib/build/opencv/3rdparty/ippicv/ippicv_win/icv
    Intel IPP IW:                sources (2021.8.0)
              at:                C:/lib/build/opencv/3rdparty/ippicv/ippicv_win/iw
    Lapack:                      YES (C:/Program Files (x86)/Intel/oneAPI/mkl/2023.0.0/lib/intel64/mkl_intel_lp64.lib C:/Program Files (x86)/Intel/oneAPI/mkl/2023.0.0/lib/intel64/mkl_sequential.lib C:/Program Files (x86)/Intel/oneAPI/mkl/2023.0.0/lib/intel64/mkl_core.lib)
    OpenVINO:                    YES (2022.3.0)
    Eigen:                       YES (ver ..)
    Custom HAL:                  NO
    Protobuf:                    build (3.19.1)
    Flatbuffers:                 builtin/3rdparty (23.5.9)
  NVIDIA CUDA:                   YES (ver 12.1, CUFFT CUBLAS)
    NVIDIA GPU arch:             86
    NVIDIA PTX archs:

  cuDNN:                         YES (ver 8.8.0)

  OpenCL:                        YES (NVD3D11)
    Include path:                C:/lib/opencv/3rdparty/include/opencl/1.2
    Link libraries:              Dynamic load

  Python 3:
    Interpreter:                 C:/Program Files/Python310/python.exe (ver 3.10.10)
    Libraries:                   optimized C:/Program Files/Python310/libs/python310.lib debug C:/Program Files/Python310/libs/python310_d.lib (ver 3.10.10)
    numpy:                       C:/Users/laurent/AppData/Roaming/Python/Python310/site-packages/numpy/core/include (ver 1.23.5)
    install path:                C:/Users/laurent/AppData/Roaming/Python/Python310/site-packages/cv2/python-3.10

  Python (for build):            C:/Program Files/Python310/python.exe

  Java:
    ant:                         C:/apache-ant-1.10.13/bin/ant.bat (ver 1.10.13)
    Java:                        NO
    JNI:                         C:/Program Files/Java/jdk-19/include C:/Program Files/Java/jdk-19/include/win32 C:/Program Files/Java/jdk-19/include
    Java wrappers:               YES (ANT)
    Java tests:                  YES

  Install to:                    C:/install/opencv
-----------------------------------------------------------------

Detailed description

PyTorch and ONNX Runtime give the same results; OpenCV does not give correct results.

Results

ONNX RESULT
[[4.446731  4.4490666 4.46463   4.4546375 4.4510665 4.456948 ]
 [4.4421244 4.4491835 4.4703193 4.460532  4.4576974 4.462741 ]
 [4.440228  4.4505563 4.4774194 4.4691973 4.4677935 4.470848 ]
 [4.435331  4.4522853 4.484742  4.480783  4.480365  4.4803877]
 [4.4323177 4.4522853 4.4912124 4.4923916 4.4934096 4.4903293]]
OPENCV
[[5.1145616 5.1145616 5.1145616 5.1145616 5.1145616 5.1145616]
 [5.115294  5.115294  5.115294  5.115294  5.115294  5.115294 ]
 [5.116197  5.116197  5.116197  5.116197  5.116197  5.116197 ]
 [5.1162567 5.1162567 5.1162567 5.1162567 5.1162567 5.1162567]
 [5.115356  5.115356  5.115356  5.115356  5.115356  5.115356 ]]
0.84839463
7.712089

Steps to reproduce

The simplified model can be downloaded here; the image is here.

import onnx
import onnxruntime as rt
import numpy as np
import cv2 as cv

onnx_name = "AdaBins_kitti_sim.onnx"

image = cv.imread(cv.samples.findFile("classroom__rgb_00283.jpg"))
# DATA for ONNX and OPENCV 
blob = np.transpose(image.astype(np.float32), [2, 0, 1])/255
blob = blob.reshape((1,3, 480, 640))

# ONNX inference
sess = rt.InferenceSession(onnx_name) 
input_name = sess.get_inputs()[0].name
output_name = sess.get_outputs()[0].name
predonnx = sess.run([sess.get_outputs()[0].name, 
                 sess.get_outputs()[1].name,
               ], 
                 {input_name: blob})
print("ONNX RESULT")
disparity_onnx = predonnx[1][0, 0, : ,:]
bins_onnx = predonnx[0]
print(disparity_onnx[100:105,25:43:3])

# OPENCV inference
net = cv.dnn.readNet(onnx_name)
net.setInput(blob)
pred_opencv = net.forward([sess.get_outputs()[0].name, 
                 sess.get_outputs()[1].name,
               ])
print("OPENCV")
disparity_opencv = pred_opencv[1][0, 0, : ,:]
bins_opencv = pred_opencv[0]
print(disparity_opencv[100:105,25:43:3])
print(np.mean((disparity_onnx-disparity_opencv)**2))
print(np.max((disparity_onnx-disparity_opencv)**2))
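
The last two printed numbers are the mean and the maximum of the squared element-wise differences between the two backends. A minimal sketch of that comparison as a reusable helper follows; the function name `compare_outputs` and the tolerance `atol` are illustrative assumptions, not part of the original script. The sample values are the first printed row of each backend's result.

```python
import numpy as np


def compare_outputs(reference, candidate, atol=1e-4):
    """Return (mean squared error, max squared error, close?) for two arrays.

    `atol` is an arbitrary illustrative tolerance, not a value from the issue.
    """
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    sq_err = (reference - candidate) ** 2
    close = bool(np.allclose(reference, candidate, atol=atol))
    return float(sq_err.mean()), float(sq_err.max()), close


# First printed row of each backend's output, copied from the results above.
onnx_row = np.array([4.446731, 4.4490666, 4.46463, 4.4546375, 4.4510665, 4.456948])
opencv_row = np.full(6, 5.1145616)

mse, max_sq, close = compare_outputs(onnx_row, opencv_row)
print("Quadratic error", mse)
print("Max error", max_sq)
print("Within tolerance:", close)
```

Raising on a shape mismatch is a deliberate choice here: NumPy broadcasting would otherwise silently compare arrays of different shapes.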

LaurentBerger commented 1 year ago

@berak @sturkmen72 @crackwitz

Could you please run this code? My conclusion: disabling multithreading makes the OpenCV AdaBins model work.

The simplified model can be downloaded here; the image is here.

import onnx
import onnxruntime as rt
import numpy as np
import cv2 as cv

onnx_name = "AdaBins_kitti_sim.onnx"

image = cv.imread(cv.samples.findFile("classroom__rgb_00283.jpg"))
# DATA for ONNX and OPENCV 
blob = np.transpose(image.astype(np.float32), [2, 0, 1])/255
blob = blob.reshape((1,3, 480, 640))

# ONNX inference
sess = rt.InferenceSession(onnx_name) 
input_name = sess.get_inputs()[0].name
output_name = sess.get_outputs()[0].name
predonnx = sess.run([sess.get_outputs()[0].name, 
                 sess.get_outputs()[1].name,
               ], 
                 {input_name: blob})
print("ONNX RESULT")
disparity_onnx = predonnx[1][0, 0, : ,:]
bins_onnx = predonnx[0]
print(disparity_onnx[100:105,25:43:3])

# OPENCV inference
net = cv.dnn.readNet(onnx_name)
net.setInput(blob)
pred_opencv = net.forward([sess.get_outputs()[0].name, 
                 sess.get_outputs()[1].name,
               ])
print("OPENCV")
disparity_opencv = pred_opencv[1][0, 0, : ,:]
bins_opencv = pred_opencv[0]
print(disparity_opencv[100:105,25:43:3])
print("Quadratic error ", np.mean((disparity_onnx-disparity_opencv)**2))
print("Max error ", np.max((disparity_onnx-disparity_opencv)**2))
cv.setNumThreads(0)
print("DISABLE THREAD")
pred_opencv = net.forward([sess.get_outputs()[0].name, 
                 sess.get_outputs()[1].name,
               ])
print("OPENCV")
disparity_opencv = pred_opencv[1][0, 0, : ,:]
bins_opencv = pred_opencv[0]
print(disparity_opencv[100:105,25:43:3])
print("Quadratic error ", np.mean((disparity_onnx-disparity_opencv)**2))
print("Max error ", np.max((disparity_onnx-disparity_opencv)**2))

crackwitz commented 1 year ago

Please Can you run this code?

No thanks. I am not involved.

zihaomu commented 1 year ago

Hi @LaurentBerger, can you try disabling the Winograd optimization before running the forward pass?

net = cv.dnn.readNet(onnx_name)
net.enableWinograd(False)
net.setInput(blob)
pred_opencv = net.forward([sess.get_outputs()[0].name, 
                 sess.get_outputs()[1].name,
               ])

Ref : https://docs.opencv.org/4.x/db/d30/classcv_1_1dnn_1_1Net.html#a14a87a7604c03ef4ff366672ee9bfcf2

LaurentBerger commented 1 year ago

Hi @zihaomu

new code:

import onnxruntime as rt
import numpy as np
import cv2 as cv

onnx_name = "AdaBins_kitti_sim.onnx"

image = cv.imread(cv.samples.findFile("classroom__rgb_00283.jpg"))
# DATA for ONNX and OPENCV 
blob = np.transpose(image.astype(np.float32), [2, 0, 1])/255
blob = blob.reshape((1,3, 480, 640))

# ONNX inference
sess = rt.InferenceSession(onnx_name) 
input_name = sess.get_inputs()[0].name
output_name = sess.get_outputs()[0].name
predonnx = sess.run([sess.get_outputs()[0].name, 
                 sess.get_outputs()[1].name,
               ], 
                 {input_name: blob})
disparity_onnx = predonnx[1][0, 0, : ,:]
bins_onnx = predonnx[0]

# OPENCV inference
cv.setNumThreads(32)
print("Multithreads for opencv")
net = cv.dnn.readNet(onnx_name)
net.setInput(blob)
pred_opencv = net.forward([sess.get_outputs()[0].name, 
                 sess.get_outputs()[1].name,
               ])
disparity_opencv = pred_opencv[1][0, 0, : ,:]
bins_opencv = pred_opencv[0]
# print(disparity_opencv[100:105,25:43:3])
print("Quadratic error ", np.mean((disparity_onnx-disparity_opencv)**2))
print("Max error ", np.max((disparity_onnx-disparity_opencv)**2))
cv.setNumThreads(1)
print("DISABLE THREAD or 1 thread")
pred_opencv = net.forward([sess.get_outputs()[0].name, 
                 sess.get_outputs()[1].name,
               ])
disparity_opencv = pred_opencv[1][0, 0, : ,:]
bins_opencv = pred_opencv[0]
# print(disparity_opencv[100:105,25:43:3])
print("Quadratic error ", np.mean((disparity_onnx-disparity_opencv)**2))
print("Max error ", np.max((disparity_onnx-disparity_opencv)**2))
# OPENCV inference
cv.setNumThreads(32)
# Note: this call acts on the existing net object; readNet() below creates a new Net
net.enableWinograd(False)
print("Multithreads for opencv and no winograd")
net = cv.dnn.readNet(onnx_name)
net.setInput(blob)
pred_opencv = net.forward([sess.get_outputs()[0].name, 
                 sess.get_outputs()[1].name,
               ])
disparity_opencv = pred_opencv[1][0, 0, : ,:]
bins_opencv = pred_opencv[0]
print("Quadratic error ", np.mean((disparity_onnx-disparity_opencv)**2))
print("Max error ", np.max((disparity_onnx-disparity_opencv)**2))

and the results:

Multithreads for opencv
Quadratic error  0.84839463
Max error  7.712089
DISABLE THREAD or 1 thread
Quadratic error  2.48915e-10
Max error  2.5547706e-09
Multithreads for opencv and no winograd
Quadratic error  0.84839463
Max error  7.712089
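
The experiment above changes one setting at a time and repeats the forward pass. A hedged sketch of that pattern as a loop is shown below. To stay runnable without OpenCV or the model file, it takes the forward pass and the thread setter as parameters; `sweep_thread_counts`, `fake_forward`, and `fake_set_threads` are hypothetical names introduced only for illustration. With OpenCV, one would presumably pass `cv.setNumThreads` and a small function wrapping `net.forward`.

```python
import numpy as np


def sweep_thread_counts(forward, set_threads, thread_counts, reference):
    """Run `forward()` once per thread count and report squared-error stats.

    forward      -- zero-argument callable returning an ndarray
    set_threads  -- callable taking the thread count (e.g. cv.setNumThreads)
    reference    -- array to compare against (e.g. the ONNX Runtime output)
    """
    report = {}
    for n in thread_counts:
        set_threads(n)
        sq_err = (np.asarray(forward()) - reference) ** 2
        report[n] = (float(sq_err.mean()), float(sq_err.max()))
    return report


# Stand-ins so the sketch runs without OpenCV: the fake backend is only
# correct when a single thread is configured.
_state = {"threads": 1}
reference = np.arange(6, dtype=np.float32)


def fake_set_threads(n):
    _state["threads"] = n


def fake_forward():
    return reference if _state["threads"] == 1 else reference + 1.0


report = sweep_thread_counts(fake_forward, fake_set_threads, [1, 32], reference)
for n, (mse, max_sq) in report.items():
    print(f"threads={n}: quadratic error={mse:.3g} max error={max_sq:.3g}")
```

Keeping the net object fixed and varying only the thread count isolates that one variable, which is what the 32-thread versus 1-thread comparison relies on.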

LaurentBerger commented 1 year ago

Maybe something is wrong with my OpenCV build. Here is a basic model with a single ReduceSum node (the last node of AdaBins_kitti_sim) and one input:

import numpy as np
import onnx
import onnxsim
import onnx.reference
import cv2 as cv

data = np.array(
    [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
shape_ini = data.shape
select_axes = np.array([0], dtype=np.int64)
keepdims = 1
print("input DATA ")
print(data)
#https://github.com/onnx/onnx/blob/main/docs/PythonAPIOverview.md
#https://onnx.ai/onnx/expect_onnxruntime.html
#https://onnx.ai/onnx/intro/python.html#initializer-default-value
out1 = onnx.helper.make_tensor_value_info('out1', onnx.TensorProto.FLOAT, [None, None, None])
inp0 = onnx.helper.make_tensor_value_info('inp0', onnx.TensorProto.FLOAT, shape_ini)
axes = onnx.numpy_helper.from_array(select_axes[0], "axes")
node1 = onnx.helper.make_node( "ReduceSum", inputs=["inp0", "axes"], outputs=["out1"], keepdims=keepdims)
graph = onnx.helper.make_graph([node1], 'test_reducesum',  [inp0], [out1], [axes])
onnx_model = onnx.helper.make_model(graph)

feeds = {'inp0': data}
sess = onnx.reference.ReferenceEvaluator(onnx_model)
res_onnx = sess.run(None, feeds)
print("ONNX result")
print(res_onnx[0])
onnx_name = "testReduceSum"
print("Writting model")
with open(onnx_name + ".onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())
# print("Reading model")
# onnx_model = onnx.load(onnx_name+'.onnx')
# simplified_model, _ = onnxsim.simplify(onnx_model)
# onnx.save(simplified_model, onnx_name+'_sim.onnx')

cv.setNumThreads(0)
net = cv.dnn.readNet(onnx_name+'.onnx')
net.enableWinograd(False)
print("Set opencv input DATA ")
print(data)

net.setInput(data)
res_ocv = net.forward()
print("OPENCV result")
print(res_ocv)

results:

input DATA
[[[ 1.  2.]
  [ 3.  4.]]

 [[ 5.  6.]
  [ 7.  8.]]

 [[ 9. 10.]
  [11. 12.]]]
ONNX result
[[[15. 18.]
  [21. 24.]]]
Writting model
Set opencv input DATA
[[[ 1.  2.]
  [ 3.  4.]]

 [[ 5.  6.]
  [ 7.  8.]]

 [[ 9. 10.]
  [11. 12.]]]
OPENCV result
[[0. 0.]]
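
For reference, the expected output can be cross-checked in plain NumPy. This is a small sketch assuming the node reduces over axis 0 with keepdims=1, which is what the ONNX reference result above suggests; it says nothing about how OpenCV computes the layer.

```python
import numpy as np

data = np.array(
    [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)

# Summing over axis 0 while keeping that axis (length 1) mirrors
# ReduceSum with axes=0 and keepdims=1 on a (3, 2, 2) input.
expected = np.sum(data, axis=0, keepdims=True)
print(expected.shape)
print(expected)
```

The result has shape (1, 2, 2) and matches the ONNX reference values, whereas the OpenCV output differs in both shape and values.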

LaurentBerger commented 1 year ago

Finally I can reproduce on macOS

zihaomu commented 1 year ago

Hi @LaurentBerger, it looks like the model link cannot be reached. Can you attach the ONNX file to this issue? I will try to reproduce the problem on my side.

LaurentBerger commented 1 year ago

The simplified model can be downloaded here; the image is here.

How to convert the model from PyTorch is described here.

zihaomu commented 1 year ago

Hi, I do not know why, but I cannot reach this link (http://www.traimaocv.fr/CoursTF/AdaBins_kitti_sim.onnx). Could you download the ONNX model and attach it to this issue? That would be useful for me.

LaurentBerger commented 1 year ago

The model is too big to attach. I hope this link works: https://drive.google.com/file/d/1O5yTir83UfYiNpDl0l9LZK8r_1H8DeGc/view?usp=sharing

LaurentBerger commented 1 year ago

@zihaomu I think something is wrong in the parallel implementation of the reduce nodes (Mean, Sum, ...):

In this code I fetch the outputs of the Mul and ReduceMean nodes and compare the results obtained with 1 thread and with 32 threads:

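import numpy as np
import onnxruntime as rt
import cv2 as cv

onnx_name = "AdaBins_kitti_sim.onnx"
# ONNX Runtime session, used below only to look up the model's output names
sess = rt.InferenceSession(onnx_name)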

image = cv.imread(cv.samples.findFile("classroom__rgb_00283.jpg"))
# DATA OPENCV 
blob = np.transpose(image.astype(np.float32), [2, 0, 1])/255
blob = blob.reshape((1,3, 480, 640))

# OPENCV inference
cv.setNumThreads(32)
print("Multithreads for opencv")
net = cv.dnn.readNet(onnx_name)
net.setInput(blob)
l_name_ouput =[sess.get_outputs()[0].name, 
               sess.get_outputs()[1].name, 
               'onnx_node!/encoder/blocks.0/blocks.0.0/act1/Mul',
               'onnx_node!/encoder/blocks.0/blocks.0.0/se/ReduceMean'
               ]
pred_opencv32 = net.forward(l_name_ouput)
cv.setNumThreads(1)
print("DISABLE THREAD or 1 thread")
pred_opencv1 = net.forward(l_name_ouput)

idx = 2
print("Quadratic error for node ", l_name_ouput[idx], " : ", np.mean((pred_opencv32[idx] -  pred_opencv1[idx])**2))
print("Max error for  node ", l_name_ouput[idx], " : ", np.max((pred_opencv32[idx] -  pred_opencv1[idx])**2))
idx = 3
print("Quadratic error for node ", l_name_ouput[idx], " : ", np.mean((pred_opencv32[idx] -  pred_opencv1[idx])**2))
print("Max error for  node ", l_name_ouput[idx], " : ", np.max((pred_opencv32[idx] -  pred_opencv1[idx])**2))

Result is

Multithreads for opencv
DISABLE THREAD or 1 thread
Quadratic error for node  onnx_node!/encoder/blocks.0/blocks.0.0/act1/Mul  :  0.0
Max error for  node  onnx_node!/encoder/blocks.0/blocks.0.0/act1/Mul  :  0.0
Quadratic error for node  onnx_node!/encoder/blocks.0/blocks.0.0/se/ReduceMean  :  3.4234838
Max error for  node  onnx_node!/encoder/blocks.0/blocks.0.0/se/ReduceMean  :  13.81892
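
The Mul output is identical for both thread counts while the ReduceMean output is not, which points at the reduction itself. As a framework-independent illustration (not OpenCV's implementation), the sketch below splits a mean over the spatial axes across worker threads by channel range and checks that the result does not depend on the number of workers. The name `parallel_reduce_mean` and the tensor shape are assumptions chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def parallel_reduce_mean(x, num_workers):
    """Mean over the last two axes of an (N, C, H, W) array, keepdims=True.

    Work is split by channel ranges, so each worker writes a disjoint slice
    of the output and no partial sums are shared between threads.
    """
    n, c, h, w = x.shape
    out = np.empty((n, c, 1, 1), dtype=x.dtype)
    # Contiguous, non-overlapping channel ranges that cover all channels.
    bounds = np.linspace(0, c, num_workers + 1).astype(int)

    def work(i):
        lo, hi = bounds[i], bounds[i + 1]
        out[:, lo:hi] = x[:, lo:hi].mean(axis=(2, 3), keepdims=True)

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        list(pool.map(work, range(num_workers)))  # list() surfaces worker errors
    return out


rng = np.random.default_rng(0)
x = rng.standard_normal((1, 32, 24, 32)).astype(np.float32)

single = parallel_reduce_mean(x, 1)
multi = parallel_reduce_mean(x, 8)
print("Max error between 1 and 8 workers:", np.max((single - multi) ** 2))
```

Partitioning by output element rather than by input range is one simple way to avoid races: no two workers ever touch the same output location, so the thread count should only affect speed, not values.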