opencv / opencv

Open Source Computer Vision Library
https://opencv.org
Apache License 2.0
75.95k stars 55.62k forks source link

cv::cuda::reduce bug #6148

Closed Aravind-Suresh closed 8 years ago

Aravind-Suresh commented 8 years ago

Fixes https://github.com/Itseez/opencv/issues/6126.

Aravind-Suresh commented 8 years ago

Please tell whether this solves the problem or not.

Aravind-Suresh commented 8 years ago

I am getting the following error ( buildbot ):

[ RUN      ] OCL_ImgProc/AccumulateSquare.Mat/45
C:\builds_ocv\precommit_opencl\opencv\modules\imgproc\test\ocl\test_accumulate.cpp(145): error: Expected: (TestUtils::checkNorm2(dst, udst, _mask)) <= (1e-2), actual: 208.761 vs 0.01
Size: [3 x 3]

[  FAILED  ] OCL_ImgProc/AccumulateSquare.Mat/45, where GetParam() = ((CV_32F, CV_64F), Channels(3), true) (46 ms)

Can anyone tell me what it is?

Aravind-Suresh commented 8 years ago

I fixed that error. Please tell me whether this solves the problem or not.

Aravind-Suresh commented 8 years ago

Please tell me whether it fixes the problem or not.

Aravind-Suresh commented 8 years ago

So, a simpler fix will be to transpose dst and return. Should I do that, or should I modify the code, so that it works for a columnar matrix?

vinograd47 commented 8 years ago

I guess, it will be better to use dst.create(1, rows); and then call GpuMat::reshape method.

Aravind-Suresh commented 8 years ago

Okay. I will do that.

vinograd47 commented 8 years ago

Please fix failing tests:

[opencv_test_cudev] [----------] 4 tests from ReduceToColumn
[opencv_test_cudev] [ RUN      ] ReduceToColumn.Sum
[opencv_test_cudev] /home/jenkins/workspace/OpenCV/3.0/Ubuntu-12.04-x64-Release-Shared-Package-WithTests/opencv/modules/cudev/test/test_reduction.cu:235: Failure
[opencv_test_cudev] Matrices "dst_gold" and "dst" have different sizes : "dst_gold" [205x1] vs "dst" [1x205]
[opencv_test_cudev] [  FAILED  ] ReduceToColumn.Sum (1 ms)
[opencv_test_cudev] [ RUN      ] ReduceToColumn.Avg
[opencv_test_cudev] /home/jenkins/workspace/OpenCV/3.0/Ubuntu-12.04-x64-Release-Shared-Package-WithTests/opencv/modules/cudev/test/test_reduction.cu:254: Failure
[opencv_test_cudev] Matrices "dst_gold" and "dst" have different sizes : "dst_gold" [173x1] vs "dst" [1x173]
[opencv_test_cudev] [  FAILED  ] ReduceToColumn.Avg (0 ms)
[opencv_test_cudev] [ RUN      ] ReduceToColumn.Min
[opencv_test_cudev] /home/jenkins/workspace/OpenCV/3.0/Ubuntu-12.04-x64-Release-Shared-Package-WithTests/opencv/modules/cudev/test/test_reduction.cu:273: Failure
[opencv_test_cudev] Matrices "dst_gold" and "dst" have different sizes : "dst_gold" [217x1] vs "dst" [1x217]
[opencv_test_cudev] [  FAILED  ] ReduceToColumn.Min (1 ms)
[opencv_test_cudev] [ RUN      ] ReduceToColumn.Max
[opencv_test_cudev] /home/jenkins/workspace/OpenCV/3.0/Ubuntu-12.04-x64-Release-Shared-Package-WithTests/opencv/modules/cudev/test/test_reduction.cu:292: Failure
[opencv_test_cudev] Matrices "dst_gold" and "dst" have different sizes : "dst_gold" [268x1] vs "dst" [1x268]
[opencv_test_cudev] [  FAILED  ] ReduceToColumn.Max (2 ms)
[opencv_test_cudev] [----------] 4 tests from ReduceToColumn (4 ms total)
[opencv_test_cudaarithm] [ RUN      ] CUDA_Arithm/Reduce.Cols/1
[opencv_test_cudaarithm] unknown file: Failure
[opencv_test_cudaarithm] C++ exception with description "/home/jenkins/workspace/OpenCV/3.0/Ubuntu-12.04-x64-Release-Shared-Package-WithTests/opencv/modules/core/src/cuda_gpu_mat.cpp:179: error: (-13) The matrix is not continuous, thus its number of rows can not be changed in function reshape
[opencv_test_cudaarithm] " thrown in the test body.
[opencv_test_cudaarithm] [  FAILED  ] CUDA_Arithm/Reduce.Cols/1, where GetParam() = (GM107 CS1, 128x128, CV_8U, Channels(1), cv::REDUCE_SUM, sub matrix) (166 ms)
Aravind-Suresh commented 8 years ago

I think it is because GpuMat is not continuous. So I used createContinuous(rows, 1, dst.type(), dst) I think that will fix the failing tests.

vinograd47 commented 8 years ago

The same tests still fails. You need to update the tests itself, since currently they are targeted for previous code version (with 1 x rows matrix).

Please check ./opencv_test_cudev --gtest_filter=*ReduceToColumn* and ./opencv_test_cudaarithm --gtest_filter=*Reduce.Cols*.

Aravind-Suresh commented 8 years ago

All checks have passed. I didn't modify tests. How come none of the tests fail, as they are targeted for previous code version? Tell me what to do next.

alalek commented 8 years ago

All checks have passed

BTW, public buildbot doesn't run CUDA tests. You need to build and run them manually on your machine.

Aravind-Suresh commented 8 years ago

Ohh okay. I will do that. Can you tell me where I can find the test scripts?

alalek commented 8 years ago

There is command-line sample for test launch a few comments above

Aravind-Suresh commented 8 years ago

Please tell me whether this fixes the problem or not.

vinograd47 commented 8 years ago

There is still failures in ./opencv_test_cudaarithm --gtest_filter=*Reduce.Cols*. I think you need to modify the test and do not create ROI for output image.

https://github.com/Itseez/opencv/blob/master/modules/cudaarithm/test/test_reductions.cpp#L880

Aravind-Suresh commented 8 years ago

What is the use of createMat with useRoi? It just creates the mat and chooses ROI of the given size after creation and returns it.

Aravind-Suresh commented 8 years ago

Is this snippet fine? Or should I use false instead of useRoi.

https://gist.github.com/Aravind-Suresh/18e78576f29bb92e741b

vinograd47 commented 8 years ago

I think it will be better to remove createMat call for dst at all:

cv::cuda::GpuMat dst;
cv::cuda::reduce(loadMat(src, useRoi), dst, 1, reduceOp, dst_depth);
Aravind-Suresh commented 8 years ago

Okay. Should I do the same for CUDA_TEST_P(Reduce, Rows) ? Because I didn't modify any of reduce_to_row methods.

vinograd47 commented 8 years ago

No, only for CUDA_TEST_P(Reduce, Cols).

Aravind-Suresh commented 8 years ago

Please let me know if this fixes the problem.

vinograd47 commented 8 years ago

Now it fails with

[opencv_test_cudaarithm] [ RUN      ] CUDA_Arithm/Reduce.Cols/0
[opencv_test_cudaarithm] /home/jenkins/workspace/OpenCV/3.0/Ubuntu-12.04-x64-Release-Shared-Package-WithTests/opencv/modules/cudaarithm/test/test_reductions.cpp:886: Failure
[opencv_test_cudaarithm] Matrices "dst_gold" and "dst" have different sizes : "dst_gold" [1x128] vs "dst" [128x1]
[opencv_test_cudaarithm] [  FAILED  ] CUDA_Arithm/Reduce.Cols/0, where GetParam() = (GM107 CS1, 128x128, CV_8U, Channels(1), cv::REDUCE_SUM, whole matrix) (1 ms)
Aravind-Suresh commented 8 years ago

I thought it is a problem with createContinuous. So I used reshape instead. But I have no idea why the output sizes don't match. Tell me if I am thinking in the right direction.

vinograd47 commented 8 years ago

No, createContinuous is OK. I think the problem is in cv::cuda::reduce implementation. Please check https://github.com/Itseez/opencv/blob/master/modules/cudaarithm/src/cuda/reduce.cu file.

vinograd47 commented 8 years ago

Probably in that line : https://github.com/Itseez/opencv/blob/master/modules/cudaarithm/src/cuda/reduce.cu#L140

Aravind-Suresh commented 8 years ago

Ohh yeah. Found that. The functions are improperly wrapped. I will change that and push.

Aravind-Suresh commented 8 years ago

I am getting a build fail in Win 10 x64 VS2015. Please tell me what is the problem with my code.

alalek commented 8 years ago

This failure is not related to this patch

Aravind-Suresh commented 8 years ago

Ohh okay. Then what should I do? Should I leave it as it is and it will get fixed later?

alalek commented 8 years ago

This build was restarted and completed successfully.

Aravind-Suresh commented 8 years ago

Thanks :)

vinograd47 commented 8 years ago

Now it fails in sanity checks (./opencv_perf_cudaarithm --gtest_filter=*Reduce* --perf_min_samples=1 --perf_force_samples=1 --perf_verify_sanity):

[opencv_perf_cudaarithm] [ RUN      ] Sz_Depth_Cn_Code_Dim_Reduce.Reduce/1
[opencv_perf_cudaarithm] /home/jenkins/workspace/OpenCV/3.0/Ubuntu-12.04-x64-Release-Shared-Package-WithTests/opencv/modules/ts/src/ts_perf.cpp:368: Failure
[opencv_perf_cudaarithm] Value of: actual.size.p[1]
[opencv_perf_cudaarithm]   Actual: 1
[opencv_perf_cudaarithm] Expected: expect_cols
[opencv_perf_cudaarithm] Which is: 720
[opencv_perf_cudaarithm] Argument "gpu_dst" has unexpected number of columns
[opencv_perf_cudaarithm] 
[opencv_perf_cudaarithm] params    = (1280x720, CV_8U, Gray, REDUCE_SUM, Cols)
[opencv_perf_cudaarithm] termination reason:  reached maximum number of iterations
[opencv_perf_cudaarithm] bytesIn   =     921600
[opencv_perf_cudaarithm] bytesOut  =          0
[opencv_perf_cudaarithm] samples   =          1
[opencv_perf_cudaarithm] outliers  =          0
[opencv_perf_cudaarithm] frequency = 1000000000
[opencv_perf_cudaarithm] min       =    7437438 = 7.44ms
[opencv_perf_cudaarithm] median    =    7437438 = 7.44ms
[opencv_perf_cudaarithm] gmean     =    7437438 = 7.44ms
[opencv_perf_cudaarithm] gstddev   = 0.00000000 = 0.00ms for 97% dispersion interval
[opencv_perf_cudaarithm] mean      =    7437438 = 7.44ms
[opencv_perf_cudaarithm] stddev    =          0 = 0.00ms
[opencv_perf_cudaarithm] [  FAILED  ] Sz_Depth_Cn_Code_Dim_Reduce.Reduce/1, where GetParam() = (1280x720, CV_8U, Gray, REDUCE_SUM, Cols) (26 ms)

I think it will be better to reshape the output image back to 1 row matrix before sanity check: https://github.com/Itseez/opencv/blob/master/modules/cudaarithm/perf/perf_reductions.cpp#L371

Aravind-Suresh commented 8 years ago

Please let me know if this works fine. Or should I change anything else?

vinograd47 commented 8 years ago

Now it works OK. Thank you for your contribution!

vinograd47 commented 8 years ago

:+1:

Aravind-Suresh commented 8 years ago

Okay. Thanks for your guidance :)