Closed gititgo closed 1 year ago
Could you point to a cross-compiler and maybe add some notes on how to build the code? QEMU or some other emulator would be very useful too. I also recommend taking a look at the cvRound implementation. It is used everywhere, and efficient rounding affects performance a lot: https://github.com/opencv/opencv/blob/9aa647068b2eba4a34462927b1878353dfd3df69/modules/core/include/opencv2/core/fast_math.hpp#L200
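For readers unfamiliar with the rounding helper mentioned here, the following is a minimal sketch of round-to-nearest built on the standard library. The name round_nearest is hypothetical, and this is not OpenCV's cvRound, which selects platform-specific code paths (see the linked source).

```cpp
#include <cmath>

// Hypothetical helper, not OpenCV's cvRound: rounds to the nearest integer
// using the current floating-point rounding mode (ties-to-even by default).
inline int round_nearest(double value)
{
    return static_cast<int>(std::lrint(value));
}
```

Because std::lrint follows the current rounding mode, halfway cases go to the even neighbor by default, which differs from std::round (ties away from zero).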
We are preparing such QEMU, but it is not finished yet. We'll provide it as soon as it's available.
The cross-compiler (builds LoongArch on x86) and the CMake config file are here: git clone https://gitee.com/wenux/cross-compiler-la-on-x86.git
How to use it:
(1) tar -xvf toolchain-loongarch64-linux-gnu-cross-830-rc1.0-2022-04-22a.tar.xz
(2) Set "tools" in the CMake config file (la64_linux_setup.cmake) to your real path;
(3) Run cmake with the config file: cmake -DCMAKE_TOOLCHAIN_FILE=path/to/your/la64_linux_setup.cmake -DCPU_BASELINE=LASX -DBUILD_OPENJPEG=ON ../
(4) make
Is QEMU necessary for this PR?
It'll be great to have QEMU to run tests.
Is a 3A5000 (LoongArch64) environment OK? QEMU may take a long time.
How long is it going to take to run all unit tests in QEMU? It is recommended to have a CI pipeline that tests the code automatically. Please at least provide something we can run the tests on.
QEMU is under development, but I don't have an exact date yet. We can provide a remote LoongArch environment. Can this be used for automated testing?
Yes, but we also expect an environment that we can still use for testing if your remote LoongArch environment is no longer available to us. So it is recommended to have QEMU to run tests on LoongArch.
The provided git link (git clone https://gitee.com/wenux/cross-compiler-la-on-x86.git) is protected by username and password.
Sorry, it's OK now.
opencv/modules/core/src/parallel_impl.cpp:63:5: warning: #warning "Can't detect 'pause' (CPU-yield) instruction on the target platform. Specify CV_PAUSE() definition via compiler flags." [-Wcpp]
# warning "Can't detect 'pause' (CPU-yield) instruction on the target platform. Specify CV_PAUSE() definition via compiler flags."
Also, the best way to enable OpenCV cross-compilation for the new architecture is to place the toolchain file in opencv/platforms/linux/loongson. I propose replacing the local CMake variable tools with a CACHE variable with a meaningful name, so that it can be set from the command line. See opencv/platforms/linux/riscv64-clang.toolchain.cmake as an example.
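Regarding the CV_PAUSE warning quoted above: the macro is used to back off inside spin-wait loops. The sketch below shows the general pattern under stated assumptions. MY_PAUSE and spin_wait are hypothetical names, and the fallback here yields to the OS scheduler; it is not the definition OpenCV uses for any platform.

```cpp
#include <atomic>
#include <thread>

// Hypothetical fallback, not the definition OpenCV ships: when no CPU-yield
// instruction is known for the target, yield to the OS scheduler instead.
// It can be overridden from the command line, e.g. -D"MY_PAUSE(n)=...".
#ifndef MY_PAUSE
#define MY_PAUSE(iterations) \
    do { for (int i = 0; i < (iterations); ++i) std::this_thread::yield(); } while (0)
#endif

// Spin until `flag` becomes true or `max_spins` rounds have elapsed.
// Returns the last observed value of the flag.
inline bool spin_wait(const std::atomic<bool>& flag, int max_spins)
{
    for (int spin = 0; spin < max_spins; ++spin)
    {
        if (flag.load(std::memory_order_acquire))
            return true;
        MY_PAUSE(16);  // back off before polling again
    }
    return flag.load(std::memory_order_acquire);
}
```

A real port would normally replace the yield with the architecture's own hint or no-op instruction, which is what the warning asks the builder to supply.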
@gititgo Thanks a lot for the toolchain. I was able to build the source code.
@gititgo, thank you for the contribution! Please, mark the proper items in the checklist:
* I agree to contribute to the project under Apache 2 License.
* To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
without this confirmation we cannot merge your code into OpenCV
OK, marked.
@fengyuentau We prepared a remote LoongArch PC for test. The IP and password have been emailed to you.
I ran the accuracy tests (./bin/opencv_test_*) on your LoongArch PC, and only 5 modules (core, flann, highgui, ml and videoio) did not fail with a segmentation fault. The CMake command I used is cmake -B build -D CPU_BASELINE=LASX opencv. Did I miss something?
Please add -DBUILD_PNG=ON, as there may be a bug in the libpng on our system.
Rebuilt with the CMake option -DBUILD_PNG=ON and there are no more segmentation faults. But Calib3d_StereoBM.regression failed as discussed above.
Yes, there are two known issues: 1. Calib3d_StereoBM.regression: possibly a compiler optimization issue, since the DEBUG build passes. 2. videoio/videocapture_acceleration.read (ffmpeg): the ffmpeg on our system (optimized for LoongArch) has a bug; the test passes when we use the open-source ffmpeg.
Tested again, those known issues are still there:
# calib
[ FAILED ] Calib3d_StereoBM.regression
# videoio
[ FAILED ] videoio_ffmpeg.parallel
[ FAILED ] videoio/videocapture_acceleration.read/32, where GetParam() = (sample_322x242_15frames.yuv420p.mpeg2video.mp4, FFMPEG, NONE, false)
[ FAILED ] videoio/videocapture_acceleration.read/33, where GetParam() = (sample_322x242_15frames.yuv420p.mpeg2video.mp4, FFMPEG, NONE, true)
[ FAILED ] videoio/videocapture_acceleration.read/34, where GetParam() = (sample_322x242_15frames.yuv420p.mpeg2video.mp4, FFMPEG, ANY, false)
[ FAILED ] videoio/videocapture_acceleration.read/35, where GetParam() = (sample_322x242_15frames.yuv420p.mpeg2video.mp4, FFMPEG, ANY, true)
[ FAILED ] videoio/videocapture_acceleration.read/64, where GetParam() = (sample_322x242_15frames.yuv420p.libx265.mp4, FFMPEG, NONE, false)
[ FAILED ] videoio/videocapture_acceleration.read/65, where GetParam() = (sample_322x242_15frames.yuv420p.libx265.mp4, FFMPEG, NONE, true)
[ FAILED ] videoio/videocapture_acceleration.read/66, where GetParam() = (sample_322x242_15frames.yuv420p.libx265.mp4, FFMPEG, ANY, false)
[ FAILED ] videoio/videocapture_acceleration.read/67, where GetParam() = (sample_322x242_15frames.yuv420p.libx265.mp4, FFMPEG, ANY, true)
[ FAILED ] videoio/videocapture_acceleration.read/80, where GetParam() = (sample_322x242_15frames.yuv420p.libvpx-vp9.mp4, FFMPEG, NONE, false)
[ FAILED ] videoio/videocapture_acceleration.read/81, where GetParam() = (sample_322x242_15frames.yuv420p.libvpx-vp9.mp4, FFMPEG, NONE, true)
[ FAILED ] videoio/videocapture_acceleration.read/82, where GetParam() = (sample_322x242_15frames.yuv420p.libvpx-vp9.mp4, FFMPEG, ANY, false)
[ FAILED ] videoio/videocapture_acceleration.read/83, where GetParam() = (sample_322x242_15frames.yuv420p.libvpx-vp9.mp4, FFMPEG, ANY, true)
[ FAILED ] videoio/videocapture_acceleration.read/96, where GetParam() = (sample_322x242_15frames.yuv420p.libaom-av1.mp4, FFMPEG, NONE, false)
[ FAILED ] videoio/videocapture_acceleration.read/97, where GetParam() = (sample_322x242_15frames.yuv420p.libaom-av1.mp4, FFMPEG, NONE, true)
[ FAILED ] videoio/videocapture_acceleration.read/98, where GetParam() = (sample_322x242_15frames.yuv420p.libaom-av1.mp4, FFMPEG, ANY, false)
[ FAILED ] videoio/videocapture_acceleration.read/99, where GetParam() = (sample_322x242_15frames.yuv420p.libaom-av1.mp4, FFMPEG, ANY, true)
Tests on other modules passed.
The ffmpeg version on the remote environment has just been updated; the videoio module passes now.
Thanks a lot for the update! Please take a look at the "docs" builder. It reports a lot of formatting issues like "modules/core/include/opencv2/core/hal/intrin_lasx.hpp:1865: trailing whitespace."
OK, formatting issues are fixed.
The following test fails from time to time:
# core
[ FAILED ] Core/HAL.mat_decomp/15, where GetParam() = 15
It most likely fails on the first run after a fresh compilation and passes from the second run onward.
And with the new ffmpeg, the issues in the videoio module are gone. However, it seems GTK is somehow misconfigured. At first GTK was missing, but after installing it the issue became:
[----------] 3 tests from Highgui_GUI
[ RUN ] Highgui_GUI.regression
Exception message: OpenCV(4.6.0-dev) /home/loongson/opencv-workspace/opencv-21833/modules/highgui/src/window_gtk.cpp:635: error: (-2:Unspecified error) Can't initialize GTK backend in function 'cvInitSystem'
/home/loongson/opencv-workspace/opencv-21833/modules/highgui/test/test_gui.cpp:72: Failure
Expected: namedWindow(window_name) doesn't throw an exception.
Actual: it throws.
[ FAILED ] Highgui_GUI.regression (2 ms)
[ RUN ] Highgui_GUI.trackbar_unsafe
Exception message: OpenCV(4.6.0-dev) /home/loongson/opencv-workspace/opencv-21833/modules/highgui/src/window_gtk.cpp:652: error: (-2:Unspecified error) GTK backend is not available in function 'cvInitSystem'
/home/loongson/opencv-workspace/opencv-21833/modules/highgui/test/test_gui.cpp:153: Failure
Expected: namedWindow(window_name) doesn't throw an exception.
Actual: it throws.
[ FAILED ] Highgui_GUI.trackbar_unsafe (0 ms)
[ RUN ] Highgui_GUI.trackbar
Exception message: OpenCV(4.6.0-dev) /home/loongson/opencv-workspace/opencv-21833/modules/highgui/src/window_gtk.cpp:652: error: (-2:Unspecified error) GTK backend is not available in function 'cvInitSystem'
/home/loongson/opencv-workspace/opencv-21833/modules/highgui/test/test_gui.cpp:192: Failure
Expected: namedWindow(window_name) doesn't throw an exception.
Actual: it throws.
[ FAILED ] Highgui_GUI.trackbar (0 ms)
[----------] 3 tests from Highgui_GUI (2 ms total)
I guess this is another software compatibility issue.
[ FAILED ] Core/HAL.mat_decomp/15, where GetParam() = 15 - it's a compute test. Most probably it's a sign of UB somewhere in the code or a compiler issue.
[ RUN ] Highgui_GUI.regression - please check that an X session is available and properly configured. Otherwise you need to build OpenCV without UI support.
Using the following command to test on the remote environment makes the graphics display locally: $ export DISPLAY=:0.0; ./bin/opencv_test_highgui
@asmorkalov Do you mean undefined behaviour by UB?
With the environment set by export DISPLAY=:0.0, the highgui tests now pass.
Thanks!
UB is undefined behavior: a bug that triggers randomly, depending on an uninitialized variable or on stack content in the case of an out-of-bounds access.
# core
[ FAILED ] Core/HAL.mat_decomp/15, where GetParam() = 15
This is an accuracy issue:
/home/loongson/src/wenxue/opencv/modules/core/test/test_hal_core.cpp:205: Failure Expected: (cvtest::norm(x, x0, NORM_INF | NORM_RELATIVE)) <= (eps), actual: 1.08289e-10 vs 1e-10
The reason is that the compiler uses fmadd-class instructions to optimize the "multiply + add" operation, but this may introduce precision loss at the same time. We disabled this feature on the LoongArch platform with the compiler option "-ffp-contract=off", and then the test case passes.
For now we have added the compiler option "-ffp-contract=off" in opencv/modules/core/CMakeLists.txt. Is there a better place to add this option?
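The effect described above can be reproduced in isolation. This is a minimal sketch that assumes nothing about OpenCV's code; both function names are hypothetical. It contrasts a multiply followed by an add (two roundings) with std::fma (one rounding), which is the difference that -ffp-contract=off suppresses.

```cpp
#include <cmath>

// Product is rounded to double first, then added: two rounding steps.
// `volatile` keeps the compiler from contracting this into a fused operation.
inline double mul_then_add(double a, double b, double c)
{
    volatile double product = a * b;
    return product + c;
}

// Fused multiply-add: a * b + c computed with a single rounding at the end.
inline double fused_mul_add(double a, double b, double c)
{
    return std::fma(a, b, c);
}
```

With a = b = 1 + 2^-27 and c = -(1 + 2^-26), mul_then_add returns 0 while fused_mul_add returns 2^-54. Neither result is a bug; they simply differ in the last bits, which is enough to move a test result across a tight threshold.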
fmadd is an important optimization. IMHO we can tune the test threshold a bit, but not disable fmadd. @alalek @vpisarev what do you think?
@gititgo Any updates regarding the issue on the core module? Can we simply adjust the threshold specifically for LASX?
Yes, tuning the test threshold is a good idea. If you have no objection, I will do this specifically for LASX.
@fengyuentau, @gititgo, I'm fine with increasing the tolerance threshold from 1e-10 to 5e-10, for example.
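To make the threshold discussion concrete, here is a simplified sketch of a relative infinity-norm check. relative_inf_error and within_tolerance are hypothetical helpers standing in for the cvtest::norm(x, x0, NORM_INF | NORM_RELATIVE) comparison in the failing test; the real test code may differ in detail.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Relative infinity-norm error: max|x - x0| / max|x0|.
// Simplified stand-in for the norm used by the test, not OpenCV's code.
inline double relative_inf_error(const std::vector<double>& x,
                                 const std::vector<double>& x0)
{
    double max_diff = 0.0;
    double max_ref = 0.0;
    const std::size_t n = std::min(x.size(), x0.size());
    for (std::size_t i = 0; i < n; ++i)
    {
        max_diff = std::max(max_diff, std::fabs(x[i] - x0[i]));
        max_ref = std::max(max_ref, std::fabs(x0[i]));
    }
    // Fall back to the absolute error when the reference is all zeros.
    return max_ref > 0.0 ? max_diff / max_ref : max_diff;
}

// The test passes when the error does not exceed the tolerance.
inline bool within_tolerance(double error, double eps)
{
    return error <= eps;
}
```

The reported error of 1.08289e-10 fails against eps = 1e-10 but passes against the proposed 5e-10.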
This merge was done without the necessary reference to this PR in the commit message.
Add Loongson Advanced SIMD Extension support: -DCPU_BASELINE=LASX
* Add Loongson Advanced SIMD Extension support: -DCPU_BASELINE=LASX
* Add resize.lasx.cpp for Loongson SIMD acceleration
* Add imgwarp.lasx.cpp for Loongson SIMD acceleration
* Add LASX acceleration support for dnn/conv
* Add CV_PAUSE(v) for Loongarch
* Set LASX by default on Loongarch64
* LoongArch: tune test threshold for Core/HAL.mat_decomp/15
Co-authored-by: shengwenxue <shengwenxue@loongson.cn>
No PR ID at all.
Existing maintenance scripts will not pick up this merge.
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request