wkcn / MobulaOP

A Simple & Flexible Cross Framework Operators Toolkit
MIT License

fix CUDA building and ROIAlign test #25

Closed mgno32 closed 5 years ago

mgno32 commented 5 years ago

Hi. The build fails in the following environment:

Ubuntu 16.04
Cuda compilation tools, release 7.5, V7.5.17

Error Message:

/usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(36): error: identifier "__builtin_ia32_monitorx" is undefined                       

/usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(42): error: identifier "__builtin_ia32_mwaitx" is undefined                         

/usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(36): error: identifier "__builtin_ia32_monitorx" is undefined                       

/usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(42): error: identifier "__builtin_ia32_mwaitx" is undefined                         

/usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(36): error: identifier "__builtin_ia32_monitorx" is undefined                       

/usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(42): error: identifier "__builtin_ia32_mwaitx" is undefined

2 errors detected in the compilation of "/tmp/tmpxft_00002f8c_00000000-7_context.cpp1.ii".
2 errors detected in the compilation of "/tmp/tmpxft_00002f8e_00000000-7_im2col.cpp1.ii".
2 errors detected in the compilation of "/tmp/tmpxft_00002f8f_00000000-7_defines.cpp1.ii".

In addition, the unit test for ROIAlign fails when run on GPU.

AssertionError: Location of maximum error: (2, 0, 1, 2)
Maximum Absolute Error(0.00048828125) > atol(1e-05): 2349.3857421875 vs 2349.38623046875

I adjusted atol and rtol to fix it.
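
For context on the size of this mismatch, the following is a small sketch in standard-library Python (not part of the PR). It assumes the test compares float32 values, and the helper name float32_ulp is hypothetical. It measures the reported error against the spacing between adjacent float32 values near 2349.

```python
import struct

def float32_ulp(x):
    """Gap between x and the next larger float32 (assumes x is positive and finite)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    current = struct.unpack("<f", struct.pack("<I", bits))[0]
    following = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return following - current

# Values copied from the assertion message above.
a = 2349.3857421875
b = 2349.38623046875
atol = 1e-05

abs_err = abs(a - b)
ulp = float32_ulp(a)
rel_err = abs_err / abs(b)

print("absolute error:", abs_err)
print("float32 ULP near a:", ulp)
print("error in ULPs:", abs_err / ulp)
print("relative error:", rel_err)
print("exceeds atol:", abs_err > atol)
```

Under these assumptions the two values are two float32 ULPs apart, roughly 2e-7 in relative terms. The float32 spacing at this magnitude (about 2.4e-4) is already larger than atol = 1e-05, so an absolute tolerance this tight cannot absorb even a one-ULP difference for outputs of this size.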

coveralls commented 5 years ago

Pull Request Test Coverage Report for Build 425


Totals Coverage Status
Change from base Build 422: -0.8%
Covered Lines: 1157
Relevant Lines: 1468

wkcn commented 5 years ago

Thanks for your contribution!

I will check it.

wkcn commented 5 years ago

It seems there is a bug when using GCC 5.0 with NVCC 7.5. The precision problem in the ROIAlign OP test is strange, though.

LGTM.

Thank you!

wkcn commented 5 years ago

Regarding the float accuracy issue in the ROIAlign unit test: nvcc optimizes the code and changes the order of computation, which causes the accuracy difference. Adding -O0 -Xcicc -O0 -Xptxas -O0 to the nvcc flags produces the same results.
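
Why a change in computation order can alter a float32 result is sketched below. This is a generic illustration in standard-library Python, not the ROIAlign kernel. The helpers f32 and add32 are hypothetical names that emulate float32 addition by rounding each double-precision sum to float32.

```python
import struct

def f32(x):
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def add32(x, y):
    """float32 addition emulated by rounding the double-precision sum."""
    return f32(f32(x) + f32(y))

a, b, c = 1.0e8, -1.0e8, 1.0

left = add32(add32(a, b), c)    # (a + b) + c
right = add32(a, add32(b, c))   # a + (b + c)
print(left, right)
```

The two groupings give 1.0 and 0.0, because b + c is rounded back to -1.0e8 before a is added. Floating-point addition is not associative, so a compiler that regroups or fuses operations can legitimately produce slightly different output from an unoptimized build.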