Maratyszcza / NNPACK

Acceleration package for neural networks on multi-core CPUs
BSD 2-Clause "Simplified" License

softmax-output-imagenet-test and softmax-output-smoketest test failed #184

Closed. Danliran closed this issue 4 years ago

Danliran commented 4 years ago

Hi NNPACK team,

Platform: aarch64 hardware
OS: Ubuntu 18.04 arm64

The softmax-output-imagenet-test and softmax-output-smoketest test cases fail. It seems the results of nnp_softmax_output__reference and nnp_softmax_output do not match.

[==========] Running 12 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 6 tests from OUT_OF_PLACE
[ RUN ] OUT_OF_PLACE.batch1
/home/taishan/NNPACK/test/testers/softmax.h:128: Failure
Expected: (maxError) < (errorLimit()), actual: 0.357721 vs 1e-05
[ FAILED ] OUT_OF_PLACE.batch1 (1 ms)
[ RUN ] OUT_OF_PLACE.batch2
/home/taishan/NNPACK/test/testers/softmax.h:128: Failure
Expected: (maxError) < (errorLimit()), actual: 0.368963 vs 1e-05
[ FAILED ] OUT_OF_PLACE.batch2 (0 ms)
[ RUN ] OUT_OF_PLACE.batch16
/home/taishan/NNPACK/test/testers/softmax.h:128: Failure
Expected: (maxError) < (errorLimit()), actual: 0.374419 vs 1e-05
[ FAILED ] OUT_OF_PLACE.batch16 (4 ms)
[ RUN ] OUT_OF_PLACE.batch64
/home/taishan/NNPACK/test/testers/softmax.h:128: Failure
Expected: (maxError) < (errorLimit()), actual: 0.391025 vs 1e-05
[ FAILED ] OUT_OF_PLACE.batch64 (17 ms)
[ RUN ] OUT_OF_PLACE.batch128
/home/taishan/NNPACK/test/testers/softmax.h:128: Failure
Expected: (maxError) < (errorLimit()), actual: 0.38515 vs 1e-05
[ FAILED ] OUT_OF_PLACE.batch128 (78 ms)
[ RUN ] OUT_OF_PLACE.batch256
/home/taishan/NNPACK/test/testers/softmax.h:128: Failure
Expected: (maxError) < (errorLimit()), actual: 0.381074 vs 1e-05
[ FAILED ] OUT_OF_PLACE.batch256 (102 ms)
[----------] 6 tests from OUT_OF_PLACE (204 ms total)

[----------] 6 tests from IN_PLACE
[ RUN ] IN_PLACE.batch1
/home/taishan/NNPACK/test/testers/softmax.h:157: Failure
Expected: (maxError) < (errorLimit()), actual: 0.368789 vs 1e-05
[ FAILED ] IN_PLACE.batch1 (0 ms)
[ RUN ] IN_PLACE.batch2
/home/taishan/NNPACK/test/testers/softmax.h:157: Failure
Expected: (maxError) < (errorLimit()), actual: 0.368311 vs 1e-05
[ FAILED ] IN_PLACE.batch2 (0 ms)
[ RUN ] IN_PLACE.batch16
/home/taishan/NNPACK/test/testers/softmax.h:157: Failure
Expected: (maxError) < (errorLimit()), actual: 0.374365 vs 1e-05
[ FAILED ] IN_PLACE.batch16 (5 ms)
[ RUN ] IN_PLACE.batch64
/home/taishan/NNPACK/test/testers/softmax.h:157: Failure
Expected: (maxError) < (errorLimit()), actual: 0.381611 vs 1e-05
[ FAILED ] IN_PLACE.batch64 (16 ms)
[ RUN ] IN_PLACE.batch128
/home/taishan/NNPACK/test/testers/softmax.h:157: Failure
Expected: (maxError) < (errorLimit()), actual: 0.386007 vs 1e-05
[ FAILED ] IN_PLACE.batch128 (93 ms)
[ RUN ] IN_PLACE.batch256
/home/taishan/NNPACK/test/testers/softmax.h:157: Failure
Expected: (maxError) < (errorLimit()), actual: 0.385632 vs 1e-05
[ FAILED ] IN_PLACE.batch256 (98 ms)
[----------] 6 tests from IN_PLACE (212 ms total)

[----------] Global test environment tear-down
[==========] 12 tests from 2 test cases ran. (416 ms total)
[ PASSED ] 0 tests.
[ FAILED ] 12 tests, listed below:
[ FAILED ] OUT_OF_PLACE.batch1
[ FAILED ] OUT_OF_PLACE.batch2
[ FAILED ] OUT_OF_PLACE.batch16
[ FAILED ] OUT_OF_PLACE.batch64
[ FAILED ] OUT_OF_PLACE.batch128
[ FAILED ] OUT_OF_PLACE.batch256
[ FAILED ] IN_PLACE.batch1
[ FAILED ] IN_PLACE.batch2
[ FAILED ] IN_PLACE.batch16
[ FAILED ] IN_PLACE.batch64
[ FAILED ] IN_PLACE.batch128
[ FAILED ] IN_PLACE.batch256

12 FAILED TESTS

Danliran commented 4 years ago

If we compile the softmax function with the -O2 or -O3 option, the test cases fail. I think the root cause is out-of-order execution, but I don't know at which step a memory barrier or a data/instruction synchronization call should be added.

Maratyszcza commented 4 years ago

It is likely because of -ffast-math: https://github.com/Maratyszcza/NNPACK/blob/master/CMakeLists.txt#L467

Danliran commented 4 years ago

Yes, you are right; I found it. May I submit a PR to fix this issue on the ARM platform? @Maratyszcza

From the GCC documentation (http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html):

-ffast-math

Sets the options -fno-math-errno, -funsafe-math-optimizations, -ffinite-math-only, -fno-rounding-math, -fno-signaling-nans, -fcx-limited-range and -fexcess-precision=fast.

This option causes the preprocessor macro __FAST_MATH__ to be defined.

This option is not turned on by any -O option besides -Ofast since it can result in incorrect output for programs that depend on an exact implementation of IEEE or ISO rules/specifications for math functions. It may, however, yield faster code for programs that do not require the guarantees of these specifications.
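
As a generic illustration of code that "depends on an exact implementation of IEEE or ISO rules", here is a minimal sketch using compensated (Kahan) summation. It is not taken from NNPACK and is not claimed to be the cause of the failures in this issue. All function names are hypothetical. The correction term in kahan_sum is algebraically zero, so a compiler allowed to reassociate floating-point operations (as -funsafe-math-optimizations permits) may simplify it away and reduce the routine to a plain sum.

```c
#include <stddef.h>

/* Returns 1 if this translation unit was compiled with -ffast-math
 * (GCC and Clang define __FAST_MATH__ in that case), 0 otherwise. */
int built_with_fast_math(void) {
#ifdef __FAST_MATH__
    return 1;
#else
    return 0;
#endif
}

/* Plain accumulation: terms much smaller than the running sum are lost. */
float naive_sum(float start, float term, size_t count) {
    float sum = start;
    for (size_t i = 0; i < count; i++) {
        sum += term;
    }
    return sum;
}

/* Kahan summation: c tracks the rounding error of each addition. */
float kahan_sum(float start, float term, size_t count) {
    float sum = start;
    float c = 0.0f;
    for (size_t i = 0; i < count; i++) {
        float y = term - c;
        float t = sum + y;
        c = (t - sum) - y; /* zero in exact arithmetic, nonzero under IEEE rounding */
        sum = t;
    }
    return sum;
}
```

Under strict IEEE single-precision evaluation, adding 1e-8f to 1.0f one million times leaves naive_sum at exactly 1.0f, while kahan_sum returns a value close to 1.01. With -ffast-math that difference is no longer guaranteed.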

Danliran commented 4 years ago

Fixed this issue. PR: https://github.com/Maratyszcza/NNPACK/pull/185 @Maratyszcza