Convolution Test Failures and Skips on AArch64

oneapi-src / oneDNN

oneAPI Deep Neural Network Library (oneDNN)

https://uxlfoundation.org

Apache License 2.0


Convolution Test Failures and Skips on AArch64 #1962

Open kasturedeeksha opened 2 weeks ago

kasturedeeksha commented 2 weeks ago

When running benchdnn tests for convolution with the following command on AArch64:

./benchdnn --conv --dt=f32 --dir=FWD_D --batch=inputs/conv/test_conv_all

there are certain cases with --check-ref-impl=true that are failing because they are not supported on AArch64, while some cases are getting skipped and not showing as failed. Approximately 253 cases are failing. I think all unsupported tests should exhibit consistent behavior: either all should fail or all should be skipped.

Is this the expected behavior for unsupported cases on AArch64?
What is the criterion or difference between cases that are skipped versus those that fail?

jondea commented 2 weeks ago

I agree that the behavior should be consistent and platform independent. I think in the past, cases like this have been skipped until we have an optimized implementation that we are willing to defend. If I'm honest, I'm not entirely sure that this is the correct behavior, especially with --check-ref-impl=true.

Specifically on AArch64, support for --dir=FWD_D has been limited.

vpirogov commented 1 week ago

@kasturedeeksha, there are two main reasons why benchdnn may skip a test:

functionality is expected to be unimplemented on current platform due to hardware limitations
not enough memory to run the problem

Specific reason for particular case to be skipped is indicated in parenthesis after SKIPPED status. This is mainly done to simplify test coverage management across different platforms.

The --check-ref-impl=true option instructs benchdnn to fail a test if the library has only a reference implementation available. It exists exclusively for debug purposes. No tests from inputs/conv/test_conv_all use this option by default.
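
As a rough illustration of the distinction described above, the decision could be sketched like this in Python. This is not benchdnn's actual code: the `Case` class, the `status` function, and the reason strings are all hypothetical, and the assumption that a reference implementation's name starts with "ref" is made only for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Case:
    # Name of the implementation the library selected (hypothetical field);
    # None means the functionality is unimplemented on this platform.
    impl_name: Optional[str]
    # Whether the problem fits in available memory (hypothetical field).
    fits_in_memory: bool = True


def status(case: Case, check_ref_impl: bool = False) -> str:
    """Return a test status string for one case (illustrative only)."""
    # Skip reason 1: nothing is implemented for this case on the platform.
    if case.impl_name is None:
        return "SKIPPED (unimplemented)"
    # Skip reason 2: the problem is too large to run.
    if not case.fits_in_memory:
        return "SKIPPED (not enough memory)"
    # Debug-only check: fail when only a reference implementation exists.
    # Assumption: reference implementation names start with "ref".
    if check_ref_impl and case.impl_name.startswith("ref"):
        return "FAILED (reference implementation only)"
    return "PASSED"
```

Under these assumptions, the same case that passes by default turns into a failure once the debug check is enabled, while genuinely unimplemented cases stay skipped either way. That would account for a mix of failed and skipped cases in a single run.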

kasturedeeksha commented 1 week ago

@vpirogov @jondea Thanks for the clarification.

jondea commented 1 week ago

@vpirogov on a related note, would it be reasonable to have --skip-impl=ref for all benchdnn tests? What are you comparing against if you only have ref:any? I see it skipped in a lot of inputs, but not everywhere. For example, you could skip it everywhere by adding it to tests/benchdnn/CMakeLists.txt

vpirogov commented 1 week ago

@jondea, benchdnn has an internal 'golden reference' implementation to test oneDNN. The skip-impl option is effectively a way to avoid wasting resources on validating the reference implementation with tests intended to validate ISA specializations. Having it enabled everywhere would prevent the reference implementation from being tested, which is undesirable in general.
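
A minimal Python sketch of that idea, under stated assumptions: every implementation, including the library's own reference one, is checked against an independent golden reference, and a name-based skip filter removes matching implementations from validation. The function names, the implementation names other than ref:any, and the 1D convolution are hypothetical and chosen for illustration. None of this is benchdnn's real code.

```python
def golden_conv1d(src, weights):
    """Independent 'golden' direct convolution (stride 1, no padding)."""
    n_out = len(src) - len(weights) + 1
    return [
        sum(src[i + k] * weights[k] for k in range(len(weights)))
        for i in range(n_out)
    ]


def library_ref_conv1d(src, weights):
    """Stand-in for a library reference implementation (hypothetical)."""
    out = [0.0] * (len(src) - len(weights) + 1)
    # Different loop order from the golden version, same result.
    for k, w in enumerate(weights):
        for i in range(len(out)):
            out[i] += src[i + k] * w
    return out


def close(actual, expected, tol=1e-6):
    """Element-wise comparison with a relative tolerance."""
    return len(actual) == len(expected) and all(
        abs(a - e) <= tol * max(1.0, abs(e)) for a, e in zip(actual, expected)
    )


def validate(impls, src, weights, skip_impl=None):
    """Check each implementation against the golden reference."""
    expected = golden_conv1d(src, weights)
    results = {}
    for name, fn in impls.items():
        if skip_impl and skip_impl in name:
            # Skipped implementations are never validated.
            results[name] = "SKIPPED"
        elif close(fn(src, weights), expected):
            results[name] = "PASSED"
        else:
            results[name] = "FAILED"
    return results


# Hypothetical implementation table; "jit:example" reuses the same code here.
impls = {"ref:any": library_ref_conv1d, "jit:example": library_ref_conv1d}
src, weights = [1.0, 2.0, 3.0, 4.0], [1.0, 0.5]
print(validate(impls, src, weights))
print(validate(impls, src, weights, skip_impl="ref"))
```

With the filter set to "ref", the ref:any entry is reported as skipped and is never compared with the golden result, so a regression in it would go unnoticed.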