oneapi-src / oneMKL

oneAPI Math Kernel Library (oneMKL) Interfaces
Apache License 2.0

[ROCFFT] RocFFT fails tests when using ROCm 6.0 or later #559

Open hjabird opened 3 weeks ago

hjabird commented 3 weeks ago

Summary

The current tip of oneMKL interfaces fails unit tests when using ROCm 6.0 or later. Using ROCm 5.4.3 does not fail tests.

The failed tests are real-to-complex multi-dimensional tests.

Version

The issue was introduced with https://github.com/oneapi-src/oneMKL/pull/528 and can be reproduced with the oneMKL Interfaces develop branch.

Environment

Steps to reproduce

Build with RocFFT enabled. Test with

./bin/test_main_dft_ct --gtest_filter='*REAL_SINGLE_in_place_USM*batches_1*'

to observe failures.

Observed behavior

Tests give wrong results or memory faults.

Expected behavior

Tests should pass.

hjabird commented 3 weeks ago

This turned out to be a rocFFT bug; see rocFFT issue https://github.com/ROCm/rocFFT/issues/504

hjabird commented 3 weeks ago

There is a work-around for the above at https://github.com/hjabird/oneMKL/tree/hjab/fix_rocfft6_issue

Unfortunately there are still failing tests: out-of-place complex 4x4x4_fwd_strides_2_4_1_16_bwd_strides_1_4_16_1_batches_2. These tests pass with ROCm 5.4.3, but fail with ROCm 5.7.1 and later.
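To make the stride notation in the test name concrete, here is a minimal sketch of how such strides map a 3D index to a linear element offset. It assumes the four numbers follow the oneMKL DFT convention of {offset, stride_0, stride_1, stride_2} and that the test name encodes them in that order. `linear_index` is a hypothetical helper, not part of oneMKL or its test suite.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: linear element offset of the 3D index (i, j, k).
// Assumption: strides are laid out as {offset, s_i, s_j, s_k}.
inline std::int64_t linear_index(const std::array<std::int64_t, 4>& strides,
                                 std::int64_t i, std::int64_t j, std::int64_t k) {
    assert(i >= 0 && j >= 0 && k >= 0);
    return strides[0] + i * strides[1] + j * strides[2] + k * strides[3];
}
```

Under that assumption, with the forward strides {2, 4, 1, 16} the index (0, 0, 1) lands at element 18, while with the backward strides {1, 4, 16, 1} it lands at element 2, so the fastest-varying dimension differs between the two domains.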

hjabird commented 3 weeks ago

This looks like a bug in rocFFT to me. I've described the issue at https://github.com/ROCm/rocFFT/issues/507. I think oneMKL Interfaces will have to throw unsupported for these transposing DFTs.
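A rough sketch of what such a guard could look like is below. It is not the actual oneMKL Interfaces code: the helper names are hypothetical, `std::runtime_error` stands in for whatever exception type the backend would really throw, and "transposing" is assumed here to mean that the dimensions sort into a different order by stride in the forward and backward domains.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical helper: dimension indices sorted by ascending stride.
// Assumption: strides are laid out as {offset, s_0, s_1, ...}.
inline std::vector<std::size_t> dimension_order(const std::vector<std::int64_t>& strides) {
    if (strides.empty()) {
        return {};
    }
    std::vector<std::size_t> order(strides.size() - 1);
    std::iota(order.begin(), order.end(), std::size_t{0});
    // Skip the leading offset entry when comparing strides.
    std::stable_sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
        return strides[a + 1] < strides[b + 1];
    });
    return order;
}

// Assumed definition: a DFT is "transposing" when the stride ordering of the
// dimensions differs between the forward and backward domains.
inline bool is_transposing(const std::vector<std::int64_t>& fwd_strides,
                           const std::vector<std::int64_t>& bwd_strides) {
    assert(fwd_strides.size() == bwd_strides.size());
    return dimension_order(fwd_strides) != dimension_order(bwd_strides);
}

// Hypothetical guard; std::runtime_error is a placeholder exception type.
inline void reject_transposing(const std::vector<std::int64_t>& fwd_strides,
                               const std::vector<std::int64_t>& bwd_strides) {
    if (is_transposing(fwd_strides, bwd_strides)) {
        throw std::runtime_error("transposing DFTs are unsupported with this rocFFT version");
    }
}
```

For the failing test, reading its strides as {2, 4, 1, 16} and {1, 4, 16, 1}, the dimension orders come out as (1, 0, 2) and (2, 0, 1), so this check would reject the configuration, while identical forward and backward strides would pass through.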