[To Evaluate] accuracy issue in fine-grained test

daisyden commented 5 months ago

TBD:

[] test_compare_cpu: tanh with complex dtypes has an accuracy gap. This is a SYCL issue on device; Jira tickets submitted: https://jira.devtools.intel.com/browse/CMPLRLLVM-60397 and https://jira.devtools.intel.com/browse/CMPLRLIBS-34974. Will verify with 2025.0 using "PYTORCH_TEST_WITH_SLOW=1 /home/gta/miniforge3/envs/pytorch2.5/bin/pytest -v test_ops_xpu.py -k test_compare_cpu_tanh_xpu_complex64".
[x] bfloat16 accuracy gap in rsqrt, sub, rounding, cumsum, add, rsub (low priority)
[x] float16 cumsum accuracy gap
[x] pow, mul, log with complex64 return nan; CUDA has the same failure on pow and mul.

🐛 Describe the bug

I extended the fine-grained test to run test_compare_cpu() on all ops and dtypes that XPU supports. Please see branch daisyden/fin_grain. To run it:

cd torch-xpu-ops/test/xpu/fin_grain
export PYTORCH_TEST_WITH_SLOW=1
bash run_fin_grain.sh

I got the following failures in the end.

From the analysis, there appear to be several issues:

1. test_compare_cpu: tanh with complex dtypes has an accuracy gap.

2. bfloat16 accuracy gap in rsqrt, sub, rounding, cumsum, add, rsub.

3. float16 cumsum accuracy gap.

4. pow, mul, log with complex64 return nan.

5. index_put, index_add with bool.

6. rounding with float16 returns inf.

7. XPU reports "not implemented" for dtypes bool, int*, uint8, whereas CPU does not report this error.
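
For readers unfamiliar with why a low-precision cumsum can drift from a reference result, here is a minimal sketch in plain Python. It is not the XPU or CPU kernel, and whether this mechanism explains the gaps listed above has not been confirmed in this thread. The helpers `to_bfloat16`, `cumsum_round_each_step`, and `cumsum_round_at_end` are hypothetical names introduced only for illustration. The sketch simulates bfloat16 by rounding a float32 bit pattern to its top 16 bits, then compares rounding the accumulator after every addition against accumulating in higher precision and rounding only the output.

```python
import struct


def to_bfloat16(x: float) -> float:
    """Round a finite float to the nearest bfloat16 value (ties to even).

    Illustrative only: NaN, inf and overflow are not handled.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Rounding bias: 0x7FFF plus the lowest bit that will be kept.
    bits += 0x7FFF + ((bits >> 16) & 1)
    # Drop the low 16 bits; bfloat16 keeps only the top half of float32.
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def cumsum_round_each_step(values):
    """Cumulative sum whose accumulator is stored in bfloat16."""
    acc, out = 0.0, []
    for v in values:
        acc = to_bfloat16(acc + to_bfloat16(v))
        out.append(acc)
    return out


def cumsum_round_at_end(values):
    """Cumulative sum accumulated in higher precision, rounded on output."""
    acc, out = 0.0, []
    for v in values:
        acc += to_bfloat16(v)
        out.append(to_bfloat16(acc))
    return out


ones = [1.0] * 300
stepwise = cumsum_round_each_step(ones)
at_end = cumsum_round_at_end(ones)
print(stepwise[-1], at_end[-1])  # 256.0 300.0
```

With 8 significant bits, 257 is not representable in bfloat16, so the step-rounded accumulator stalls at 256 while the other variant reaches 300. Two backends that choose different accumulation precision can therefore disagree by more than the test tolerance even if both are individually reasonable.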

Versions

myenv.log

daisyden commented 5 months ago

Full test log: log.txt

chuanqi129 commented 3 months ago

@daisyden @huaiyuzh please refresh the status of this triage issue

daisyden commented 3 months ago

test_ops_xpu.py::TestCommonXPU::test_compare_cpu_div_trunc_rounding_xpu_float16 PASSED
test_ops_xpu.py::TestCommonXPU::test_compare_cpu_index_put_xpu_bool PASSED
Other issues are still there.

daisyden commented 3 months ago

For item 4, CUDA hits a similar issue in test_compare_cpu_pow_cuda_complex64.
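
As background on how a complex op can produce nan on more than one backend, the sketch below applies the textbook complex multiplication formula with no special handling of infinities. This is an illustration only, not the actual CUDA or XPU kernel, and the thread does not establish that this is the cause of the failing tests. `naive_complex_mul` is a hypothetical helper name.

```python
import math


def naive_complex_mul(a_re, a_im, b_re, b_im):
    """Textbook (a+bi)(c+di) = (ac - bd) + (ad + bc)i, without inf handling."""
    return (a_re * b_re - a_im * b_im, a_re * b_im + a_im * b_re)


# (inf + 0i) * (1 + 0i): the imaginary part evaluates inf * 0, which is nan.
re, im = naive_complex_mul(math.inf, 0.0, 1.0, 0.0)
print(re, im)  # inf nan

# nan never compares equal to itself, so an elementwise equality check
# against a reference fails unless nan is treated specially.
print(im == im)  # False
```

An implementation that special-cases infinite operands can return a different value for the same input, which is one way two devices can disagree on inputs containing inf.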