migraphx-benchmark / AMDMIGraphX

AMD's graph optimization engine.
https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/
MIT License

ArgMax/ArgMin inaccuracies #81

Closed attila-dusnoki-htec closed 10 months ago

attila-dusnoki-htec commented 12 months ago

Failing tests:

attila-dusnoki-htec commented 11 months ago

Related issue: https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/issues/556

attila-dusnoki-htec commented 11 months ago

Verbose logs

FAIL: test_argmax_keepdims_example_select_last_index_cpu (__main__.OnnxBackendNodeModelTest)

```
======================================================================
FAIL: test_argmax_keepdims_example_select_last_index_cpu (__main__.OnnxBackendNodeModelTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/onnx/backend/test/runner/__init__.py", line 290, in device_test_func
    return test_func(*args, device=device, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/onnx/backend/test/runner/__init__.py", line 467, in run
    self.assert_similar_outputs(
  File "../test/py/onnx_backend_test.py", line 59, in assert_similar_outputs
    np.testing.assert_allclose(ref_outputs[i],
  File "/usr/local/lib/python3.8/dist-packages/numpy/testing/_private/utils.py", line 1530, in assert_allclose
    assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
  File "/usr/local/lib/python3.8/dist-packages/numpy/testing/_private/utils.py", line 844, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=0.001, atol=1e-05
Program =
module: "main"
data = @param:data -> float_type, {2, 2}, {2, 1}, target_id=0
@1 = argmax[axis=1](data) -> int64_type, {2, 1}, {1, 1}, target_id=0
@2 = @return(@1), target_id=0

Compiled program =
module: "main"
@0 = check_context::migraphx::gpu::context -> float_type, {}, {}, target_id=0
@1 = hip::hip_allocate_memory[shape=int8_type, {32}, {1},id=main:scratch] -> int8_type, {32}, {1}, target_id=0
@2 = load[offset=16,end=32](@1) -> float_type, {2, 2}, {2, 1}, target_id=0
data = @param:data -> float_type, {2, 2}, {2, 1}, target_id=0
@4 = hip::copy_to_gpu(data,@2) -> float_type, {2, 2}, {2, 1}, target_id=0
@5 = load[offset=0,end=16](@1) -> int64_type, {2, 1}, {1, 1}, target_id=0
@6 = gpu::argmax[axis=1](@4,@5) -> int64_type, {2, 1}, {1, 1}, target_id=0
@7 = hip::copy_from_gpu(@6) -> int64_type, {2, 1}, {1, 1}, target_id=0
@8 = hip::sync_stream(@7) -> int64_type, {2, 1}, {1, 1}, target_id=0
@9 = @return(@8), target_id=0

Mismatched elements: 1 / 2 (50%)
Max absolute difference: 1
Max relative difference: 0.
 x: array([[1], [1]])
 y: array([[0], [1]], dtype=int64)
```

FAIL: test_argmax_negative_axis_keepdims_example_select_last_index_cpu (__main__.OnnxBackendNodeModelTest)

```
======================================================================
FAIL: test_argmax_negative_axis_keepdims_example_select_last_index_cpu (__main__.OnnxBackendNodeModelTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/onnx/backend/test/runner/__init__.py", line 290, in device_test_func
    return test_func(*args, device=device, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/onnx/backend/test/runner/__init__.py", line 467, in run
    self.assert_similar_outputs(
  File "../test/py/onnx_backend_test.py", line 59, in assert_similar_outputs
    np.testing.assert_allclose(ref_outputs[i],
  File "/usr/local/lib/python3.8/dist-packages/numpy/testing/_private/utils.py", line 1530, in assert_allclose
    assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
  File "/usr/local/lib/python3.8/dist-packages/numpy/testing/_private/utils.py", line 844, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=0.001, atol=1e-05
Program =
module: "main"
data = @param:data -> float_type, {2, 2}, {2, 1}, target_id=0
@1 = argmax[axis=-1](data) -> int64_type, {2, 1}, {1, 1}, target_id=0
@2 = @return(@1), target_id=0

Compiled program =
module: "main"
@0 = check_context::migraphx::gpu::context -> float_type, {}, {}, target_id=0
@1 = hip::hip_allocate_memory[shape=int8_type, {32}, {1},id=main:scratch] -> int8_type, {32}, {1}, target_id=0
@2 = load[offset=16,end=32](@1) -> float_type, {2, 2}, {2, 1}, target_id=0
data = @param:data -> float_type, {2, 2}, {2, 1}, target_id=0
@4 = hip::copy_to_gpu(data,@2) -> float_type, {2, 2}, {2, 1}, target_id=0
@5 = load[offset=0,end=16](@1) -> int64_type, {2, 1}, {1, 1}, target_id=0
@6 = gpu::argmax[axis=1](@4,@5) -> int64_type, {2, 1}, {1, 1}, target_id=0
@7 = hip::copy_from_gpu(@6) -> int64_type, {2, 1}, {1, 1}, target_id=0
@8 = hip::sync_stream(@7) -> int64_type, {2, 1}, {1, 1}, target_id=0
@9 = @return(@8), target_id=0

Mismatched elements: 1 / 2 (50%)
Max absolute difference: 1
Max relative difference: 0.
 x: array([[1], [1]])
 y: array([[0], [1]], dtype=int64)
```

FAIL: test_argmax_no_keepdims_example_select_last_index_cpu (__main__.OnnxBackendNodeModelTest)

```
======================================================================
FAIL: test_argmax_no_keepdims_example_select_last_index_cpu (__main__.OnnxBackendNodeModelTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/onnx/backend/test/runner/__init__.py", line 290, in device_test_func
    return test_func(*args, device=device, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/onnx/backend/test/runner/__init__.py", line 467, in run
    self.assert_similar_outputs(
  File "../test/py/onnx_backend_test.py", line 59, in assert_similar_outputs
    np.testing.assert_allclose(ref_outputs[i],
  File "/usr/local/lib/python3.8/dist-packages/numpy/testing/_private/utils.py", line 1530, in assert_allclose
    assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
  File "/usr/local/lib/python3.8/dist-packages/numpy/testing/_private/utils.py", line 844, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=0.001, atol=1e-05
Program =
module: "main"
data = @param:data -> float_type, {2, 2}, {2, 1}, target_id=0
@1 = argmax[axis=1](data) -> int64_type, {2, 1}, {1, 1}, target_id=0
@2 = squeeze[axes={1}](@1) -> int64_type, {2}, {1}, target_id=0
@3 = @return(@2), target_id=0

Compiled program =
module: "main"
@0 = check_context::migraphx::gpu::context -> float_type, {}, {}, target_id=0
@1 = hip::hip_allocate_memory[shape=int8_type, {32}, {1},id=main:scratch] -> int8_type, {32}, {1}, target_id=0
data = @param:data -> float_type, {2, 2}, {2, 1}, target_id=0
@3 = load[offset=16,end=32](@1) -> float_type, {2, 2}, {2, 1}, target_id=0
@4 = hip::copy_to_gpu(data,@3) -> float_type, {2, 2}, {2, 1}, target_id=0
@5 = load[offset=0,end=16](@1) -> int64_type, {2, 1}, {1, 1}, target_id=0
@6 = gpu::argmax[axis=1](@4,@5) -> int64_type, {2, 1}, {1, 1}, target_id=0
@7 = load[offset=16,end=32](@1) -> int64_type, {2}, {1}, target_id=0
@8 = reshape[dims={2}](@6) -> int64_type, {2}, {1}, target_id=0
@9 = gpu::code_object[code_object=9552,symbol_name=convert_convert_kernel,global=1024,local=1024,](@8,@7) -> int64_type, {2}, {1}, target_id=0
@10 = hip::copy_from_gpu(@9) -> int64_type, {2}, {1}, target_id=0
@11 = hip::sync_stream(@10) -> int64_type, {2}, {1}, target_id=0
@12 = @return(@11), target_id=0

Mismatched elements: 1 / 2 (50%)
Max absolute difference: 1
Max relative difference: 0.
 x: array([1, 1])
 y: array([0, 1], dtype=int64)
```

FAIL: test_argmin_keepdims_example_select_last_index_cpu (__main__.OnnxBackendNodeModelTest)

```
======================================================================
FAIL: test_argmin_keepdims_example_select_last_index_cpu (__main__.OnnxBackendNodeModelTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/onnx/backend/test/runner/__init__.py", line 290, in device_test_func
    return test_func(*args, device=device, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/onnx/backend/test/runner/__init__.py", line 467, in run
    self.assert_similar_outputs(
  File "../test/py/onnx_backend_test.py", line 59, in assert_similar_outputs
    np.testing.assert_allclose(ref_outputs[i],
  File "/usr/local/lib/python3.8/dist-packages/numpy/testing/_private/utils.py", line 1530, in assert_allclose
    assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
  File "/usr/local/lib/python3.8/dist-packages/numpy/testing/_private/utils.py", line 844, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=0.001, atol=1e-05
Program =
module: "main"
data = @param:data -> float_type, {2, 2}, {2, 1}, target_id=0
@1 = argmin[axis=1](data) -> int64_type, {2, 1}, {1, 1}, target_id=0
@2 = @return(@1), target_id=0

Compiled program =
module: "main"
@0 = check_context::migraphx::gpu::context -> float_type, {}, {}, target_id=0
@1 = hip::hip_allocate_memory[shape=int8_type, {32}, {1},id=main:scratch] -> int8_type, {32}, {1}, target_id=0
data = @param:data -> float_type, {2, 2}, {2, 1}, target_id=0
@3 = load[offset=16,end=32](@1) -> float_type, {2, 2}, {2, 1}, target_id=0
@4 = hip::copy_to_gpu(data,@3) -> float_type, {2, 2}, {2, 1}, target_id=0
@5 = load[offset=0,end=16](@1) -> int64_type, {2, 1}, {1, 1}, target_id=0
@6 = gpu::argmin[axis=1](@4,@5) -> int64_type, {2, 1}, {1, 1}, target_id=0
@7 = hip::copy_from_gpu(@6) -> int64_type, {2, 1}, {1, 1}, target_id=0
@8 = hip::sync_stream(@7) -> int64_type, {2, 1}, {1, 1}, target_id=0
@9 = @return(@8), target_id=0

Mismatched elements: 1 / 2 (50%)
Max absolute difference: 1
Max relative difference: inf
 x: array([[1], [0]])
 y: array([[0], [0]], dtype=int64)
```

FAIL: test_argmin_negative_axis_keepdims_example_select_last_index_cpu (__main__.OnnxBackendNodeModelTest)

```
======================================================================
FAIL: test_argmin_negative_axis_keepdims_example_select_last_index_cpu (__main__.OnnxBackendNodeModelTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/onnx/backend/test/runner/__init__.py", line 290, in device_test_func
    return test_func(*args, device=device, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/onnx/backend/test/runner/__init__.py", line 467, in run
    self.assert_similar_outputs(
  File "../test/py/onnx_backend_test.py", line 59, in assert_similar_outputs
    np.testing.assert_allclose(ref_outputs[i],
  File "/usr/local/lib/python3.8/dist-packages/numpy/testing/_private/utils.py", line 1530, in assert_allclose
    assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
  File "/usr/local/lib/python3.8/dist-packages/numpy/testing/_private/utils.py", line 844, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=0.001, atol=1e-05
Program =
module: "main"
data = @param:data -> float_type, {2, 2}, {2, 1}, target_id=0
@1 = argmin[axis=-1](data) -> int64_type, {2, 1}, {1, 1}, target_id=0
@2 = @return(@1), target_id=0

Compiled program =
module: "main"
@0 = check_context::migraphx::gpu::context -> float_type, {}, {}, target_id=0
@1 = hip::hip_allocate_memory[shape=int8_type, {32}, {1},id=main:scratch] -> int8_type, {32}, {1}, target_id=0
data = @param:data -> float_type, {2, 2}, {2, 1}, target_id=0
@3 = load[offset=16,end=32](@1) -> float_type, {2, 2}, {2, 1}, target_id=0
@4 = hip::copy_to_gpu(data,@3) -> float_type, {2, 2}, {2, 1}, target_id=0
@5 = load[offset=0,end=16](@1) -> int64_type, {2, 1}, {1, 1}, target_id=0
@6 = gpu::argmin[axis=1](@4,@5) -> int64_type, {2, 1}, {1, 1}, target_id=0
@7 = hip::copy_from_gpu(@6) -> int64_type, {2, 1}, {1, 1}, target_id=0
@8 = hip::sync_stream(@7) -> int64_type, {2, 1}, {1, 1}, target_id=0
@9 = @return(@8), target_id=0

Mismatched elements: 1 / 2 (50%)
Max absolute difference: 1
Max relative difference: inf
 x: array([[1], [0]])
 y: array([[0], [0]], dtype=int64)
```

FAIL: test_argmin_no_keepdims_example_select_last_index_cpu (__main__.OnnxBackendNodeModelTest)

```
======================================================================
FAIL: test_argmin_no_keepdims_example_select_last_index_cpu (__main__.OnnxBackendNodeModelTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/onnx/backend/test/runner/__init__.py", line 290, in device_test_func
    return test_func(*args, device=device, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/onnx/backend/test/runner/__init__.py", line 467, in run
    self.assert_similar_outputs(
  File "../test/py/onnx_backend_test.py", line 59, in assert_similar_outputs
    np.testing.assert_allclose(ref_outputs[i],
  File "/usr/local/lib/python3.8/dist-packages/numpy/testing/_private/utils.py", line 1530, in assert_allclose
    assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
  File "/usr/local/lib/python3.8/dist-packages/numpy/testing/_private/utils.py", line 844, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=0.001, atol=1e-05
Program =
module: "main"
data = @param:data -> float_type, {2, 2}, {2, 1}, target_id=0
@1 = argmin[axis=1](data) -> int64_type, {2, 1}, {1, 1}, target_id=0
@2 = squeeze[axes={1}](@1) -> int64_type, {2}, {1}, target_id=0
@3 = @return(@2), target_id=0

Compiled program =
module: "main"
@0 = check_context::migraphx::gpu::context -> float_type, {}, {}, target_id=0
@1 = hip::hip_allocate_memory[shape=int8_type, {32}, {1},id=main:scratch] -> int8_type, {32}, {1}, target_id=0
data = @param:data -> float_type, {2, 2}, {2, 1}, target_id=0
@3 = load[offset=16,end=32](@1) -> float_type, {2, 2}, {2, 1}, target_id=0
@4 = hip::copy_to_gpu(data,@3) -> float_type, {2, 2}, {2, 1}, target_id=0
@5 = load[offset=0,end=16](@1) -> int64_type, {2, 1}, {1, 1}, target_id=0
@6 = gpu::argmin[axis=1](@4,@5) -> int64_type, {2, 1}, {1, 1}, target_id=0
@7 = reshape[dims={2}](@6) -> int64_type, {2}, {1}, target_id=0
@8 = load[offset=16,end=32](@1) -> int64_type, {2}, {1}, target_id=0
@9 = gpu::code_object[code_object=9552,symbol_name=convert_convert_kernel,global=1024,local=1024,](@7,@8) -> int64_type, {2}, {1}, target_id=0
@10 = hip::copy_from_gpu(@9) -> int64_type, {2}, {1}, target_id=0
@11 = hip::sync_stream(@10) -> int64_type, {2}, {1}, target_id=0
@12 = @return(@11), target_id=0

Mismatched elements: 1 / 2 (50%)
Max absolute difference: 1
Max relative difference: inf
 x: array([1, 0])
 y: array([0, 0], dtype=int64)
```

attila-dusnoki-htec commented 11 months ago

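https://github.com/onnx/onnx/blob/main/docs/Operators.md#ArgMax Support for the `select_last_index` attribute is missing: when the maximum (or minimum) occurs more than once along the axis, the tests expect the index of the last occurrence, but the first one is returned.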

The place that needs updating: https://github.com/migraphx-benchmark/AMDMIGraphX/blob/develop/src/onnx/parse_arg_op.cpp

Check the tests to see what is expected: https://github.com/onnx/onnx/blob/main/docs/TestCoverage.md#argmax

The test source: https://github.com/onnx/onnx/blob/main/onnx/backend/test/case/node/argmax.py
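
For reference, a minimal NumPy sketch of what `select_last_index=1` is expected to do. This is an illustration only, not the MIGraphX fix: the helper name `arg_select_last_index` is made up here, and the input array is an assumption chosen so that the results match the `x` (reference) and `y` (MIGraphX) arrays in the logs above. The trick is to flip the data along the axis, take the ordinary first-occurrence index, and map it back to the original orientation.

```python
import numpy as np


def arg_select_last_index(data, axis=0, keepdims=True, op=np.argmax):
    """Hypothetical helper: index of the LAST max/min occurrence along `axis`.

    `op` is np.argmax or np.argmin; both return the first occurrence,
    so the data is flipped first and the index is mapped back afterwards.
    """
    flipped = np.flip(data, axis)
    first_in_flipped = op(flipped, axis=axis)
    # Position k in the flipped array corresponds to n - 1 - k in the original.
    result = data.shape[axis] - 1 - first_in_flipped
    if keepdims:
        result = np.expand_dims(result, axis)
    return result.astype(np.int64)


# Assumed input: row 0 has a tie, which is where first/last index differ.
data = np.array([[2, 2], [3, 10]], dtype=np.float32)

# Expected by the tests (select_last_index=1).
argmax_last = arg_select_last_index(data, axis=1, op=np.argmax)
argmin_last = arg_select_last_index(data, axis=1, op=np.argmin)

# First-occurrence behaviour, i.e. what the attribute-less op computes.
argmax_first = np.expand_dims(np.argmax(data, axis=1), 1)
argmin_first = np.expand_dims(np.argmin(data, axis=1), 1)

print(argmax_last.tolist(), argmax_first.tolist())
print(argmin_last.tolist(), argmin_first.tolist())
```

With this input the last-index results are `[[1], [1]]` for argmax and `[[1], [0]]` for argmin, while the first-index results are `[[0], [1]]` and `[[0], [0]]`, which is the same 1 / 2 mismatch reported in each failing test.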