intel / intel-xpu-backend-for-triton

OpenAI Triton backend for Intel® GPUs
MIT License

Skipped tests for rolling stable release 821 #797

Closed. pbchekin closed this issue 6 months ago.

pbchekin commented 7 months ago

With the new rolling stable release (821), we have to skip 12 tests (3 `test_dot` cases in language/test_core.py and 9 `test_op` cases in operators/test_matmul.py) and 1 tutorial (08-grouped-gemm).

2024-04-02T21:06:54.9579160Z FAILED language/test_core.py::test_dot[1-64-64-64-4-False-False-add-rows-ieee-float16-float32] - AssertionError: 
2024-04-02T21:06:54.9579862Z Not equal to tolerance rtol=0.01, atol=0.001
2024-04-02T21:06:54.9580111Z 
2024-04-02T21:06:54.9580226Z Mismatched elements: 109 / 4096 (2.66%)
2024-04-02T21:06:54.9580552Z Max absolute difference: 0.1367
2024-04-02T21:06:54.9580842Z Max relative difference: 0.112
2024-04-02T21:06:54.9581213Z  x: array([[0.9087, 1.035 , 0.8047, ..., 1.025 , 0.6855, 1.1   ],
2024-04-02T21:06:54.9581663Z        [0.8516, 1.134 , 0.868 , ..., 1.048 , 0.838 , 1.057 ],
2024-04-02T21:06:54.9582084Z        [1.125 , 1.065 , 1.172 , ..., 1.177 , 1.087 , 1.19  ],...
2024-04-02T21:06:54.9582536Z  y: array([[0.9087, 1.035 , 0.8047, ..., 1.025 , 0.6855, 1.101 ],
2024-04-02T21:06:54.9582972Z        [0.8516, 1.134 , 0.8677, ..., 1.048 , 0.838 , 1.057 ],
2024-04-02T21:06:54.9583380Z        [1.125 , 1.065 , 1.172 , ..., 1.177 , 1.087 , 1.19  ],...
2024-04-02T21:06:54.9584206Z FAILED language/test_core.py::test_dot[1-64-64-64-4-False-False-add-cols-ieee-float16-float32] - AssertionError: 
2024-04-02T21:06:54.9584877Z Not equal to tolerance rtol=0.01, atol=0.001
2024-04-02T21:06:54.9585117Z 
2024-04-02T21:06:54.9585235Z Mismatched elements: 109 / 4096 (2.66%)
2024-04-02T21:06:54.9585551Z Max absolute difference: 0.1367
2024-04-02T21:06:54.9585845Z Max relative difference: 0.12
2024-04-02T21:06:54.9586209Z  x: array([[0.9087, 1.279 , 1.02  , ..., 1.072 , 0.853 , 1.265 ],
2024-04-02T21:06:54.9586642Z        [0.8047, 1.331 , 1.036 , ..., 1.048 , 0.9585, 1.174 ],
2024-04-02T21:06:54.9587071Z        [0.9136, 1.098 , 1.176 , ..., 1.012 , 1.043 , 1.144 ],...
2024-04-02T21:06:54.9587518Z  y: array([[0.9087, 1.279 , 1.0205, ..., 1.072 , 0.853 , 1.265 ],
2024-04-02T21:06:54.9587937Z        [0.8047, 1.331 , 1.036 , ..., 1.048 , 0.9585, 1.174 ],
2024-04-02T21:06:54.9588387Z        [0.9136, 1.098 , 1.176 , ..., 1.012 , 1.043 , 1.144 ],...
2024-04-02T21:06:54.9589149Z FAILED language/test_core.py::test_dot[1-64-64-64-4-False-False-none-ieee-float16-float32] - AssertionError: 
2024-04-02T21:06:54.9589789Z Not equal to tolerance rtol=0.01, atol=0.001
2024-04-02T21:06:54.9590039Z 
2024-04-02T21:06:54.9590156Z Mismatched elements: 124 / 4096 (3.03%)
2024-04-02T21:06:54.9590480Z Max absolute difference: 0.1373
2024-04-02T21:06:54.9590768Z Max relative difference: 122.75
2024-04-02T21:06:54.9591212Z  x: array([[-0.02618 ,  0.1003  , -0.1304  , ...,  0.09045 , -0.2494  ,
2024-04-02T21:06:54.9591600Z          0.165   ],
2024-04-02T21:06:54.9591976Z        [-0.1304  ,  0.1521  , -0.114   , ...,  0.06573 , -0.144   ,...
2024-04-02T21:06:54.9592500Z  y: array([[-0.02618 ,  0.1003  , -0.1304  , ...,  0.09045 , -0.2494  ,
2024-04-02T21:06:54.9592884Z          0.165   ],
2024-04-02T21:06:54.9593248Z        [-0.1304  ,  0.1521  , -0.114   , ...,  0.06573 , -0.144   ,...
FAILED operators/test_matmul.py::test_op[128-128-32-1-4-2-256-384-160-False-True-float16-float16-None-True-None-None] - AssertionError: Tensor-likes are not close!

Mismatched elements: 92530 / 98304 (94.1%)
Greatest absolute difference: 3.525390625 at index (8, 250) (up to 1e-05 allowed)
Greatest relative difference: inf at index (8, 0) (up to 0.001 allowed)
FAILED operators/test_matmul.py::test_op[128-128-32-1-4-2-256-384-160-False-True-bfloat16-bfloat16-None-True-None-None] - AssertionError: Tensor-likes are not close!

Mismatched elements: 89136 / 98304 (90.7%)
Greatest absolute difference: 8.239728901483491e+31 at index (143, 380) (up to 1e-05 allowed)
Greatest relative difference: inf at index (8, 0) (up to 0.016 allowed)
FAILED operators/test_matmul.py::test_op[128-128-32-1-4-4-256-256-160-False-True-bfloat16-bfloat16-None-True-None-None] - AssertionError: Tensor-likes are not close!

Mismatched elements: 59281 / 65536 (90.5%)
Greatest absolute difference: 3.546875 at index (8, 54) (up to 1e-05 allowed)
Greatest relative difference: 1.4621507953634075e+38 at index (136, 54) (up to 0.016 allowed)
FAILED operators/test_matmul.py::test_op[128-128-32-1-4-4-256-256-160-False-True-float16-float16-None-True-None-None] - AssertionError: Tensor-likes are not close!

Mismatched elements: 61672 / 65536 (94.1%)
Greatest absolute difference: 3.564453125 at index (8, 54) (up to 1e-05 allowed)
Greatest relative difference: inf at index (136, 0) (up to 0.001 allowed)
FAILED operators/test_matmul.py::test_op[128-256-32-1-8-2-None-None-None-False-True-float16-float8e5-None-True-None-None0] - AssertionError: Tensor-likes are not close!

Mismatched elements: 4926 / 32768 (15.0%)
Greatest absolute difference: 0.003871917724609375 at index (27, 31) (up to 1e-05 allowed)
Greatest relative difference: inf at index (0, 13) (up to 0.001 allowed)
FAILED operators/test_matmul.py::test_op[128-256-32-1-8-2-None-None-None-False-True-float16-int8-None-True-None-None] - AssertionError: Tensor-likes are not close!

Mismatched elements: 5350 / 32768 (16.3%)
Greatest absolute difference: 309.0 at index (27, 112) (up to 1e-05 allowed)
Greatest relative difference: inf at index (16, 0) (up to 0.001 allowed)
FAILED operators/test_matmul.py::test_op[128-256-32-1-8-2-None-None-None-False-True-float16-float8e5-None-False-None-None] - AssertionError: Tensor-likes are not close!

Mismatched elements: 4926 / 32768 (15.0%)
Greatest absolute difference: 0.003871917724609375 at index (27, 31) (up to 1e-05 allowed)
Greatest relative difference: inf at index (0, 13) (up to 0.001 allowed)
FAILED operators/test_matmul.py::test_op[128-256-32-1-8-2-None-None-None-False-True-float16-int8-None-False-None-None] - AssertionError: Tensor-likes are not close!

Mismatched elements: 5350 / 32768 (16.3%)
Greatest absolute difference: 309.0 at index (27, 112) (up to 1e-05 allowed)
Greatest relative difference: inf at index (16, 0) (up to 0.001 allowed)
FAILED operators/test_matmul.py::test_op[128-256-32-1-8-2-None-None-None-False-True-float16-float8e5-None-True-None-None1] - AssertionError: Tensor-likes are not close!

Mismatched elements: 4966 / 32768 (15.2%)
Greatest absolute difference: nan at index (16, 18) (up to 1e-05 allowed)
Greatest relative difference: nan at index (16, 18) (up to 0.001 allowed)
Traceback (most recent call last):
  File "/runner/_work/intel-xpu-backend-for-triton/intel-xpu-backend-for-triton/python/tutorials/08-grouped-gemm.py", line 211, in <module>
    assert torch.allclose(ref_out[i], tri_out[i], atol=1e-4, rtol=1e-3)
AssertionError
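
For readers decoding the numbers in the log above: both `numpy.testing.assert_allclose` and `torch.testing.assert_close` accept an element when `|actual - expected| <= atol + rtol * |expected|`. The sketch below applies that criterion to made-up values (not taken from the failing tests) and shows why the relative difference can be very large or `inf` when the expected value is at or near zero. The helper name `mismatch_report` is hypothetical, and the summary it returns only approximates the libraries' own reports.

```python
import numpy as np


def mismatch_report(actual, expected, rtol, atol):
    """Summarise an element-wise closeness check (hypothetical helper)."""
    actual = np.asarray(actual, dtype=np.float64)
    expected = np.asarray(expected, dtype=np.float64)
    abs_diff = np.abs(actual - expected)
    # An element mismatches when it exceeds the combined tolerance.
    mismatched = abs_diff > atol + rtol * np.abs(expected)
    # Dividing by an expected value of zero yields inf; silence the warning.
    with np.errstate(divide="ignore", invalid="ignore"):
        rel_diff = abs_diff / np.abs(expected)
    return {
        "mismatched": int(mismatched.sum()),
        "total": int(mismatched.size),
        "max_abs": float(abs_diff.max()),
        "max_rel": float(rel_diff.max()),
    }


# Made-up data: the last two expected values are tiny or zero.
expected = np.array([1.0, 0.5, 0.001, 0.0])
actual = np.array([1.005, 0.5, 0.126, 0.25])
report = mismatch_report(actual, expected, rtol=0.01, atol=0.001)
print(report)
```

With these inputs the first two elements fall within tolerance, while the last two do not; the zero expected value drives the maximum relative difference to `inf` even though the absolute difference is only 0.25.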
whitneywhtsang commented 7 months ago

Tested on agama 821.32: the 3 test_dot tests continue to fail, but the 9 test_matmul tests and tutorial 08-grouped-gemm now pass.

whitneywhtsang commented 7 months ago

All skipped tests are now re-enabled, but I am keeping the issue open to track testing with the new rolling driver.

whitneywhtsang commented 6 months ago

Verified that the listed tests all pass with 821.35.
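
The driver versions reported in this thread can be summarised as a small lookup, sketched below. This is only an illustration of the timeline above, not how the repository actually manages its skip lists; the names `REPORTED`, `parse_version`, and `groups_to_skip` are hypothetical. It also assumes that the version at which each group was reported passing is the first fixed version, which the thread does not confirm.

```python
def parse_version(text):
    """Turn a dotted version such as '821.32' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


# (group, version reported failing, version reported passing), per this thread.
# Assumption: versions in between behave like the failing one.
REPORTED = [
    ("3 test_dot cases", "821", "821.35"),
    ("9 test_matmul cases", "821", "821.32"),
    ("tutorial 08-grouped-gemm", "821", "821.32"),
]


def groups_to_skip(driver_version):
    """Return the groups that would still need skipping on this driver."""
    current = parse_version(driver_version)
    return [
        name
        for name, failing, passing in REPORTED
        if parse_version(failing) <= current < parse_version(passing)
    ]


for version in ("821", "821.32", "821.35"):
    print(version, groups_to_skip(version))
```

Tuple comparison keeps `821` ordered before `821.32`, so the base release maps to all three groups, 821.32 to the `test_dot` group only, and 821.35 to none.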