intel / intel-xpu-backend-for-triton

OpenAI Triton backend for Intel® GPUs
MIT License

[tracking] Cases fail on Rolling 881.19 #1380

Closed AshburnLee closed 3 months ago

AshburnLee commented 5 months ago

On branch llvm-target with Rolling 881.19, 4 test_dot cases fail:

test_core.py::test_dot[1-64-128-128-4-True-True-none-tf32-int8-int8-1_0]
test_core.py::test_dot[1-64-128-128-4-True-True-none-tf32-int8-int8-1_1]
test_core.py::test_dot[1-64-128-128-4-False-True-none-tf32-int8-int8-1_0]
test_core.py::test_dot[1-64-128-128-4-False-True-none-tf32-int8-int8-1_1]

Still failing:

FAILED python/test/unit/language/test_core.py::test_dot[1-64-128-128-4-True-True-none-tf32-int8-int8-1_0] - AssertionError: 
Not equal to tolerance rtol=0.01, atol=0.001

Mismatched elements: 1920 / 8192 (23.4%)
Max absolute difference: 26243
Max relative difference: 223.2195122
 x: array([[  -8771,   97832,   94483, ...,   35111,   22460,   70087],
       [-124117, -115193,    1862, ...,   -2921,  -65821,  -26059],
       [ -55616,   23155,   64353, ...,  100224,   18555,   42140],...
 y: array([[ -11325,  103270,   91719, ...,   35111,   22460,   70087],
       [-126417, -110689,    -804, ...,   -2921,  -65821,  -26059],
       [ -64658,   25435,   46932, ...,  100224,   18555,   42140],...

FAILED python/test/unit/language/test_core.py::test_dot[1-64-128-128-4-True-True-none-tf32-int8-int8-1_1] - AssertionError: 
Not equal to tolerance rtol=0.01, atol=0.001

Mismatched elements: 1920 / 8192 (23.4%)
Max absolute difference: 26243
Max relative difference: 223.2195122
 x: array([[  -8771,   97832,   94483, ...,   35111,   22460,   70087],
       [-124117, -115193,    1862, ...,   -2921,  -65821,  -26059],
       [ -55616,   23155,   64353, ...,  100224,   18555,   42140],...
 y: array([[ -11325,  103270,   91719, ...,   35111,   22460,   70087],
       [-126417, -110689,    -804, ...,   -2921,  -65821,  -26059],
       [ -64658,   25435,   46932, ...,  100224,   18555,   42140],...

FAILED python/test/unit/language/test_core.py::test_dot[1-64-128-128-4-False-True-none-tf32-int8-int8-1_0] - AssertionError: 
Not equal to tolerance rtol=0.01, atol=0.001

Mismatched elements: 1939 / 8192 (23.7%)
Max absolute difference: 28162
Max relative difference: 38.43896714
 x: array([[ -70131,  -76608,   36342, ...,    -437,    2562,   37103],
       [ -10398,   -2496, -111479, ...,  114795,   89840,   25838],
       [  84214,  -54418, -112739, ...,  -14705,   70750,   20548],...
 y: array([[ -69713,  -92934,   29850, ...,    -437,    2562,   37103],
       [ -12534,   14510, -108186, ...,  114795,   89840,   25838],
       [  80760,  -35494, -111272, ...,  -14705,   70750,   20548],...

FAILED python/test/unit/language/test_core.py::test_dot[1-64-128-128-4-False-True-none-tf32-int8-int8-1_1] - AssertionError: 
Not equal to tolerance rtol=0.01, atol=0.001

Mismatched elements: 1939 / 8192 (23.7%)
Max absolute difference: 28162
Max relative difference: 38.43896714
 x: array([[ -70131,  -76608,   36342, ...,    -437,    2562,   37103],
       [ -10398,   -2496, -111479, ...,  114795,   89840,   25838],
       [  84214,  -54418, -112739, ...,  -14705,   70750,   20548],...
 y: array([[ -69713,  -92934,   29850, ...,    -437,    2562,   37103],
       [ -12534,   14510, -108186, ...,  114795,   89840,   25838],
       [  80760,  -35494, -111272, ...,  -14705,   70750,   20548],...
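
The "Not equal to tolerance" lines above come from NumPy's `assert_allclose`, which flags an element when `|x - y| > atol + rtol * |y|`. A minimal sketch of that check, assuming only the six row-0 values visible in the first log (the full 8192-element arrays are elided there, so this does not reproduce the reported 1920 mismatches or the maximum difference of 26243):

```python
import numpy as np

# First three and last three values of row 0 from the first failing log.
# The elided middle of the row is NOT reconstructed here.
x = np.array([-8771, 97832, 94483, 35111, 22460, 70087], dtype=np.int64)
y = np.array([-11325, 103270, 91719, 35111, 22460, 70087], dtype=np.int64)

rtol, atol = 0.01, 0.001  # tolerances reported in the assertion message

# assert_allclose treats an element as mismatched when
# |x - y| > atol + rtol * |y|.
abs_diff = np.abs(x - y)
allowed = atol + rtol * np.abs(y)
mismatched = abs_diff > allowed

print(mismatched.tolist())   # which of the six sample elements mismatch
print(int(abs_diff.max()))   # largest absolute difference in this sample
```

On this sample the first three elements differ by thousands while the last three match exactly, which is consistent with only part of the output tile being wrong rather than a uniform precision loss.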

AshburnLee commented 4 months ago

Still failing on 881.19.

AshburnLee commented 4 months ago

Still failing on 881.19.

vlad-penkin commented 3 months ago

@AshburnLee can you retest the failed test cases with Agama 914.12?

AshburnLee commented 3 months ago

@pbchekin Is there any machine or container with Agama 914.12 available?

pbchekin commented 3 months ago

> @pbchekin Is there any machine or container with Agama 914.12 available?

914.32 is the latest Agama rolling release; please use that. You can use a JupyterHub session with the corresponding profile.

AshburnLee commented 3 months ago

All 4 cases passed on Agama 914.32.