Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch. It enables using different hardware executors at once, across one or thousands of GPUs.
vjp correctness fails for sdpa manual grad forward sdpa #703
From the CI run in an intermediate version of #691:
FAILED thunder/tests/test_grad.py::test_vjp_correctness_sdpa_manual_grad_forward_scaled_dot_product_attention_nvfuser_cuda_thunder.dtypes.float16 - AssertionError: Tensor-likes are not close!
Mismatched elements: 19624 / 245760 (8.0%)
Greatest absolute difference: 0.125 at index (1, 1, 60, 64) (up to 1e-05 allowed)
Greatest relative difference: inf at index (0, 0, 54, 6) (up to 0.001 allowed)
FAILED thunder/tests/test_grad.py::test_vjp_correctness_sdpa_manual_grad_forward_scaled_dot_product_attention_nvfuser_cuda_thunder.dtypes.bfloat16 - AssertionError: Tensor-likes are not close!
Mismatched elements: 7563 / 180224 (4.2%)
Greatest absolute difference: 1.046875 at index (6, 0, 86, 73) (up to 1e-05 allowed)
Greatest relative difference: inf at index (0, 1, 22, 61) (up to 0.016 allowed)
= 2 failed, 4667 passed, 823 skipped, 108 xfailed, 96 xpassed, 119791 warnings in 594.94s (0:09:54) =
@vedaanta I tentatively assigned it to you because #691 is your PR...
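For readers decoding the assertion message, here is a minimal pure-Python sketch of the closeness criterion documented for `torch.testing.assert_close`, namely `|actual - expected| <= atol + rtol * |expected|`. It is not the test's actual code: the helper names `is_close` and `relative_difference` are hypothetical, and the sample values are made up for illustration, since the log does not report the tensor entries themselves. It shows why a relative difference of `inf` appears when the expected value is exactly zero, and why an absolute difference of 0.125 only passes these tolerances when the expected value is large.

```python
import math


def is_close(actual, expected, rtol, atol):
    """Scalar form of |actual - expected| <= atol + rtol * |expected|."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


def relative_difference(actual, expected):
    """|actual - expected| / |expected|.

    In this sketch, a nonzero difference against an expected value of 0
    is reported as inf, and two equal values are reported as 0.0.
    """
    diff = abs(actual - expected)
    if expected == 0:
        return math.inf if diff > 0 else 0.0
    return diff / abs(expected)


# Tolerances reported for the float16 failure in the log above.
RTOL, ATOL = 1e-3, 1e-5

# Hypothetical values, chosen only to illustrate the criterion.
# Expected value of 0: the allowed error collapses to atol alone.
print(is_close(0.125, 0.0, RTOL, ATOL))       # False
print(relative_difference(0.125, 0.0))        # inf

# Same absolute difference against a large expected value:
# 0.125 <= 1e-5 + 1e-3 * 200, so the check passes.
print(is_close(200.125, 200.0, RTOL, ATOL))   # True
```

Under these assumptions, mismatches reported with an `inf` relative difference point at positions where the reference value is zero, so only the absolute tolerance (1e-05 here) applies at those positions.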