Lightning-AI / lightning-thunder

Make PyTorch models up to 40% faster! Thunder is a source to source compiler for PyTorch. It enables using different hardware executors at once; across one or thousands of GPUs.
Apache License 2.0

Failing CI tests involving attention on PyTorch dev (2.6) #1254

Open t-vi opened 2 weeks ago

t-vi commented 2 weeks ago

Disabling for now.

FAILED thunder/tests/test_grad.py::test_populate_grads_csa_torch_cuda_thunder.dtypes.float32 - AssertionError: Tensor-likes are not close!

Mismatched elements: 218295 / 589824 (37.0%)
Greatest absolute difference: 1030.24462890625 at index (473, 722) (up to 0.01 allowed)
Greatest relative difference: 2295.604248046875 at index (482, 518) (up to 0.01 allowed)

The failure occurred for item [2]
FAILED thunder/tests/test_grad.py::test_populate_grads_block_nvfuser_cuda_thunder.dtypes.float32 - AssertionError: Tensor-likes are not close!

Mismatched elements: 768 / 768 (100.0%)
Greatest absolute difference: 2461.907470703125 at index (275,) (up to 0.01 allowed)
Greatest relative difference: 15.106374740600586 at index (668,) (up to 0.01 allowed)

It would be good to investigate, but I could not reproduce the failures.
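For readers unfamiliar with this kind of report, here is a minimal pure-Python sketch of how the mismatch statistics above could be computed. It assumes the comparison follows the usual elementwise criterion `|actual - expected| <= atol + rtol * |expected|`, with both tolerances set to 0.01 as the "up to 0.01 allowed" text suggests. The function name `mismatch_report` and its return format are hypothetical and not part of Thunder or PyTorch.

```python
# Hypothetical helper, not Thunder's or PyTorch's implementation.
# Assumption: an element counts as mismatched when
#   |actual - expected| > atol + rtol * |expected|
def mismatch_report(actual, expected, rtol=0.01, atol=0.01):
    if len(actual) != len(expected):
        raise ValueError("inputs must have the same length")

    mismatched = 0
    max_abs = 0.0  # greatest absolute difference seen so far
    max_rel = 0.0  # greatest relative difference seen so far

    for a, e in zip(actual, expected):
        abs_diff = abs(a - e)
        if abs_diff > atol + rtol * abs(e):
            mismatched += 1
        max_abs = max(max_abs, abs_diff)
        # Relative difference is undefined when the expected value is zero,
        # so this sketch skips those elements.
        if e != 0:
            max_rel = max(max_rel, abs_diff / abs(e))

    return {
        "mismatched": mismatched,
        "total": len(actual),
        "max_abs": max_abs,
        "max_rel": max_rel,
    }


report = mismatch_report([1.0, 2.0, 1000.0], [1.0, 2.5, 1.0])
print(f"Mismatched elements: {report['mismatched']} / {report['total']}")
print(f"Greatest absolute difference: {report['max_abs']}")
print(f"Greatest relative difference: {report['max_rel']}")
```

Under this criterion, absolute differences in the thousands against a 0.01 tolerance point to substantially different gradients rather than ordinary floating-point noise, though the sketch says nothing about the cause.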

cc @borda

tfogal commented 5 days ago

triage review: let's re-enable the test and see if this was transient or is a real issue to be looked into.