Open mobicham opened 1 month ago
System Info
`transformers` version: 4.45.2
Who can help?
I noticed that many files in transformers use the older SDP API, `torch.backends.cuda.sdp_kernel`. We just discovered a bug involving PyTorch 2.5.0 and the old SDP API that makes it run slower: https://github.com/pytorch/pytorch/issues/138386
It would be a good idea to update to the new API (`from torch.nn.attention import sdpa_kernel, SDPBackend`) and set the appropriate compile flag to avoid losing as much as 20% of the performance!
Information
Tasks
`examples` folder (such as GLUE/SQuAD, ...)
Reproduction
Gist here as a reference example: https://gist.github.com/mobicham/aa1e77689d9cf866cbea2cb75a53a9e4
More details in the torch issue: https://github.com/pytorch/pytorch/issues/138386
Expected behavior
Examples using SDP with torch 2.5.0 should run at least as fast as they do with 2.4.1.