Bump torch from 2.4.1 to 2.5.1

Bumps torch from 2.4.1 to 2.5.1.

Release notes

PyTorch 2.5.1: bug fix release

This release is meant to fix the following regressions:

Wheels from PyPI are unusable out of the box on RPM-based Linux distributions: pytorch/pytorch#138324

PyPI arm64 distribution logs cpuinfo error on import: pytorch/pytorch#138333

Crash When Using torch.compile with Math scaled_dot_product_attention in AMP Mode: pytorch/pytorch#133974

[MPS] Internal crash due to the invalid buffer size computation if sliced API is used: pytorch/pytorch#137800

Several issues related to CuDNN Attention: pytorch/pytorch#138522

Besides the regression fixes, the release includes several documentation updates.

See release tracker pytorch/pytorch#132400 for additional information.

PyTorch 2.5.0 Release, SDPA CuDNN backend, Flex Attention

PyTorch 2.5 Release Notes

Highlights

Backwards Incompatible Change

Deprecations

New Features

Improvements

Bug fixes

Performance

Documentation

Developers

Security

Highlights

We are excited to announce the release of PyTorch® 2.5! This release features a new CuDNN backend for SDPA, enabling speedups by default for users of SDPA on H100s or newer GPUs. In addition, regional compilation of torch.compile offers a way to reduce the cold start-up time for torch.compile by allowing users to compile a repeated nn.Module (e.g. a transformer layer in an LLM) without recompilations. Finally, the TorchInductor CPP backend offers solid performance speedups with numerous enhancements such as FP16 support, CPP wrapper, AOT-Inductor mode, and max-autotune mode.

This release is composed of 4095 commits from 504 contributors since PyTorch 2.4. We want to sincerely thank our dedicated community for your contributions. As always, we encourage you to try these out and report any issues as we improve 2.5. More information about how to get started with the PyTorch 2-series can be found at our Getting Started page. Please also check out our new ecosystem project releases with TorchRec and TorchFix.

| Beta | Prototype |
| --- | --- |
| CuDNN backend for SDPA | FlexAttention |
| torch.compile regional compilation without recompilations | Compiled Autograd |
| TorchDynamo added support for exception handling & MutableMapping types | Flight Recorder |
| TorchInductor CPU backend optimization | Max-autotune Support on CPU with GEMM Template |
| | TorchInductor on Windows |
| | FP16 support on CPU path for both eager mode and TorchInductor CPP backend |
| | Autoload Device Extension |
| | Enhanced Intel GPU support |

*To see a full list of public feature submissions click here.

BETA FEATURES

[Beta] CuDNN backend for SDPA

The cuDNN "Fused Flash Attention" backend has landed for torch.nn.functional.scaled_dot_product_attention. On NVIDIA H100 GPUs this can provide up to a 75% speedup over FlashAttentionV2. This speedup is enabled by default for all users of SDPA on H100 or newer GPUs.

[Beta] torch.compile regional compilation without recompilations

Regional compilation without recompilations is enabled via torch._dynamo.config.inline_inbuilt_nn_modules, which defaults to True in 2.5+. This option allows users to compile a repeated nn.Module (e.g. a transformer layer in an LLM) without recompilations. Compared to compiling the full model, this option can result in lower compilation latency, at the cost of a 1%-5% performance degradation.


... (truncated)

Commits

a8d6afb Disabling amp context when invoking compiler (#138659)
f31b8bb [MPS] Fix sliced cast (#138535)
848e7ac [SDPA-CUDNN] Make CuDNN Attention Opt in (#138587)
885c823 Update doc copyrights to 2024 (#138650)
8c3ed97 Update cpuinfo submodule (#138600)
70cf2bb Add link to torch.compile the missing manual in troubleshooting (#137369)
cde6b38 Don't try to load cufile (#138539)
4076a73 [Cherry-Pick] Use cuda 12.4 pytorch_extra_install_requirements as default (#1...
a97c151 update getting started xpu (#138090)
32f585d [Release only] use triton 3.1.x from pypi (#137895)
Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

nlpie / biomedicus

Bump torch from 2.4.1 to 2.5.1 #430
