Private chat with a local GPT over documents, images, video, and more. 100% private, Apache 2.0. Supports Ollama, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
We are excited to announce the release of PyTorch® 2.5! This release features a new cuDNN backend for SDPA, enabling speedups by default for SDPA users on H100 or newer GPUs. In addition, regional compilation for torch.compile reduces cold start-up time by letting users compile a repeated nn.Module (e.g. a transformer layer in an LLM) without recompilations. Finally, the TorchInductor CPP backend delivers solid performance speedups with numerous enhancements such as FP16 support, a CPP wrapper, AOT-Inductor mode, and max-autotune mode.
This release is composed of 4095 commits from 504 contributors since PyTorch 2.4. We sincerely thank our dedicated community for your contributions. As always, we encourage you to try these features out and report any issues as we improve 2.5. More information about getting started with the PyTorch 2 series can be found on our Getting Started page.
Also, please check out the new releases of our ecosystem projects, TorchRec and TorchFix.
Beta:
- CuDNN backend for SDPA
- torch.compile regional compilation without recompilations
- TorchDynamo added support for exception handling & MutableMapping types
- TorchInductor CPU backend optimization

Prototype:
- FlexAttention
- Compiled Autograd
- Flight Recorder
- Max-autotune Support on CPU with GEMM Template
- TorchInductor on Windows
- FP16 support on CPU path for both eager mode and TorchInductor CPP backend
- Autoload Device Extension
- Enhanced Intel GPU support
*To see a full list of public feature submissions click here.
BETA FEATURES
[Beta] CuDNN backend for SDPA
The cuDNN "Fused Flash Attention" backend has landed for torch.nn.functional.scaled_dot_product_attention. On NVIDIA H100 GPUs this can provide up to a 75% speedup over FlashAttentionV2. The speedup is enabled by default for all users of SDPA on H100 or newer GPUs.
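As a minimal sketch, you can also pin SDPA to the cuDNN backend explicitly to confirm it is being used; the tensor shapes and dtype below are illustrative, and torch.nn.attention.sdpa_kernel / SDPBackend are the standard PyTorch backend-selection APIs rather than anything specific to this snippet:

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# Illustrative shapes: (batch, num_heads, seq_len, head_dim).
# Requires a CUDA GPU; the fused attention kernels expect fp16/bf16 inputs.
q = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)

# Restrict SDPA to the cuDNN backend; if cuDNN cannot handle these inputs,
# this raises an error instead of silently falling back to another backend.
with sdpa_kernel(SDPBackend.CUDNN_ATTENTION):
    out = F.scaled_dot_product_attention(q, k, v)
```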
[Beta] torch.compile regional compilation without recompilations
Regional compilation without recompilations is available via torch._dynamo.config.inline_inbuilt_nn_modules, which defaults to True in 2.5+. This option lets users compile a repeated nn.Module (e.g. a transformer layer in an LLM) without recompilations. Compared to compiling the full model, it can deliver much lower compilation latency at the cost of a 1%-5% runtime performance degradation.
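As a minimal sketch (the toy model below is illustrative, not from the release notes), regional compilation means calling compile() on each instance of the repeated block rather than on the full model; with inline_inbuilt_nn_modules enabled, all instances reuse one compiled region:

```python
import torch
import torch._dynamo
import torch.nn as nn

class Block(nn.Module):
    """A toy repeated layer standing in for a transformer block."""
    def __init__(self, dim: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, x):
        attn_out, _ = self.attn(x, x, x)
        x = x + attn_out
        return x + self.mlp(x)

class Model(nn.Module):
    def __init__(self, dim: int = 64, depth: int = 8):
        super().__init__()
        self.blocks = nn.ModuleList(Block(dim) for _ in range(depth))

    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        return x

torch._dynamo.config.inline_inbuilt_nn_modules = True  # already the default in 2.5+

model = Model()
# Compile the repeated region, not the whole model: Dynamo caches the compiled
# code for the first block and reuses it for the remaining ones.
for block in model.blocks:
    block.compile()

out = model(torch.randn(2, 16, 64))
```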
... (truncated)
Commits
a8d6afb Disabling amp context when invoking compiler (#138659)
community: release 0.2.19 (#28057)
community: patch graphqa chains (CVE-2024-8309) (#28050)
langchain,community[patch]: release with bumped core (#27854)
Added mapping to fix CI for #langchain-aws:227. (#27114)
community: poetry lock for cffi dep (#26674)
langchain-community==0.2.18
Changes since langchain-community==0.3.5
langchain,community[patch]: release with bumped core (#27854)
Added mapping to fix CI for #langchain-aws:227. (#27114)
community: poetry lock for cffi dep (#26674)
Bumps the pip group with 3 updates in the / directory: gradio, torch and langchain-community. Bumps the pip group with 3 updates in the /reqs_optional directory: gradio, torch and langchain-community. Bumps the pip group with 2 updates in the /spaces/demo directory: torch and transformers.
Updates gradio from 4.44.0 to 5.5.0
Release notes
Sourced from gradio's releases.
Changelog
Sourced from gradio's changelog.
... (truncated)
Commits
b5eaba1 chore: update versions (#9874)
fa5d433 Do not load code in gr.NO_RELOAD in the reload mode watch thread (#9886)
b6725cf Lite auto-load imported modules with pyodide.loadPackagesFromImports (#9726)
e10bbd2 Fix live interfaces for audio/image streaming (#9883)
dcfa7ad Enforce meta key present during preprocess in FileData payloads (#9898)
7d77024 Fix dataframe height increasing on scroll (#9892)
4d90883 Allows selection of directories in File Explorer (#9835)
6c8a064 Ensure non-form elements are correctly positioned when scale is applied (#9882)
a1582a6 Lite worker refactoring (#9424)
f109497 Fix frontend errors on ApiDocs and RecordingSnippet (#9786)

Updates torch from 2.2.1 to 2.5.1
Release notes
Sourced from torch's releases.
... (truncated)
Commits
a8d6afb Disabling amp context when invoking compiler (#138659)
f31b8bb [MPS] Fix sliced cast (#138535)
848e7ac [SDPA-CUDNN] Make CuDNN Attention Opt in (#138587)
885c823 Update doc copyrights to 2024 (#138650)
8c3ed97 Update cpuinfo submodule (#138600)
70cf2bb Add link to torch.compile the missing manual in troubleshooting (#137369)
cde6b38 Don't try to load cufile (#138539)
4076a73 [Cherry-Pick] Use cuda 12.4 pytorch_extra_install_requirements as default (#1...
a97c151 update getting started xpu (#138090)
32f585d [Release only] use triton 3.1.x from pypi (#137895)

Updates langchain-community from 0.2.6 to 0.2.19
Release notes
Sourced from langchain-community's releases.
Commits
b9d892e community: release 0.2.19 (#28057)
64c317e community: patch graphqa chains (CVE-2024-8309) (#28050)
76e1dc7 langchain,community[patch]: release with bumped core (#27854)
9fdeb74 core[patch]: Release 0.2.43 (#27808)
7d481f1 core[patch]: remove prompt img loading (#27807)
33a5397 infra: turn off release attestations (#27766)
283cb50 core[patch]: Release 0.2.42 (#27763)
5e3cee6 core[patch]: make get_all_basemodel_annotations public (#27762)
8073146 Added mapping to fix CI for #langchain-aws:227. (#27114)
6cfd1e8 core[patch]: Release 0.2.41 (#26687)