This is a fairly large update, mostly an accumulation of small fixes and enhancements. Nothing qualifies as a *breaking change* (for some definition), but there may be some subtle changes to output. Check below for anything that might affect you!
[13.8.0] - 2024-08-26
Fixed
Fixed Table rendering of box elements so that "footer" elements truly appear at the bottom of the table and "mid" elements in the main table body.
[orm] [bug] Fixed regression caused by issue #11814 which broke support for
certain flavors of PEP 593 Annotated in the type_annotation_map when
builtin types such as list and dict were used without an element type.
While this is an incomplete style of typing, these types nonetheless
previously would be located in the type_annotation_map correctly.
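The typing shapes involved can be inspected with the standard library alone. This is a minimal sketch of what "PEP 593 Annotated around a builtin with no element type" looks like; the `JsonList` alias is hypothetical and used only for illustration, not part of SQLAlchemy's API:

```python
from typing import Annotated, get_args, get_origin

# A PEP 593 Annotated wrapper around a plain builtin (no element type),
# the annotation shape this fix restores support for. `JsonList` is a
# hypothetical alias used only for illustration.
JsonList = Annotated[list, "stored as JSON"]

assert get_origin(JsonList) is Annotated
base_type, *metadata = get_args(JsonList)
assert base_type is list  # the bare builtin the type_annotation_map must locate
assert metadata == ["stored as JSON"]
```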
[sqlite] [bug] Fixed regression in SQLite reflection caused by #11677 which
interfered with reflection for CHECK constraints that were followed
by other kinds of constraints within the same table definition. Pull
request courtesy Harutaka Kawamura.
[general] [change] The pin for setuptools<69.3 in pyproject.toml has been removed.
This pin was to prevent a sudden change in setuptools to use PEP 625
from taking place, which would change the file name of SQLAlchemy's source
distribution on PyPI to be an all-lowercase name, which is likely to cause
problems with various build environments that expected the previous naming
style. However, the presence of this pin is holding back environments that
otherwise want to use a newer setuptools, so we've decided to move forward
with this change, with the assumption that build environments will have
largely accommodated the setuptools change by now.
[orm] [bug] [regression] Fixed regression from 1.3 where the column key used for a hybrid property
might be populated with that of the underlying column that it returns, for
a property that returns an ORM mapped column directly, rather than the key of the hybrid property itself.
Fix Flux CLIP prompt embeds repeat for num_images_per_prompt > 1 by @DN6 in #9280
[IP Adapter] Fix cache_dir and local_files_only for image encoder by @asomoza in #9272
V0.30.1: CogVideoX-5B & Bug fixes
CogVideoX-5B
This patch release adds diffusers support for the upcoming CogVideoX-5B release! The model weights will be available next week on the Hugging Face Hub at THUDM/CogVideoX-5b. Stay tuned for the release!
Additionally, we have implemented a VAE tiling feature, which reduces the memory requirement for CogVideoX models. With this update, the total memory requirement is now 12GB for CogVideoX-2B and 21GB for CogVideoX-5B (with CPU offloading). To enable this feature, simply call enable_tiling() on the VAE.
The code below shows how to generate a video with CogVideoX-5B (the snippet was cut off after the prompt; the pipeline call is completed here for illustration, following the standard diffusers pipeline API):
import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

prompt = "Tracking shot, late afternoon light casting long shadows, a cyclist in athletic gear pedaling down a scenic mountain road, winding path with trees and a lake in the background, invigorating and adventurous atmosphere."

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)
pipe.vae.enable_tiling()  # reduce peak memory via VAE tiling
video = pipe(prompt=prompt).frames[0]
export_to_video(video, "output.mp4", fps=8)
v0.34.0: StatefulDataLoader Support, FP8 Improvements, and PyTorch Updates!
Dependency Changes
Updated Safetensors Requirement: The library now requires safetensors version 0.4.3.
Added support for Numpy 2.0: The library now fully supports numpy 2.0.0
Core
New Script Behavior Changes
Process Group Management: PyTorch now requires users to destroy process groups after training. The accelerate library will handle this automatically with accelerator.end_training(), or you can do it manually using PartialState().destroy_process_group().
MLU Device Support: Added support for saving and loading RNG states on MLU devices by @huismiling
NPU Support: Corrected backend and distributed settings when using transfer_to_npu, ensuring better performance and compatibility.
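The manual teardown path can be sketched in a single process with plain torch.distributed; this assumes a CPU build of PyTorch with the gloo backend available, and stands in for what accelerator.end_training() now does for you:

```python
import os
import torch.distributed as dist

# Minimal single-process process group, assuming the gloo backend is available.
# accelerator.end_training() performs the destroy step automatically.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group(backend="gloo", rank=0, world_size=1)
# ... training loop would run here ...
dist.destroy_process_group()  # required by recent PyTorch after training
```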
DataLoader Enhancements
Stateful DataLoader: We are excited to announce that early support has been added for the StatefulDataLoader from torchdata, allowing better handling of data loading states. Enable it by passing use_stateful_dataloader=True to the DataLoaderConfiguration; when calling load_state(), the DataLoader will automatically resume from its last step, with no more having to iterate through already-seen batches.
Decoupled Data Loader Preparation: The prepare_data_loader() function is now independent of the Accelerator, giving you more flexibility towards which API levels you would like to use.
XLA Compatibility: Added support for skipping initial batches when using XLA.
Improved State Management: Bug fixes and enhancements for saving/loading DataLoader states, ensuring smoother training sessions.
Epoch Setting: Introduced the set_epoch function for MpDeviceLoaderWrapper.
FP8 Training Improvements
Enhanced FP8 Training: Fully Sharded Data Parallelism (FSDP) and DeepSpeed support now work seamlessly with TransformerEngine FP8 training, including better defaults for the quantized FP8 weights.
Integration baseline: We've added a new suite of examples and benchmarks to ensure that our TransformerEngine integration works exactly as intended. These scripts run one half using 🤗 Accelerate's integration, the other with raw TransformerEngine, providing users with a nice example of what we do under the hood with accelerate, and a good sanity check to make sure nothing breaks down over time. Find them here
Import Fixes: Resolved issues with import checks for Transformer Engine that caused downstream issues.
FP8 Docker Images: We've added new docker images for TransformerEngine and accelerate as well. Use docker pull huggingface/accelerate@gpu-fp8-transformerengine to quickly get an environment going.
torchpippy no more, long live torch.distributed.pipelining
With the latest PyTorch release, torchpippy is now fully integrated into torch core, and as a result we are exclusively supporting the PyTorch implementation from now on.
There are breaking changes to examples and behavior that come from this shift. Namely:
Tracing of inputs is done with a shape each GPU will see, rather than the size of the total batch. So for 2 GPUs, one should pass in an input of [1, n, n] rather than [2, n, n] as before.
We no longer support Encoder/Decoder models. PyTorch tracing for pipelining no longer supports encoder/decoder models, so the t5 example has been removed.
Computer vision model support currently does not work: there are some tracing issues with ResNets that we are actively looking into.
If any of these changes are too breaking, we recommend pinning your accelerate version. If the encoder/decoder model support is actively blocking your inference using pippy, please open an issue and let us know; we can potentially look at restoring the old torchpippy support if needed.
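The first change above is simple arithmetic: each rank now traces with the slice of the batch it will actually see. A tiny sketch with illustrative sizes (all numbers here are made up for the example):

```python
# Per-GPU tracing shape: trace with the slice each rank will see,
# not the full batch. Sizes below are illustrative only.
global_batch, num_gpus, n = 2, 2, 8
per_gpu_batch = global_batch // num_gpus
example_shape = (per_gpu_batch, n, n)  # pass [1, n, n], not [2, n, n]
assert example_shape == (1, 8, 8)
```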
Fully Sharded Data Parallelism (FSDP)
Environment Flexibility: Environment variables are now fully optional for FSDP, simplifying configuration. You can now create a FullyShardedDataParallelPlugin yourself manually, with no need for environment patching:
from accelerate import FullyShardedDataParallelPlugin
fsdp_plugin = FullyShardedDataParallelPlugin(...)
FSDP RAM efficient loading: Added a utility to enable RAM-efficient model loading (by setting the proper environment variable). This is generally needed if you are not using accelerate launch and need to ensure the env variables are set up properly for model loading:
from accelerate.utils import enable_fsdp_ram_efficient_loading, disable_fsdp_ram_efficient_loading
enable_fsdp_ram_efficient_loading()
Model State Dict Management: Enhanced support for unwrapping model state dicts in FSDP, making it easier to manage distributed models.
New Examples
Configuration and Models: Improved configuration handling and introduced a configuration zoo for easier experimentation. You can learn more here. This was largely inspired by the axolotl library, so very big kudos to their wonderful work.
This release is meant to fix the following issues (regressions / silent correctness):
Breaking Changes:
The pytorch/pytorch docker image now installs the PyTorch package through pip and has switched its conda installation from miniconda to miniforge (#134274)
Windows:
Fix performance regression on Windows related to MKL static linking (#130619) (#130697)
Fix error during loading on Windows: [WinError 126] The specified module could not be found. (#131662) (#130697)
Fix error on Windows with CPU inference (#131958) (#130697)
Bumps the dependencies group with 15 updates:

| Package | From | To |
| --- | --- | --- |
| openai | 1.42.0 | 1.43.0 |
| rich | 13.7.1 | 13.8.0 |
| cohere | 5.8.1 | 5.9.0 |
| transformers | 4.44.1 | 4.44.2 |
| boto3 | 1.35.2 | 1.35.12 |
| opensearch-py | 2.7.0 | 2.7.1 |
| pgvector | 0.2.5 | 0.3.2 |
| markdownify | 0.11.6 | 0.13.1 |
| qdrant-client | 1.11.0 | 1.11.1 |
| ollama | 0.3.1 | 0.3.2 |
| duckduckgo-search | 6.2.10 | 6.2.11 |
| sqlalchemy | 2.0.32 | 2.0.34 |
| diffusers | 0.29.2 | 0.30.2 |
| accelerate | 0.32.1 | 0.34.0 |
| torch | 2.4.0 | 2.4.1 |
Updates
openai
from 1.42.0 to 1.43.0
Release notes
Sourced from openai's releases.
Changelog
Sourced from openai's changelog.
Commits
9850c16 release: 1.43.0
5d3111a feat(api): add file search result details to run steps (#1681)
Updates
rich
from 13.7.1 to 13.8.0
Release notes
Sourced from rich's releases.
Changelog
Sourced from rich's changelog.
Commits
9ec4191 Merge pull request #3473 from Textualize/bump1380
9c74f03 bump to v13.8.0
dc7a195 Merge pull request #3472 from Textualize/fix-bad-dataclass
c938830 changelog
6055e2d fix for missing field in dataclass
b6f2f7a Merge pull request #3454 from subrat-lima/master
b1397be Merge pull request #3455 from jjhelmus/dataclasses_3.13
035f3ea Merge pull request #3452 from sbraz/typos_examples
d6abebd Merge branch 'master' into dataclasses_3.13
1b2dada Merge pull request #3471 from Textualize/fix-append-tokens
Updates
cohere
from 5.8.1 to 5.9.0
Updates
transformers
from 4.44.1 to 4.44.2
Release notes
Sourced from transformers's releases.
Commits
1748902 v4.44.2
6845144 Fix regression on Processor.save_pretrained caused by #31691 (#32921)
3d8cba8 fix: no need to dtype A in jamba (#32924)
c1df7f8 fix: jamba cache fails to use torch.nn.module (#32894)
Updates
boto3
from 1.35.2 to 1.35.12
Commits
882adeb Merge branch 'release-1.35.12'
ea29dd5 Bumping version to 1.35.12
c071b3a Add changelog entries from botocore
0f43c75 Merge branch 'release-1.35.11'
70c7ec4 Merge branch 'release-1.35.11' into develop
7ab2158 Bumping version to 1.35.11
9569713 Add changelog entries from botocore
7f86817 Merge pull request #4254 from boto/dependabot/github_actions/actions/setup-py...
a9d4593 Bump actions/setup-python from 5.1.0 to 5.2.0
85cba10 Merge branch 'release-1.35.10'
Updates
opensearch-py
from 2.7.0 to 2.7.1
Release notes
Sourced from opensearch-py's releases.
Changelog
Sourced from opensearch-py's changelog.
Commits
d54aab4 Preparing for 2.7.1 release. (#805)
e26f3fc Add override for code generator to change indices.put_alias argument order ...
b994dc4 Preparing for next developer iteration, 2.7.1. (#802)
Updates
pgvector
from 0.2.5 to 0.3.2
Changelog
Sourced from pgvector's changelog.
Commits
4f721eb Version bump to 0.3.2 [skip ci]
50fbcab Added todo [skip ci]
4498caa Added test for Bit constructor with uint8
e77ee13 Updated publish task [skip ci]
4672a4b Fixed error with asyncpg and pgvector < 0.7 - fixes #83
b87552b Improved example [skip ci]
9c98a2d Added todo [skip ci]
182fe84 Added Cohere example [skip ci]
13450b0 Version bump to 0.3.1 [skip ci]
fcd7f61 Fixed backwards compatibility of type info query for Psycopg 2
Updates
markdownify
from 0.11.6 to 0.13.1
Release notes
Sourced from markdownify's releases.
Commits
b5c724a Merge branch 'develop'
964d89f bump to version v0.13.1
46dc1a0 Migrated the metadata into PEP 621-compliant pyproject.toml (#138)
8c810eb Merge branch 'develop'
f6c8daf bump to v0.13.0
75a678d fix pytest version to 8
0a5c89a added test for ol start check
51390d7 handle ol start value is not number (#127)
50b4640 better naming for markup variables
7861b33 Special-case use of HTML tags for converting <sub> / <sup> (#119)
Updates
qdrant-client
from 1.11.0 to 1.11.1
Release notes
Sourced from qdrant-client's releases.
Commits
fa6b201 bump version to v1.11.1
e782394 fix: fix typo in splade model name (#754)
d3643ca new: update fastembed to 0.3.6 (#753)
388093c improved semver comparison (#736)
3379a89 fix: fix conversion in create_payload_index, add tests (#750)
bbbfbcd fix: fix modifier.none conversion (#748)
15acb14 deprecate: mark async gRPC properties as deprecated in QdrantRemote (#740)
dc21682 deprecate: mark rest property as deprecated in QdrantClient (#742)
d02bd4e fix: handle type_params for Python 3.12+ (#739)
Updates
ollama
from 0.3.1 to 0.3.2
Release notes
Sourced from ollama's releases.
Commits
d98f646 IPv6 support (#262)
981015c Merge pull request #261 from ollama/dependabot/pip/ruff-0.6.2
9c34d81 Bump ruff from 0.5.5 to 0.6.2
9f2832d Merge pull request #260 from ollama/dependabot/pip/pytest-asyncio-0.24.0
e220e46 Merge pull request #252 from ollama/dependabot/pip/pytest-httpserver-1.1.0
dfdeb7c Add URL path to client URL in Client._parse_host() (#170)
9e6726e Bump pytest-asyncio from 0.23.8 to 0.24.0
10d0ff2 Bump pytest-httpserver from 1.0.12 to 1.1.0
Updates
duckduckgo-search
from 6.2.10 to 6.2.11
Release notes
Sourced from duckduckgo-search's releases.
Commits
57dcddb README: update install section (bump version with httpx to v6.2.11b1)
91fcbf5 Bump version to v6.2.11
95711eb [utils] bugfix _normalize_url() - replace space after unquote
5c0e9ec README: proxy example section - add coupon
Updates
sqlalchemy
from 2.0.32 to 2.0.34
Release notes
Sourced from sqlalchemy's releases.
... (truncated)
Commits
Updates
diffusers
from 0.29.2 to 0.30.2
Release notes
Sourced from diffusers's releases.
... (truncated)
Commits
f63c126 Release: v0.30.2
be5995a update runway repo for single_file (#9323)
0659784 Fix Flux CLIP prompt embeds repeat for num_images_per_prompt > 1 (#9280)
cc1e589 [IP Adapter] Fix cache_dir and local_files_only for image encoder (#9272)
8b9bfae Release v0.30.1
b12c7f8 [Single File] Support loading Comfy UI Flux checkpoints (#9243)
06f3671 Cogvideox-5B Model adapter change (#9203)
19c5d7b [tests] fix broken xformers tests (#9206)
99a64aa [Flux LoRA] support parsing alpha from a flux lora state dict. (#9236)
1bb4196 [Single File] Fix configuring scheduler via legacy kwargs (#9229)
Updates
accelerate
from 0.32.1 to 0.34.0
Release notes
Sourced from accelerate's releases.
... (truncated)
Commits
159c0dd Release: v0.34.0
8931e5e Remove skip_first_batches support for StatefulDataloader and fix all the te...
a848592 Speed up tests by shaving off subprocess when not needed (#3042)
758d624 add set_epoch for MpDeviceLoaderWrapper (#3053)
b07ad2a Fix typo in comment (#3045)
1d09a20 use duck-typing to ensure underlying optimizer supports schedulefree hooks (#...
3fcc946 Do not import transformer_engine on import (#3056)
939ce40 Update torchpippy (#2938)
c212092 Add FP8 docker images (#3048)
654e1d9 Add a SLURM example with minimal config (#2950)
Updates
torch
from 2.4.0 to 2.4.1
Release notes
Sourced from torch's releases.