allenporter / llama-cpp-server

Docker images for easier running of llama-cpp-python server
Apache License 2.0
5 stars · 2 forks

Update dependency transformers to v4.43.3 #105

Closed: renovate[bot] closed this PR 3 months ago

renovate[bot] commented 3 months ago

Mend Renovate

This PR contains the following updates:

| Package | Change |
|---|---|
| transformers | `==4.42.4` -> `==4.43.3` |

Release Notes

huggingface/transformers (transformers)

### [`v4.43.3`](https://togithub.com/huggingface/transformers/releases/tag/v4.43.3): Patch deepspeed

[Compare Source](https://togithub.com/huggingface/transformers/compare/v4.43.2...v4.43.3)

Patch release v4.43.3: We still saw some bugs, so [@zucchini-nlp](https://togithub.com/zucchini-nlp) added:

- Resize embeds with DeepSpeed [#32214](https://togithub.com/huggingface/transformers/issues/32214)

Other fixes:

- \[whisper] fix short-form output type [#32178](https://togithub.com/huggingface/transformers/issues/32178), by [@sanchit-gandhi](https://togithub.com/sanchit-gandhi), which fixes the short-audio temperature fallback!
- \[BigBird Pegasus] set \_supports_param_buffer_assignment to False [#32222](https://togithub.com/huggingface/transformers/issues/32222) by [@kashif](https://togithub.com/kashif), mostly related to the new super-fast init; some models have to get this set to False. If you see weird behavior, look for that 😉

### [`v4.43.2`](https://togithub.com/huggingface/transformers/releases/tag/v4.43.2): Patch release

[Compare Source](https://togithub.com/huggingface/transformers/compare/v4.43.1...v4.43.2)

- Fix float8\_e4m3fn in modeling_utils ([#32193](https://togithub.com/huggingface/transformers/issues/32193))
- Fix resize embedding with Deepspeed ([#32192](https://togithub.com/huggingface/transformers/issues/32192))
- let's not warn when someone is running a forward ([#32176](https://togithub.com/huggingface/transformers/issues/32176))
- RoPE: relaxed rope validation ([#32182](https://togithub.com/huggingface/transformers/issues/32182))

### [`v4.43.1`](https://togithub.com/huggingface/transformers/releases/tag/v4.43.1): Patch release

[Compare Source](https://togithub.com/huggingface/transformers/compare/v4.43.0...v4.43.1)

- fix ([#32162](https://togithub.com/huggingface/transformers/issues/32162))

### [`v4.43.0`](https://togithub.com/huggingface/transformers/releases/tag/v4.43.0): Llama 3.1, Chameleon, ZoeDepth, Hiera

[Compare Source](https://togithub.com/huggingface/transformers/compare/v4.42.4...v4.43.0)

#### Llama

The Llama 3.1 models are released by Meta and come in three flavours: 8B, 70B, and 405B. For an overview of Llama 3.1, please visit the [Hugging Face announcement blog post](https://huggingface.co/blog/llama31). We release a [repository of llama recipes](https://togithub.com/huggingface/huggingface-llama-recipes) to showcase usage for inference and for total and partial fine-tuning of the different variants.

![image](https://togithub.com/user-attachments/assets/4b5bf1e0-647c-428d-8f88-691bc343c53d)

#### Chameleon

The Chameleon model was proposed in [Chameleon: Mixed-Modal Early-Fusion Foundation Models](https://arxiv.org/abs/2405.09818v1) by the Meta AI Chameleon Team. Chameleon is a vision-language model that uses vector quantization to tokenize images, which enables it to generate multimodal output. The model takes images and text as input, including in an interleaved format, and generates textual responses.

- Chameleon: add model by [@zucchini-nlp](https://togithub.com/zucchini-nlp) in [#31534](https://togithub.com/huggingface/transformers/issues/31534)
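A minimal inference sketch for Chameleon; the `facebook/chameleon-7b` checkpoint name and the `<image>` placeholder convention are assumptions taken from the model docs rather than from these notes:

```py
import requests
import torch
from PIL import Image
from transformers import ChameleonProcessor, ChameleonForConditionalGeneration

# Checkpoint id and prompt format are assumptions; check the model card.
processor = ChameleonProcessor.from_pretrained("facebook/chameleon-7b")
model = ChameleonForConditionalGeneration.from_pretrained(
    "facebook/chameleon-7b", torch_dtype=torch.bfloat16, device_map="auto"
)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
prompt = "What do you see in this image?<image>"  # <image> marks where the image is interleaved

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device, dtype=torch.bfloat16)
out = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(out[0], skip_special_tokens=True))
```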
#### ZoeDepth

The ZoeDepth model was proposed in [ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth](https://arxiv.org/abs/2302.12288) by Shariq Farooq Bhat, Reiner Birkl, Diana Wofk, Peter Wonka, and Matthias Müller. ZoeDepth extends the [DPT](https://huggingface.co/docs/transformers/main/en/model_doc/dpt) framework for metric (also called absolute) depth estimation. ZoeDepth is pre-trained on 12 datasets using relative depth and fine-tuned on two domains (NYU and KITTI) using metric depth. A lightweight head with a novel bin-adjustment design, the metric bins module, is used for each domain. During inference, each input image is automatically routed to the appropriate head using a latent classifier.

- Add ZoeDepth by [@NielsRogge](https://togithub.com/NielsRogge) in [#30136](https://togithub.com/huggingface/transformers/issues/30136)
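Since ZoeDepth plugs into the existing depth-estimation pipeline, usage can be sketched as below; the `Intel/zoedepth-nyu-kitti` checkpoint name is an assumption based on the two fine-tuning domains mentioned above:

```py
import requests
from PIL import Image
from transformers import pipeline

# Checkpoint name is an assumption; any ZoeDepth checkpoint should work here.
depth_estimator = pipeline("depth-estimation", model="Intel/zoedepth-nyu-kitti")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

result = depth_estimator(image)
result["depth"].save("depth.png")  # PIL rendering of the predicted depth map
```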
#### Hiera

Hiera was proposed in [Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles](https://arxiv.org/abs/2306.00989) by Chaitanya Ryali, Yuan-Ting Hu, Daniel Bolya, Chen Wei, Haoqi Fan, Po-Yao Huang, Vaibhav Aggarwal, Arkabandhu Chowdhury, Omid Poursaeed, Judy Hoffman, Jitendra Malik, Yanghao Li, and Christoph Feichtenhofer.

The paper introduces "Hiera," a hierarchical vision transformer that simplifies the architecture of modern hierarchical vision transformers by removing unnecessary components without compromising accuracy or efficiency. Unlike traditional transformers that add complex vision-specific components to improve supervised classification performance, Hiera demonstrates that such additions, often termed "bells and whistles," are not essential for high accuracy. By leveraging a strong visual pretext task (MAE) for pretraining, Hiera retains simplicity and achieves superior accuracy and speed in both inference and training across various image and video recognition tasks. The approach suggests that the spatial biases required for vision tasks can be learned effectively through proper pretraining, eliminating the need for added architectural complexity.

- Adding hiera by [@Namangarg110](https://togithub.com/Namangarg110) in [#30356](https://togithub.com/huggingface/transformers/issues/30356)

#### Agents

Our ReactAgent has a specific way to return its final output: it calls the tool `final_answer`, added to the user-defined toolbox upon agent initialization, with the answer as the tool argument. We found that even for a one-shot agent like CodeAgent, using a specific `final_answer` tool helps the `llm_engine` find what to return, so we generalized the `final_answer` tool to all agents.

- Adds final answer tool for all agents by [@aymeric-roucher](https://togithub.com/aymeric-roucher) in [#31703](https://togithub.com/huggingface/transformers/issues/31703)

Now, if your code-based agent (like ReactCodeAgent) defines a function at step 1, it will remember the function definition indefinitely. This means your agent can create its own tools for later reuse!

- Code agent: allow function persistence between steps by [@aymeric-roucher](https://togithub.com/aymeric-roucher) in [#31769](https://togithub.com/huggingface/transformers/issues/31769)

This is a transformative PR: it allows the agent to regularly run a specific step for planning its actions in advance. Planning is activated if you set an int for `planning_interval` upon agent initialization. At step 0, a first plan is made; at later steps (e.g. steps 3, 6, and 9 if you set `planning_interval=3`), this plan is updated by the agent based on the history of previous steps. More detail soon! A usage sketch appears at the end of these release notes.

- Agents planning by [@aymeric-roucher](https://togithub.com/aymeric-roucher) in [#31702](https://togithub.com/huggingface/transformers/issues/31702)

#### Notable changes to the codebase

A significant RoPE refactor was done to make it model-agnostic and more easily adaptable to any architecture. It is only applied to Llama for now, but will be applied to all models using RoPE over the coming days.

- Llama: RoPE refactor by [@gante](https://togithub.com/gante) in [#32135](https://togithub.com/huggingface/transformers/issues/32135)

#### Breaking changes

##### TextGenerationPipeline and tokenizer kwargs

🚨🚨 This PR changes the code to rely on the tokenizer's defaults when these flags are unset. This means some models using `TextGenerationPipeline` previously did not add a `` by default, which (negatively) impacted their performance. In practice, this is a breaking change.

Example of a script changed as a result of this PR:

```py
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-9b-it")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2-9b-it", torch_dtype=torch.bfloat16, device_map="auto")
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe("Foo bar"))
```

- 🚨🚨 TextGenerationPipeline: rely on the tokenizer default kwargs by [@gante](https://togithub.com/gante) in [#31747](https://togithub.com/huggingface/transformers/issues/31747)

#### Bugfixes and improvements

- Fix post gemma merge by [@ArthurZucker](https://togithub.com/ArthurZucker) in [#31660](https://togithub.com/huggingface/transformers/issues/31660)
- Fix float out of range in owlvit and owlv2 when using FP16 or lower precision by [@aliencaocao](https://togithub.com/aliencaocao) in [#31657](https://togithub.com/huggingface/transformers/issues/31657)
- \[docs] Llama3 by [@stevhliu](https://togithub.com/stevhliu) in [#31662](https://togithub.com/huggingface/transformers/issues/31662)
- \[HybridCache] Fix `get_seq_length` method by [@sanchit-gandhi](https://togithub.com/sanchit-gandhi) in [#31661](https://togithub.com/huggingface/transformers/issues/31661)
- don't zero out the attention_mask when using sliding window with flash attention by [@winglian](https://togithub.com/winglian) in [#31670](https://togithub.com/huggingface/transformers/issues/31670)
- Fix Gemma2 4d attention mask by [@hiyouga](https://togithub.com/hiyouga) in [#31674](https://togithub.com/huggingface/transformers/issues/31674)
- Fix return_dict in encodec by [@jla524](https://togithub.com/jla524) in [#31646](https://togithub.com/huggingface/transformers/issues/31646)
- add gather_use_object arguments by [@SangbumChoi](https://togithub.com/SangbumChoi) in [#31514](https://togithub.com/huggingface/transformers/issues/31514)
- Gemma capping is a must for big models by [@ArthurZucker](https://togithub.com/ArthurZucker) in [#31698](https://togithub.com/huggingface/transformers/issues/31698)
- Add French version of run scripts tutorial by [@jadechoghari](https://togithub.com/jadechoghari) in [#31483](https://togithub.com/huggingface/transformers/issues/31483)
- dependencies: `keras-nlp<0.14` pin by [@gante](https://togithub.com/gante) in [#31684](https://togithub.com/huggingface/transformers/issues/31684)
- remove incorrect urls pointing to the llava repository by [@BiliBraker](https://togithub.com/BiliBraker) in [#31107](https://togithub.com/huggingface/transformers/issues/31107)
- Move some test files (`tests/test_xxx_utils.py`) to `tests/utils` by [@ydshieh](https://togithub.com/ydshieh) in [#31730](https://togithub.com/huggingface/transformers/issues/31730)
- Fix mistral ONNX export by [@fxmarty](https://togithub.com/fxmarty) in [#31696](https://togithub.com/huggingface/transformers/issues/31696)
- \[whisper] static kv cache by [@sanchit-gandhi](https://togithub.com/sanchit-gandhi) in [#31166](https://togithub.com/huggingface/transformers/issues/31166)
- Make tool JSON schemas consistent by [@Rocketknight1](https://togithub.com/Rocketknight1) in [#31756](https://togithub.com/huggingface/transformers/issues/31756)
- Fix documentation for Gemma2 by [@jbornschein](https://togithub.com/jbornschein) in [#31682](https://togithub.com/huggingface/transformers/issues/31682)
- fix assisted decoding by [@jiqing-feng](https://togithub.com/jiqing-feng) in [#31401](https://togithub.com/huggingface/transformers/issues/31401)
- Requires for torch.tensor before casting by [@echarlaix](https://togithub.com/echarlaix) in [#31755](https://togithub.com/huggingface/transformers/issues/31755)
- handle (processor_class, None) returned by ModelPatterns by [@molbap](https://togithub.com/molbap) in [#31753](https://togithub.com/huggingface/transformers/issues/31753)
- Gemma 2: Update slow tests by [@gante](https://togithub.com/gante) in [#31759](https://togithub.com/huggingface/transformers/issues/31759)
- Add ignore_errors=True to trainer.py rmtree in \_inner_training_loop by [@njbrake](https://togithub.com/njbrake) in [#31668](https://togithub.com/huggingface/transformers/issues/31668)
- \[fix bug] logits's shape different from label's shape in preprocess_logits_for_metrics by [@wiserxin](https://togithub.com/wiserxin) in [#31447](https://togithub.com/huggingface/transformers/issues/31447)
- Fix RT-DETR cache for generate_anchors by [@qubvel](https://togithub.com/qubvel) in [#31671](https://togithub.com/huggingface/transformers/issues/31671)
- Fix RT-DETR weights initialization by [@qubvel](https://togithub.com/qubvel) in [#31724](https://togithub.com/huggingface/transformers/issues/31724)
- `pytest_num_workers=4` for some CircleCI jobs by [@ydshieh](https://togithub.com/ydshieh) in [#31764](https://togithub.com/huggingface/transformers/issues/31764)
- Fix Gemma2 types by [@hiyouga](https://togithub.com/hiyouga) in [#31779](https://togithub.com/huggingface/transformers/issues/31779)
- Add torch_empty_cache_steps to TrainingArguments by [@aliencaocao](https://togithub.com/aliencaocao) in [#31546](https://togithub.com/huggingface/transformers/issues/31546)
- Fix ClapProcessor to merge feature_extractor output into the returned BatchEncoding by [@mxkopy](https://togithub.com/mxkopy) in [#31767](https://togithub.com/huggingface/transformers/issues/31767)
- Fix serialization for offloaded model by [@SunMarc](https://togithub.com/SunMarc) in [#31727](https://togithub.com/huggingface/transformers/issues/31727)
- Make tensor device correct when ACCELERATE_TORCH_DEVICE is defined by [@kiszk](https://togithub.com/kiszk) in [#31751](https://togithub.com/huggingface/transformers/issues/31751)
- Exclude torch.compile time from metrics computation by [@zxd1997066](https://togithub.com/zxd1997066) in [#31443](https://togithub.com/huggingface/transformers/issues/31443)
- Update CometCallback to allow reusing of the running experiment by [@Lothiraldan](https://togithub.com/Lothiraldan) in [#31366](https://togithub.com/huggingface/transformers/issues/31366)
- Fix gemma tests by [@ydshieh](https://togithub.com/ydshieh) in [#31794](https://togithub.com/huggingface/transformers/issues/31794)
- Add training support for SigLIP by [@aliencaocao](https://togithub.com/aliencaocao) in [#31495](https://togithub.com/huggingface/transformers/issues/31495)
- Repeating an important warning in the chat template docs by [@Rocketknight1](https://togithub.com/Rocketknight1) in [#31796](https://togithub.com/huggingface/transformers/issues/31796)
- Allow FP16 or other precision inference for Pipelines by [@aliencaocao](https://togithub.com/aliencaocao) in [#31342](https://togithub.com/huggingface/transformers/issues/31342)
- Fix galore lr display with schedulers by [@vasqu](https://togithub.com/vasqu) in [#31710](https://togithub.com/huggingface/transformers/issues/31710)
- Fix Wav2Vec2 Fairseq conversion (weight norm state dict keys) by [@gau-nernst](https://togithub.com/gau-nernst) in [#31714](https://togithub.com/huggingface/transformers/issues/31714)
- Depth Anything: update conversion script for V2 by [@pcuenca](https://togithub.com/pcuenca) in [#31522](https://togithub.com/huggingface/transformers/issues/31522)
- Fix Seq2SeqTrainer crash when BatchEncoding data is None by [@iohub](https://togithub.com/iohub) in [#31418](https://togithub.com/huggingface/transformers/issues/31418)
- Bump certifi from 2023.7.22 to 2024.7.4 in /examples/research_projects/decision_transformer by [@dependabot](https://togithub.com/dependabot)\[bot] in [#31813](https://togithub.com/huggingface/transformers/issues/31813)
- Add FA2 and `sdpa` support for SigLIP by [@qubvel](https://togithub.com/qubvel) in [#31499](https://togithub.com/huggingface/transformers/issues/31499)
- Bump transformers from 4.26.1 to 4.38.0 in /examples/tensorflow/language-modeling-tpu by [@dependabot](https://togithub.com/dependabot)\[bot] in [#31837](https://togithub.com/huggingface/transformers/issues/31837)
- Bump certifi from 2023.7.22 to 2024.7.4 in /examples/research_projects/lxmert by [@dependabot](https://togithub.com/dependabot)\[bot] in [#31838](https://togithub.com/huggingface/transformers/issues/31838)
- Fix typos by [@omahs](https://togithub.com/omahs) in [#31819](https://togithub.com/huggingface/transformers/issues/31819)
- transformers.fx.symbolic_trace supports inputs_embeds by [@fxmarty](https://togithub.com/fxmarty) in [#31574](https://togithub.com/huggingface/transformers/issues/31574)
- Avoid failure `TFBlipModelTest::test_pipeline_image_to_text` by [@ydshieh](https://togithub.com/ydshieh) in [#31827](https://togithub.com/huggingface/transformers/issues/31827)
- Fix incorrect accelerator device handling for MPS in `TrainingArguments` by [@andstor](https://togithub.com/andstor) in [#31812](https://togithub.com/huggingface/transformers/issues/31812)
- Mamba & RecurrentGemma: enable strict signature by [@gante](https://togithub.com/gante) in [#31549](https://togithub.com/huggingface/transformers/issues/31549)
- Deprecate `vocab_size` in other two VLMs by [@zucchini-nlp](https://togithub.com/zucchini-nlp) in [#31681](https://togithub.com/huggingface/transformers/issues/31681)
- FX symbolic_trace: do not test decoder_inputs_embeds by [@fxmarty](https://togithub.com/fxmarty) in [#31840](https://togithub.com/huggingface/transformers/issues/31840)
- \[Grounding DINO] Add processor to auto mapping by [@NielsRogge](https://togithub.com/NielsRogge) in [#31845](https://togithub.com/huggingface/transformers/issues/31845)
- chore: remove duplicate words by [@hattizai](https://togithub.com/hattizai) in [#31853](https://togithub.com/huggingface/transformers/issues/31853)
- save_pretrained: use tqdm when saving checkpoint shards from offloaded params by [@kallewoof](https://togithub.com/kallewoof) in [#31856](https://togithub.com/huggingface/transformers/issues/31856)
- Test loading generation config with safetensor weights by [@gante](https://togithub.com/gante) in [#31550](https://togithub.com/huggingface/transformers/issues/31550)
- docs: typo in tf qa example by [@chen-keinan](https://togithub.com/chen-keinan) in [#31864](https://togithub.com/huggingface/transformers/issues/31864)
- Generate: Add new decoding strategy "DoLa" in `.generate()` by [@voidism](https://togithub.com/voidism) in [#29619](https://togithub.com/huggingface/transformers/issues/29619) (a usage sketch follows this list)
- Fix `_init_weights` for `ResNetPreTrainedModel` by [@ydshieh](https://togithub.com/ydshieh) in [#31851](https://togithub.com/huggingface/transformers/issues/31851)
- Update depth estimation task guide by [@merveenoyan](https://togithub.com/merveenoyan) in [#31860](https://togithub.com/huggingface/transformers/issues/31860)
- Bump zipp from 3.7.0 to 3.19.1 in /examples/research_projects/decision_transformer by [@dependabot](https://togithub.com/dependabot)\[bot] in [#31871](https://togithub.com/huggingface/transformers/issues/31871)
- Add return type annotation to PreTrainedModel.from_pretrained by [@mauvilsa](https://togithub.com/mauvilsa) in [#31869](https://togithub.com/huggingface/transformers/issues/31869)
- Revert "Fix `_init_weights` for `ResNetPreTrainedModel`" by [@ydshieh](https://togithub.com/ydshieh) in [#31868](https://togithub.com/huggingface/transformers/issues/31868)
- Bump certifi from 2023.7.22 to 2024.7.4 in /examples/research_projects/visual_bert by [@dependabot](https://togithub.com/dependabot)\[bot] in [#31872](https://togithub.com/huggingface/transformers/issues/31872)
- add warning when using gradient_checkpointing with FSDP full shard by [@yundai424](https://togithub.com/yundai424) in [#31578](https://togithub.com/huggingface/transformers/issues/31578)
- Add conversion for interleave llava by [@zucchini-nlp](https://togithub.com/zucchini-nlp) in [#31858](https://togithub.com/huggingface/transformers/issues/31858)
- remove duplicate words in msg by [@yukionfire](https://togithub.com/yukionfire) in [#31876](https://togithub.com/huggingface/transformers/issues/31876)
- Fix file type checks in data splits for contrastive training example script by [@npyoung](https://togithub.com/npyoung) in [#31720](https://togithub.com/huggingface/transformers/issues/31720)
- Fix failed tests in [#31851](https://togithub.com/huggingface/transformers/issues/31851) by [@ydshieh](https://togithub.com/ydshieh) in [#31879](https://togithub.com/huggingface/transformers/issues/31879)
- fix: Removed `duplicate` field definitions in some classes by [@Sai-Suraj-27](https://togithub.com/Sai-Suraj-27) in [#31888](https://togithub.com/huggingface/transformers/issues/31888)
- Push sharded checkpoint to hub when `push_to_hub=True` in `TrainingArguments` by [@SunMarc](https://togithub.com/SunMarc) in [#31808](https://togithub.com/huggingface/transformers/issues/31808)
- \[RT-DETR] Add resources by [@NielsRogge](https://togithub.com/NielsRogge) in [#31815](https://togithub.com/huggingface/transformers/issues/31815)
- Modify `warnings` in a `with` block to avoid flaky tests by [@ydshieh](https://togithub.com/ydshieh) in [#31893](https://togithub.com/huggingface/transformers/issues/31893)
- Add a condition for nested_detach by [@haikuoxin](https://togithub.com/haikuoxin) in [#31855](https://togithub.com/huggingface/transformers/issues/31855)
- InstructBlipVideo: Update docstring by [@zucchini-nlp](https://togithub.com/zucchini-nlp) in [#31886](https://togithub.com/huggingface/transformers/issues/31886)
- Fixes to alternating SWA layers in Gemma2 by [@turboderp](https://togithub.com/turboderp) in [#31775](https://togithub.com/huggingface/transformers/issues/31775)
- Processor accepts any kwargs by [@zucchini-nlp](https://togithub.com/zucchini-nlp) in [#31889](https://togithub.com/huggingface/transformers/issues/31889)
- \[`ConvertSlow`] make sure the order is preserved for addedtokens by [@ArthurZucker](https://togithub.com/ArthurZucker) in [#31902](https://togithub.com/huggingface/transformers/issues/31902)
- \[`Gemma2`] Support FA2 softcapping by [@ArthurZucker](https://togithub.com/ArthurZucker) in [#31887](https://togithub.com/huggingface/transformers/issues/31887)
- Fix missing methods for Fuyu by [@Isotr0py](https://togithub.com/Isotr0py) in [#31880](https://togithub.com/huggingface/transformers/issues/31880)
- fix: Fixed the `1st argument` name in classmethods by [@Sai-Suraj-27](https://togithub.com/Sai-Suraj-27) in [#31907](https://togithub.com/huggingface/transformers/issues/31907)
- add gather_use_object arguments II by [@SangbumChoi](https://togithub.com/SangbumChoi) in [#31799](https://togithub.com/huggingface/transformers/issues/31799)
- Add warning message for beta and gamma parameters by [@OmarManzoor](https://togithub.com/OmarManzoor) in [#31654](https://togithub.com/huggingface/transformers/issues/31654)
- Fix fx tests with inputs_embeds by [@fxmarty](https://togithub.com/fxmarty) in [#31862](https://togithub.com/huggingface/transformers/issues/31862)
- Refactor flash attention implementation in transformers by [@ArthurZucker](https://togithub.com/ArthurZucker) in [#31446](https://togithub.com/huggingface/transformers/issues/31446)
- Generate: fix `SlidingWindowCache.reset()` by [@gante](https://togithub.com/gante) in [#31917](https://togithub.com/huggingface/transformers/issues/31917)
- 🚨 fix(SigLip): remove spurious exclusion of first vision output token by [@transmissions11](https://togithub.com/transmissions11) in [#30952](https://togithub.com/huggingface/transformers/issues/30952)
- Allow `Trainer.get_optimizer_cls_and_kwargs` to be overridden by [@apoorvkh](https://togithub.com/apoorvkh) in [#31875](https://togithub.com/huggingface/transformers/issues/31875)
- \[Bug Fix] fix qa pipeline tensor to numpy by [@jiqing-feng](https://togithub.com/jiqing-feng) in [#31585](https://togithub.com/huggingface/transformers/issues/31585)
- Docker: TF pin on the consistency job by [@gante](https://togithub.com/gante) in [#31928](https://togithub.com/huggingface/transformers/issues/31928)
- fix prompt strip to support tensors and np arrays by [@AvivSham](https://togithub.com/AvivSham) in [#27818](https://togithub.com/huggingface/transformers/issues/27818)
- Fix `GenerationMixin.generate` compatibility with pytorch profiler by [@fxmarty](https://togithub.com/fxmarty) in [#31935](https://togithub.com/huggingface/transformers/issues/31935)
- Generate: remove deprecated code due to `Cache` and `cache_position` being default by [@gante](https://togithub.com/gante) in [#31898](https://togithub.com/huggingface/transformers/issues/31898)
- Generate: v4.42 deprecations 🧹🧹 by [@gante](https://togithub.com/gante) in [#31956](https://togithub.com/huggingface/transformers/issues/31956)
- Whisper: move to tensor cpu before converting to np array at decode time by [@gante](https://togithub.com/gante) in [#31954](https://togithub.com/huggingface/transformers/issues/31954)
- fix: Removed a wrong key-word argument in `sigmoid_focal_loss()` function call by [@Sai-Suraj-27](https://togithub.com/Sai-Suraj-27) in [#31951](https://togithub.com/huggingface/transformers/issues/31951)
- Generate: handle `logits_warper` update in models with custom generate fn by [@gante](https://togithub.com/gante) in [#31957](https://togithub.com/huggingface/transformers/issues/31957)
- fix: Fixed the arguments in `create_repo()` function call by [@Sai-Suraj-27](https://togithub.com/Sai-Suraj-27) in [#31947](https://togithub.com/huggingface/transformers/issues/31947)
- Notify new docker images built for circleci by [@ydshieh](https://togithub.com/ydshieh) in [#31701](https://togithub.com/huggingface/transformers/issues/31701)
- Avoid race condition by [@ydshieh](https://togithub.com/ydshieh) in [#31973](https://togithub.com/huggingface/transformers/issues/31973)
- Masking: remove flakiness from test by [@gante](https://togithub.com/gante) in [#31939](https://togithub.com/huggingface/transformers/issues/31939)
- Generate: doc nits by [@gante](https://togithub.com/gante) in [#31982](https://togithub.com/huggingface/transformers/issues/31982)
- Fix the incorrect permutation of gguf by [@PenutChen](https://togithub.com/PenutChen) in [#31788](https://togithub.com/huggingface/transformers/issues/31788)
- Cambricon MLUs support SDPA and flash_attn by [@huismiling](https://togithub.com/huismiling) in [#31102](https://togithub.com/huggingface/transformers/issues/31102)
- Speedup model init on CPU (by 10x+ for llama-3-8B as one example) by [@muellerzr](https://togithub.com/muellerzr) in [#31771](https://togithub.com/huggingface/transformers/issues/31771)
- \[tests] fix deepspeed zero3 config for `test_stage3_nvme_offload` by [@faaany](https://togithub.com/faaany) in [#31881](https://togithub.com/huggingface/transformers/issues/31881)
- Fix bad test about slower init by [@muellerzr](https://togithub.com/muellerzr) in [#32002](https://togithub.com/huggingface/transformers/issues/32002)
- Tests: remove cuda versions when the result is the same 🧹🧹 by [@gante](https://togithub.com/gante) in [#31955](https://togithub.com/huggingface/transformers/issues/31955)
- Bug report update by [@gante](https://togithub.com/gante) in [#31983](https://togithub.com/huggingface/transformers/issues/31983)
- add flash-attn deterministic option to flash-attn>=2.4.1 by [@junrae6454](https://togithub.com/junrae6454) in [#31961](https://togithub.com/huggingface/transformers/issues/31961)
- fix: Fixed incorrect dictionary assignment in `src/transformers/__init__.py` by [@Sai-Suraj-27](https://togithub.com/Sai-Suraj-27) in [#31993](https://togithub.com/huggingface/transformers/issues/31993)
- Bug report update -- round 2 by [@gante](https://togithub.com/gante) in [#32006](https://togithub.com/huggingface/transformers/issues/32006)
- Fix gather when collecting 'num_input_tokens_seen' by [@CodeCreator](https://togithub.com/CodeCreator) in [#31974](https://togithub.com/huggingface/transformers/issues/31974)
- Fix if else and *actually* enable superfast init by [@muellerzr](https://togithub.com/muellerzr) in [#32007](https://togithub.com/huggingface/transformers/issues/32007)
- SpeechEncoderDecoder doesn't support param buffer assignments by [@muellerzr](https://togithub.com/muellerzr) in [#32009](https://togithub.com/huggingface/transformers/issues/32009)
- Fix tests skip by [@qubvel](https://togithub.com/qubvel) in [#32012](https://togithub.com/huggingface/transformers/issues/32012)
- Fixed `log messages` that are resulting in TypeError due to too many arguments by [@Sai-Suraj-27](https://togithub.com/Sai-Suraj-27) in [#32017](https://togithub.com/huggingface/transformers/issues/32017)
- Fix typo in classification function selection logic to improve code consistency by [@moses](https://togithub.com/moses) in [#32031](https://togithub.com/huggingface/transformers/issues/32031)
- doc: fix broken BEiT and DiNAT model links on Backbone page by [@dvrogozh](https://togithub.com/dvrogozh) in [#32029](https://togithub.com/huggingface/transformers/issues/32029)
- Pass missing arguments to `SeamlessM4Tv2ConformerEncoderLayer.forward()` when gradient checkpointing is enabled by [@anferico](https://togithub.com/anferico) in [#31945](https://togithub.com/huggingface/transformers/issues/31945)
- Add language to word timestamps for Whisper by [@robinderat](https://togithub.com/robinderat) in [#31572](https://togithub.com/huggingface/transformers/issues/31572)
- Add `sdpa` and FA2 for CLIP by [@qubvel](https://togithub.com/qubvel) in [#31940](https://togithub.com/huggingface/transformers/issues/31940)
- unpin `numpy<2.0` by [@ydshieh](https://togithub.com/ydshieh) in [#32018](https://togithub.com/huggingface/transformers/issues/32018)
- Chameleon: minor fixes after shipping by [@zucchini-nlp](https://togithub.com/zucchini-nlp) in [#32037](https://togithub.com/huggingface/transformers/issues/32037)
- Bump scikit-learn from 1.0.2 to 1.5.0 in /examples/research_projects/decision_transformer by [@dependabot](https://togithub.com/dependabot)\[bot] in [#31458](https://togithub.com/huggingface/transformers/issues/31458)
- Bump scikit-learn from 1.1.2 to 1.5.0 in /examples/research_projects/codeparrot/examples by [@dependabot](https://togithub.com/dependabot)\[bot] in [#32052](https://togithub.com/huggingface/transformers/issues/32052)
- \[mistral] Support passing `head_dim` through config (and do not require `head_dim * num_heads == hidden_size`) by [@xenova](https://togithub.com/xenova) in [#32050](https://togithub.com/huggingface/transformers/issues/32050)
- Add torch.compile Support For Mamba by [@zhenglongjiepheonix](https://togithub.com/zhenglongjiepheonix) in [#31247](https://togithub.com/huggingface/transformers/issues/31247)
- fix: Removed `duplicate entries` in a dictionary by [@Sai-Suraj-27](https://togithub.com/Sai-Suraj-27) in [#32041](https://togithub.com/huggingface/transformers/issues/32041)
- docs: Fixed 2 links in the docs along with some minor fixes by [@Sai-Suraj-27](https://togithub.com/Sai-Suraj-27) in [#32058](https://togithub.com/huggingface/transformers/issues/32058)
- Llava: add default chat templates by [@zucchini-nlp](https://togithub.com/zucchini-nlp) in [#31691](https://togithub.com/huggingface/transformers/issues/31691)
- \[Chameleon, Hiera] Improve docs by [@NielsRogge](https://togithub.com/NielsRogge) in [#32038](https://togithub.com/huggingface/transformers/issues/32038)
- Incorrect Whisper long-form decoding timestamps by [@kamilakesbi](https://togithub.com/kamilakesbi) in [#32003](https://togithub.com/huggingface/transformers/issues/32003)
- \[mistral] Fix FA2 attention reshape for Mistral Nemo by [@xenova](https://togithub.com/xenova) in [#32065](https://togithub.com/huggingface/transformers/issues/32065)
- VideoLLaVa: fix chat format in docs by [@zucchini-nlp](https://togithub.com/zucchini-nlp) in [#32083](https://togithub.com/huggingface/transformers/issues/32083)
- Fix progress callback deepcopy by [@fozziethebeat](https://togithub.com/fozziethebeat) in [#32070](https://togithub.com/huggingface/transformers/issues/32070)
- Fixes to chameleon docs by [@merveenoyan](https://togithub.com/merveenoyan) in [#32078](https://togithub.com/huggingface/transformers/issues/32078)
- Add image-text-to-text task guide by [@merveenoyan](https://togithub.com/merveenoyan) in [#31777](https://togithub.com/huggingface/transformers/issues/31777)
- Support generating with fallback for short form audio in Whisper by [@kamilakesbi](https://togithub.com/kamilakesbi) in [#30984](https://togithub.com/huggingface/transformers/issues/30984)
- Disable quick init for deepspeed by [@muellerzr](https://togithub.com/muellerzr) in [#32066](https://togithub.com/huggingface/transformers/issues/32066)
- Chameleon: not supported with fast load by [@zucchini-nlp](https://togithub.com/zucchini-nlp) in [#32091](https://togithub.com/huggingface/transformers/issues/32091)
- Fix tests after `huggingface_hub` 0.24 by [@Wauplin](https://togithub.com/Wauplin) in [#32054](https://togithub.com/huggingface/transformers/issues/32054)
- Fix shard order by [@b-chu](https://togithub.com/b-chu) in [#32023](https://togithub.com/huggingface/transformers/issues/32023)
- Generate: store special token tensors under a unique variable name by [@gante](https://togithub.com/gante) in [#31980](https://togithub.com/huggingface/transformers/issues/31980)
- fix: Replaced deprecated `mktemp()` function by [@Sai-Suraj-27](https://togithub.com/Sai-Suraj-27) in [#32123](https://togithub.com/huggingface/transformers/issues/32123)
- Mention model_info.id instead of model_info.modelId by [@Wauplin](https://togithub.com/Wauplin) in [#32106](https://togithub.com/huggingface/transformers/issues/32106)
- \[generate] fix eos/pad id check on mps devices by [@sanchit-gandhi](https://togithub.com/sanchit-gandhi) in [#31695](https://togithub.com/huggingface/transformers/issues/31695)
- Fix failing test with race condition by [@Rocketknight1](https://togithub.com/Rocketknight1) in [#32140](https://togithub.com/huggingface/transformers/issues/32140)
- Update `ko/_toctree.yml` and remove `custom_tools.md` to reflect latest changes by [@jungnerd](https://togithub.com/jungnerd) in [#31969](https://togithub.com/huggingface/transformers/issues/31969)
- fix: Fixed raising `TypeError` instead of `ValueError` for invalid type by [@Sai-Suraj-27](https://togithub.com/Sai-Suraj-27) in [#32111](https://togithub.com/huggingface/transformers/issues/32111)
- \[RoBERTa] Minor clarifications to model doc by [@bt2513](https://togithub.com/bt2513) in [#31949](https://togithub.com/huggingface/transformers/issues/31949)
- Return assistant generated tokens mask in apply_chat_template by [@yonigottesman](https://togithub.com/yonigottesman) in [#30650](https://togithub.com/huggingface/transformers/issues/30650)
- Don't default to other weights file when use_safetensors=True by [@amyeroberts](https://togithub.com/amyeroberts) in [#31874](https://togithub.com/huggingface/transformers/issues/31874)
- set warning level to info for special tokens have been added by [@ArthurZucker](https://togithub.com/ArthurZucker) in [#32138](https://togithub.com/huggingface/transformers/issues/32138)
- Add new quant method by [@SunMarc](https://togithub.com/SunMarc) in [#32047](https://togithub.com/huggingface/transformers/issues/32047)
- Add llama3-llava-next-8b to llava_next conversion script by [@jamt9000](https://togithub.com/jamt9000) in [#31395](https://togithub.com/huggingface/transformers/issues/31395)
- LLaVaNeXT: pad on right if training by [@zucchini-nlp](https://togithub.com/zucchini-nlp) in [#32134](https://togithub.com/huggingface/transformers/issues/32134)
- Remove `trust_remote_code` when loading Libri Dummy by [@sanchit-gandhi](https://togithub.com/sanchit-gandhi) in [#31748](https://togithub.com/huggingface/transformers/issues/31748)
- \[modelling] remove un-necessary transpose for fa2 attention by [@sanchit-gandhi](https://togithub.com/sanchit-gandhi) in [#31749](https://togithub.com/huggingface/transformers/issues/31749)
- Fix mask creations of `GPTNeoX` and `GPT2` by [@vasqu](https://togithub.com/vasqu) in [#31944](https://togithub.com/huggingface/transformers/issues/31944)
- Add method to retrieve used chat template by [@KonradSzafer](https://togithub.com/KonradSzafer) in [#32032](https://togithub.com/huggingface/transformers/issues/32032)
- Add YaRN and Dynamic-YaRN RoPE Scaling Methods by [@mig-mfreitas](https://togithub.com/mig-mfreitas) in [#30910](https://togithub.com/huggingface/transformers/issues/30910)
- Disable quick init for TapasPreTrainedModel by [@daniellok-db](https://togithub.com/daniellok-db) in [#32149](https://togithub.com/huggingface/transformers/issues/32149)
- Modify resize_token_embeddings to ensure output type is same as input by [@bayllama](https://togithub.com/bayllama) in [#31979](https://togithub.com/huggingface/transformers/issues/31979)
- gguf conversion add_prefix_space=None for llama3 by [@itazap](https://togithub.com/itazap) in [#31937](https://togithub.com/huggingface/transformers/issues/31937)
- Fix flash attention speed issue by [@Cyrilvallez](https://togithub.com/Cyrilvallez) in [#32028](https://togithub.com/huggingface/transformers/issues/32028)
- Fix video batching to videollava by [@merveenoyan](https://togithub.com/merveenoyan) in [#32139](https://togithub.com/huggingface/transformers/issues/32139)
- Added mamba.py backend by [@alxndrTL](https://togithub.com/alxndrTL) in [#30139](https://togithub.com/huggingface/transformers/issues/30139)
- Rename Phi-3 rope scaling type by [@garg-amit](https://togithub.com/garg-amit) in [#31436](https://togithub.com/huggingface/transformers/issues/31436)
- Revert "Incorrect Whisper long-form decoding timestamps" by [@sanchit-gandhi](https://togithub.com/sanchit-gandhi) in [#32148](https://togithub.com/huggingface/transformers/issues/32148)
- Fix typing to be compatible with later py versions by [@amyeroberts](https://togithub.com/amyeroberts) in [#32155](https://togithub.com/huggingface/transformers/issues/32155)
- feat(cache): StaticCache uses index_copy\_ to avoid useless copy by [@tengomucho](https://togithub.com/tengomucho) in [#31857](https://togithub.com/huggingface/transformers/issues/31857)
- Added additional kwarg for successful running of optuna hyperparameter search by [@DeF0017](https://togithub.com/DeF0017) in [#31924](https://togithub.com/huggingface/transformers/issues/31924)
- Enhancing SFT Training Efficiency Using Packing and FlashAttention2 with Position IDs by [@RhuiDih](https://togithub.com/RhuiDih) in [#31629](https://togithub.com/huggingface/transformers/issues/31629)
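The DoLa entry in the list above adds a decoding strategy to `.generate()`; a minimal sketch, where the model choice, the `dola_layers` preset, and the suggested `repetition_penalty` are assumptions taken from the feature's documentation:

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model choice is arbitrary; DoLa contrasts logits from earlier layers
# with the final layer to sharpen factual outputs.
tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b", torch_dtype=torch.float16, device_map="auto"
)

inputs = tokenizer("What is the capital of France?", return_tensors="pt").to(model.device)

out = model.generate(
    **inputs,
    max_new_tokens=32,
    dola_layers="high",      # "low" and explicit layer-index lists are also documented
    repetition_penalty=1.2,  # recommended with DoLa to reduce repetition
    do_sample=False,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```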
#### Significant community contributions

The following contributors have made significant changes to the library over the last release:

- [@aliencaocao](https://togithub.com/aliencaocao)
  - Fix float out of range in owlvit and owlv2 when using FP16 or lower precision ([#31657](https://togithub.com/huggingface/transformers/issues/31657))
  - Add torch_empty_cache_steps to TrainingArguments ([#31546](https://togithub.com/huggingface/transformers/issues/31546))
  - Add training support for SigLIP ([#31495](https://togithub.com/huggingface/transformers/issues/31495))
  - Allow FP16 or other precision inference for Pipelines ([#31342](https://togithub.com/huggingface/transformers/issues/31342))
- [@voidism](https://togithub.com/voidism)
  - Generate: Add new decoding strategy "DoLa" in `.generate()` ([#29619](https://togithub.com/huggingface/transformers/issues/29619))
- [@Namangarg110](https://togithub.com/Namangarg110)
  - Adding hiera ([#30356](https://togithub.com/huggingface/transformers/issues/30356))
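Finally, the agents planning feature referenced earlier in these notes can be sketched as follows; the `ReactCodeAgent` and `HfEngine` names are taken from this release's agents API, and the engine's model id is an arbitrary choice for illustration:

```py
from transformers.agents import HfEngine, ReactCodeAgent

# Model id is an arbitrary choice for illustration.
llm_engine = HfEngine("meta-llama/Meta-Llama-3-70B-Instruct")

agent = ReactCodeAgent(
    tools=[],             # the final_answer tool is injected automatically
    llm_engine=llm_engine,
    planning_interval=3,  # plan at step 0, then update the plan at steps 3, 6, 9, ...
)

print(agent.run("How many seconds are there in a leap year?"))
```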

Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Enabled.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.



This PR was generated by Mend Renovate. View the repository job log.