Bump transformers from 4.22.2 to 4.23.1

Bumps transformers from 4.22.2 to 4.23.1.
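After merging a bump like this one, it can be useful to confirm that the environment actually picked up the new release. The snippet below is a minimal sketch using only the standard library. The helper names `release_tuple` and `meets_minimum` are hypothetical and not part of transformers or Dependabot, and the parsing is a simplification that only understands plain numeric `X.Y.Z` prefixes.

```python
from importlib import metadata
from typing import Optional, Tuple


def release_tuple(version: str) -> Tuple[int, ...]:
    """Parse the leading numeric components of a version string.

    Simplification (assumption): parsing stops at the first component that
    is not purely digits, so "4.23.0.dev0" becomes (4, 23, 0).
    """
    parts = []
    for part in version.split("."):
        if not part.isdigit():
            break
        parts.append(int(part))
    return tuple(parts)


def meets_minimum(package: str, minimum: str) -> Optional[bool]:
    """Return True/False for an installed package, or None if it is absent."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return release_tuple(installed) >= release_tuple(minimum)


if __name__ == "__main__":
    # Prints True, False, or None depending on the local environment.
    print(meets_minimum("transformers", "4.23.1"))
```

For anything beyond a quick sanity check, a full version parser that handles pre-releases and local versions is the safer choice.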

Release notes

v4.23.1 Patch release

Fixes a revert, introduced by mistake, that broke the "automatic-speech-recognition" pipeline for Whisper.

Fix whisper for pipeline by @ArthurZucker in #19482

v4.23.0: Whisper, Deformable DETR, Conditional DETR, MarkupLM, MSN, safetensors

Whisper

The Whisper model was proposed in Robust Speech Recognition via Large-Scale Weak Supervision by Alec Radford, Jong Wook Kim, Tao Xu, Greg Brockman, Christine McLeavey, Ilya Sutskever.

Whisper is an encoder-decoder Transformer trained on 680,000 hours of labeled (transcribed) audio. The model shows impressive performance and robustness in a zero-shot setting, in multiple languages.

Add WhisperModel to transformers by @ArthurZucker in #19166

Add TF whisper by @amyeroberts in #19378

Deformable DETR

The Deformable DETR model was proposed in Deformable DETR: Deformable Transformers for End-to-End Object Detection by Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, Jifeng Dai.

Deformable DETR mitigates the slow convergence issues and limited feature spatial resolution of the original DETR by leveraging a new deformable attention module which only attends to a small set of key sampling points around a reference.
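The idea of attending only to a few sampling points around a reference can be illustrated with a toy, one-dimensional sketch. This is not the Deformable DETR implementation: the real module works on multi-scale 2D feature maps with multiple heads and predicts the offsets and attention scores from the query. Here the offsets and scores are passed in directly, and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_linear(values: Sequence[float], position: float) -> float:
    """Linearly interpolate a 1D feature row at a fractional position.

    Positions outside the row are clamped to its edges.
    """
    position = min(max(position, 0.0), len(values) - 1.0)
    left = int(math.floor(position))
    right = min(left + 1, len(values) - 1)
    frac = position - left
    return (1.0 - frac) * values[left] + frac * values[right]


def deformable_attend_1d(
    values: Sequence[float],
    reference: float,
    offsets: Sequence[float],
    scores: Sequence[float],
) -> float:
    """Weighted sum over a handful of sampled points near `reference`.

    Only len(offsets) positions are read, no matter how long `values` is,
    which is the property the paragraph above describes.
    """
    weights = softmax(scores)
    return sum(
        w * sample_linear(values, reference + o)
        for w, o in zip(weights, offsets)
    )


if __name__ == "__main__":
    features = [0.0, 10.0, 20.0, 30.0, 40.0]
    # Two sampling points at 1.5 and 2.5 with equal weight.
    print(deformable_attend_1d(features, 2.0, [-0.5, 0.5], [0.0, 0.0]))  # 20.0
```

The cost per query depends on the number of sampling points rather than on the length of the feature row, in contrast to dense attention, which reads every position.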

Add Deformable DETR by @NielsRogge in #17281

[fix] Add DeformableDetrFeatureExtractor by @NielsRogge in #19140

Conditional DETR

The Conditional DETR model was proposed in Conditional DETR for Fast Training Convergence by Depu Meng, Xiaokang Chen, Zejia Fan, Gang Zeng, Houqiang Li, Yuhui Yuan, Lei Sun, Jingdong Wang.

Conditional DETR presents a conditional cross-attention mechanism for fast DETR training. Conditional DETR converges 6.7× to 10× faster than DETR.

Add support for conditional detr by @DeppMeng in #18948

Improve conditional detr docs by @NielsRogge in #19154

Time Series Transformer

The Time Series Transformer model is a vanilla encoder-decoder Transformer for time series forecasting.

The model is trained in a similar way to how one would train an encoder-decoder Transformer (like T5 or BART) for machine translation; i.e. teacher forcing is used. At inference time, one can autoregressively generate samples, one time step at a time.
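The difference between teacher forcing and autoregressive generation can be sketched with a toy stand-in for the decoder. This is only an illustration under stated assumptions: `last_difference_step` is a hypothetical deterministic rule, not the Time Series Transformer, which samples from a predicted distribution. None of the function names below come from the library.

```python
from typing import Callable, List, Sequence

Step = Callable[[Sequence[float]], float]


def last_difference_step(history: Sequence[float]) -> float:
    """Hypothetical one-step predictor: repeat the most recent difference."""
    return history[-1] + (history[-1] - history[-2])


def teacher_forced_predictions(
    step: Step, context: Sequence[float], targets: Sequence[float]
) -> List[float]:
    """Training-style pass: every prediction sees the true past values."""
    history = list(context)
    predictions = []
    for true_value in targets:
        predictions.append(step(history))
        history.append(true_value)  # feed the ground truth, not the prediction
    return predictions


def autoregressive_forecast(
    step: Step, context: Sequence[float], horizon: int
) -> List[float]:
    """Inference-style pass: each prediction is fed back as the next input."""
    history = list(context)
    predictions = []
    for _ in range(horizon):
        predicted = step(history)
        predictions.append(predicted)
        history.append(predicted)  # no ground truth is available here
    return predictions


if __name__ == "__main__":
    context = [1.0, 2.0]
    print(teacher_forced_predictions(last_difference_step, context, [4.0, 8.0]))  # [3.0, 6.0]
    print(autoregressive_forecast(last_difference_step, context, 2))  # [3.0, 4.0]
```

The two outputs diverge after the first step because the teacher-forced pass is corrected by the true value 4.0, while the autoregressive pass continues from its own prediction of 3.0.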

:warning: This is a recently introduced model and modality, so the API hasn't been tested extensively. There may be some bugs, or slight breaking changes to fix them in the future. If you see something strange, file a GitHub issue.

time series forecasting model by @kashif in #17965

Masked Siamese Networks

The ViTMSN model was proposed in Masked Siamese Networks for Label-Efficient Learning by Mahmoud Assran, Mathilde Caron, Ishan Misra, Piotr Bojanowski, Florian Bordes, Pascal Vincent, Armand Joulin, Michael Rabbat, Nicolas Ballas.

MSN (Masked Siamese Networks) consists of a joint-embedding architecture that matches the prototypes of masked patches with those of the unmasked patches. With this setup, the method yields excellent performance in the low-shot and extreme low-shot regimes for image classification, outperforming other self-supervised methods such as DINO. For instance, with 1% of ImageNet-1K labels, the method achieves 75.7% top-1 accuracy.

MSN (Masked Siamese Networks) for ViT by @sayakpaul in #18815

... (truncated)

Commits

bd469c4 Release: v4.23.1
c8bc0a0 Fix whisper for pipeline (#19482)
9ae22fe Release: v4.23.0
df2f281 wrap forward passes with torch.no_grad() (#19412)
5f5e264 wrap forward passes with torch.no_grad() (#19413)
c6a928c wrap forward passes with torch.no_grad() (#19414)
d739a70 wrap forward passes with torch.no_grad() (#19416)
870a954 wrap forward passes with torch.no_grad() (#19438)
692c5be wrap forward passes with torch.no_grad() (#19439)
a7bc422 fix (#19469)
Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

megagonlabs / bunkai

Bump transformers from 4.22.2 to 4.23.1 #169
