Update transformers requirement from <4.20,>=4.1 to >=4.1,<4.21

Updates the requirements on transformers to permit the latest version.
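
As a rough illustration of what the new specifier `>=4.1,<4.21` admits, the sketch below checks a few version strings using plain tuple comparison. The helper names `parse` and `in_range` are hypothetical, and the parsing assumes purely numeric release versions; real packaging tools apply the full PEP 440 rules (pre-releases, post-releases and so on), which this does not attempt.

```python
def parse(version: str) -> tuple:
    """Turn a purely numeric version such as '4.20.0' into (4, 20, 0)."""
    return tuple(int(part) for part in version.split("."))


def in_range(version: str, lower: str = "4.1", upper: str = "4.21") -> bool:
    """Return True when lower <= version < upper (simplified, numeric-only)."""
    v = parse(version)
    return parse(lower) <= v < parse(upper)


# The candidate strings are illustrative inputs, not a list of real releases.
for candidate in ["4.0.1", "4.1", "4.19.2", "4.20.0", "4.21.0"]:
    print(candidate, in_range(candidate))
```

Under these assumptions the 4.20.x series falls inside the range, while 4.21.0 and later stay excluded.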

Release notes

v4.20.0 Big Model inference, BLOOM, CvT, GPT Neo-X, LayoutLMv3, LeViT, LongT5, M-CTC-T, Trajectory Transformer and Wav2Vec2-Conformer

Big model inference

You can now use Accelerate's big model inference directly in any call to from_pretrained by specifying device_map="auto" (or your own device_map). The model is loaded onto your GPU(s) first; whatever does not fit is offloaded to CPU RAM, and then to the hard drive if RAM runs out. The model can then be used for inference as usual, with no further setup.
from transformers import AutoModelForSeq2SeqLM
model = AutoModelForSeq2SeqLM.from_pretrained(
    "bigscience/T0pp", revision="sharded", device_map="auto"
)
Use Accelerate in from_pretrained for big model inference by @sgugger in #17341
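
The "GPU first, then RAM, then disk" idea can be pictured with a small pure-Python sketch. This is only a greedy illustration under stated assumptions, not Accelerate's actual placement logic: the function `plan_placement`, the layer names, the sizes and the device labels (`gpu:0`, `cpu`, `disk`) are all hypothetical.

```python
def plan_placement(layer_sizes, budgets):
    """Assign each layer, in order, to the first device with room left.

    layer_sizes: dict of layer name -> size (arbitrary units), in model order.
    budgets: dict of device name -> capacity, ordered fastest to slowest.
    """
    remaining = dict(budgets)
    devices = list(budgets)
    current = 0  # index of the device currently being filled
    placement = {}
    for name, size in layer_sizes.items():
        # Move on to slower devices until one has room for this layer.
        while current < len(devices) and remaining[devices[current]] < size:
            current += 1
        if current == len(devices):
            raise MemoryError(f"no device can hold {name}")
        device = devices[current]
        placement[name] = device
        remaining[device] -= size
    return placement


# Hypothetical model layout and memory budgets, in arbitrary units.
layers = {"embed": 2, "block.0": 4, "block.1": 4, "block.2": 4, "lm_head": 2}
budgets = {"gpu:0": 7, "cpu": 8, "disk": 100}
print(plan_placement(layers, budgets))
```

With these made-up numbers the first two layers land on the GPU, the next two in CPU memory, and the last one on disk, which mirrors the spill-over order described in the release note.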

BLOOM

The BLOOM model was proposed, in several versions, through the BigScience Workshop. Its architecture is essentially similar to GPT3 (an auto-regressive model for next-token prediction), but it has been trained on 46 different languages, including code.

BLOOM by @younesbelkada in #17474

CvT

The Convolutional vision Transformer (CvT) improves the Vision Transformer (ViT) in performance and efficiency by introducing convolutions into ViT to yield the best of both designs.

Add CvT by @NielsRogge and @AnugunjNaman in #17299

GPT Neo-X

GPT-NeoX-20B is a 20 billion parameter autoregressive language model trained on the Pile, whose weights are made freely and openly available to the public through a permissive license. GPT-NeoX-20B is a particularly powerful few-shot reasoner and gains far more in performance when evaluated five-shot than similarly sized GPT-3 and FairSeq models.

Adding GPT-NeoX-20B by @zphang in #16659

LayoutLMv3

LayoutLMv3 simplifies LayoutLMv2 by using patch embeddings (as in ViT) instead of leveraging a CNN backbone, and pre-trains the model on 3 objectives: masked language modeling (MLM), masked image modeling (MIM) and word-patch alignment (WPA).

Add LayoutLMv3 by @NielsRogge in #17060

LeViT

LeViT improves on the Vision Transformer (ViT) in performance and efficiency through a few architectural differences, such as activation maps with decreasing resolutions in Transformers and an attention bias that integrates positional information.

Adding LeViT Model by Facebook by @AnugunjNaman in #17466

LongT5

The LongT5 model is an extension of the T5 model that supports one of two efficient attention mechanisms: (1) local attention or (2) transient-global attention. It can handle input sequences of up to 16,384 tokens.
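
To give a feel for why local attention scales to long inputs, here is a minimal sliding-window mask in plain Python. It is a sketch of the general idea only, not LongT5's implementation: the function name `local_attention_mask` and the `radius` parameter are hypothetical, and transient-global attention is not covered.

```python
def local_attention_mask(seq_len: int, radius: int):
    """Return a seq_len x seq_len matrix; True means query i may attend to key j."""
    return [
        [abs(i - j) <= radius for j in range(seq_len)]
        for i in range(seq_len)
    ]


mask = local_attention_mask(seq_len=6, radius=1)
for row in mask:
    # 'x' marks an allowed query/key pair, '.' a masked one.
    print("".join("x" if allowed else "." for allowed in row))

# For a fixed radius the number of allowed pairs grows linearly with seq_len,
# whereas full attention considers seq_len ** 2 pairs.
allowed = sum(sum(row) for row in mask)
print(allowed, "of", 6 * 6)
```

Each token only looks at its immediate neighbours, so the work per token stays constant as the sequence grows.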

... (truncated)

Commits

39b4aba Release: v4.20.0
90c8c01 Refine Bf16 test for deepspeed (#17734)
f8c8f4d Fix tf shared embedding (#17730)
3981ee8 Sort the model doc Toc Alphabetically (#17723)
66f8933 normalize keys_to_ignore (#17722)
c3c62b5 CLI: Add flag to push TF weights directly into main (#17720)
6ebeeee Update requirements.txt (#17719)
50415b8 Revert "Change push CI to run on workflow_run event (#17692)" (#17717)
7f14839 [Wav2Vec2Conformer] Official release (#17709)
242cc6e Documentation: RemBERT fixes (#17641)
Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

allenai / allennlp

Update transformers requirement from <4.20,>=4.1 to >=4.1,<4.21 #5675
