The LLaMA model was proposed in LLaMA: Open and Efficient Foundation Language Models. It is a collection of foundation language models ranging from 7B to 65B parameters. You can request access to the weights here then use the conversion script to generate a checkpoint compatible with Hugging Face
Pix2Struct is a pretrained image-to-text model for purely visual language understanding, which can be finetuned on tasks containing visually-situated language. Pix2Struct has been fine-tuned on various tasks and datasets, ranging from image captioning and visual question answering (VQA) over different inputs (books, charts, science diagrams) to captioning UI components, and others.
MEGA proposes a new approach to self-attention with each encoder layer having a multi-headed exponential moving average in addition to a single head of standard dot-product attention, giving the attention mechanism stronger positional biases. This allows MEGA to perform competitively to Transformers on standard benchmarks including LRA while also having significantly fewer parameters. MEGA’s compute efficiency allows it to scale to very long sequences, making it an attractive option for long-document NLP tasks.
Add Mega: Moving Average Equipped Gated Attention by @mnaylor5 in #21766
GPTBigCode
The model is a an optimized GPT2 model with support for Multi-Query Attention.
Add GPTBigCode model (Optimized GPT2 with MQA from Santacoder & BigCode) by @jlamypoirier in #22575
NLLB-MoE
The mixture of experts version of the NLLB release has been added to the library.
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
Bumps transformers from 4.19.2 to 4.28.1.
Release notes
Sourced from transformers's releases.
... (truncated)
Commits
04ab560
Patch release: v4.28.1e1b4e8e
Revert (for now) the change onDeta
in #22437 (#22750)9417c92
Release: v4.28.072a978f
[Pix2struct] Simplify generation (#22527)50f82e1
Fix docstrings for TF BLIP (#22618)ce06e47
Update warning levels (#22727)9858195
add fast support and option (#22724)10fab90
torch.distributed
group initialization fortorch_neuron
disabled when `op...1306b7d
[tests] switch to torchrun (#22712)d87ef00
Modify pipeline_tutorial.mdx (#22726)Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting
@dependabot rebase
.Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)