-
We are a team at @microsoft Research that maintains a fork of the Metaseq repo with these additional features:
1. New pipeline task to perform Knowledge Distillation via Log Probabilities using a modified Cross…
-
## 🚀 Feature Request
Congratulations on this great work - Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning.
The paper mentions that `metaseq` was used for the model tra…
-
I tried to build the Dockerfile (in order to get a Singularity image).
The build fails with this error message:
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-…
-
## 🚀 Feature Request
Metaseq has a number of rough edges that can make it difficult to use. This is a list of my proposals.
- [x] Print the first batch on rank 0: when we start training, we shou…
-
## 🐛 Bug
Running with model parallel size 2 on 8 GPUs on the FAIR cluster raises the following exception with the 1.3B_gptz model, but only when run with `arceasy`, `arcchallenge`, or `openbookqa`. Works with `storycl…
-
# `m-1-x` models 🔰 (Seq2Seq with BART)
`m-1-x` versions are primarily meant as a demonstration, or pilot, of the tools I'll be building. `1` means that the architecture does not change fro…
-
**System information**
- Alpa version: 0.2.2
- Are you willing to contribute it (Yes/No):
No.
**Describe the new feature and the current behavior/state**
Alpa supports OPT-175B currently. But t…
-
Error when running `metaseq_internal/fb_sweep/sweep_openlm_baselines.py` with model: `Size(1, 4, 2, 2, int(0.1 * M), 6.0e-4, 2)` in Megatron-LM: [stack trace](https://www.internalfb.com/phabricator/pa…
-
## 🐛 Bug
The "vocab_size" in the config file is 50272, but len(tokenizer) is 50265; they do not match each other.
### To Reproduce
Steps to reproduce the behavior (**always include the command y…
-
## ❓ Questions and Help
Hi team, I want to fine-tune OPT on my own dataset using the code in metaseq. I want to fine-tune the 30B OPT model, so I guess HF is not a good choice.
Is there any way to c…