stanford-crfm/mistral
Mistral: A strong, northwesterly wind: Framework for transparent and accessible large-scale language model training, built with Hugging Face 🤗 Transformers.
Apache License 2.0
562 stars, 49 forks
issues
Will this repo be actively maintained?
#201
satyamsundaram
closed
4 months ago
0
minor fix for getting-start page
#200
panpan0000
opened
11 months ago
0
arwen is checkpoint progression outlier?
#199
hawkrobe
opened
1 year ago
12
feat: move mypy configuration to `pyproject.toml`
#198
SauravMaheshkar
closed
1 year ago
2
Speed up pre-training
#197
yandachen
opened
1 year ago
0
torch_extensions/py38_cu113/fused_adam/fused_adam.so: cannot open shared object file
#196
yandachen
opened
1 year ago
6
No validation set in openwebtext leads to failure.
#195
john-hewitt
opened
1 year ago
13
Distributed Training: Data seem to be identical across processes
#194
HanGuo97
closed
1 year ago
3
State of the repo, W&B Dashboards, and Relationships with GPT-NeoX/HuggingFace
#193
HanGuo97
closed
1 year ago
4
NFS manifests for the Kubernetes tutorial are missing
#192
AntreasAntoniou
opened
1 year ago
1
Replicating training / test split on models
#191
Uzay-G
opened
1 year ago
4
Update CONTRIBUTING.md
#190
dlwh
closed
2 years ago
0
Last batch dropped during preprocessing
#189
toizzy
closed
2 years ago
1
Were the mistral models trained with dropout?
#188
ArthurConmy
closed
2 years ago
2
tutorial doesn't work out of the box
#187
dlwh
opened
2 years ago
3
Make a loss regression test for Mistral
#186
dlwh
opened
2 years ago
0
New Pull Request template refers to a `dev` branch, which does not exist
#185
yifanmai
opened
2 years ago
2
Add pyarrow>=7.0.0 to requirements
#184
yifanmai
closed
2 years ago
1
pyarrow dependency should be >= 7.0.0
#183
yifanmai
closed
2 years ago
0
Difference in models uploaded before and after June 20th?
#182
msclar
closed
2 years ago
3
Dev --> main
#181
dlwh
closed
2 years ago
0
fix pre-commit
#179
J38
closed
2 years ago
0
Multi-machine Data Processing Fails
#178
J38
opened
2 years ago
0
add mistral models
#177
J38
closed
2 years ago
0
Update main
#176
J38
closed
2 years ago
0
Finish gpt2 rename
#175
dlwh
closed
2 years ago
0
Finish gpt2-* -> mistral-* rename
#174
dlwh
closed
2 years ago
0
Make IndexedDataset use relative cache paths for cache files
#173
dlwh
closed
2 years ago
0
Add Codalab Tutorial doc
#172
dlwh
opened
2 years ago
0
Reenable the test to make sure the evaluator works...
#171
dlwh
closed
2 years ago
0
ensure that dataloader_num_workers is 0
#170
dlwh
closed
2 years ago
2
`debug.yaml` => `gpt2-small-short.yaml`
#169
teetone
closed
2 years ago
0
Shakespeare example (for tutorial + demo)
#168
teetone
closed
2 years ago
0
A passthrough tokenizer for pre-tokenized datasets
#167
jthickstun
closed
2 years ago
1
make indexed dataset write local paths so it's easier to move the cache around
#166
dlwh
closed
2 years ago
1
fix indexer behavior when there aren't many docs
#165
dlwh
closed
2 years ago
1
Some changes to run on CodaLab
#164
teetone
closed
2 years ago
0
Mistral Micro Eval Crashes With DeepSpeed
#163
J38
closed
2 years ago
2
Indexed Dataset caches contain absolute path references
#162
jthickstun
closed
2 years ago
2
Freezes with Pytorch 1.12 and DeepSpeed
#161
dlwh
opened
2 years ago
0
See Mistral Issues On Slack
#160
J38
closed
2 years ago
0
fix #141: Node arguments not parsed properly for torch.distributed.launch
#159
dlwh
closed
2 years ago
0
fix deepspeed on build server
#158
dlwh
closed
2 years ago
0
switch to using my pre-detokenized wikitext to make tests a bit faster...
#157
dlwh
closed
2 years ago
1
update README
#156
J38
closed
2 years ago
0
Deepspeed tests
#155
dlwh
closed
2 years ago
1
pin pytorch dependency at 1.11, future proof imports for 1.12
#154
dlwh
closed
2 years ago
0
update the differences doc
#153
dlwh
closed
2 years ago
0
update main
#152
J38
closed
2 years ago
0
Clean artifacts before tests run
#151
J38
closed
2 years ago
0