stanford-crfm / mistral

Mistral: A strong, northwesterly wind: Framework for transparent and accessible large-scale language model training, built with Hugging Face 🤗 Transformers.

Apache License 2.0

562 stars, 49 forks

issues

Will this repo be actively maintained?

#201 satyamsundaram closed 4 months ago
0
minor fix for getting-start page

#200 panpan0000 opened 11 months ago
0
arwen is checkpoint progression outlier?

#199 hawkrobe opened 1 year ago
12
feat: move mypy configuration to `pyproject.toml`

#198 SauravMaheshkar closed 1 year ago
2
Speed up pre-training

#197 yandachen opened 1 year ago
0
torch_extensions/py38_cu113/fused_adam/fused_adam.so: cannot open shared object file

#196 yandachen opened 1 year ago
6
No validation set in openwebtext leads to failure.

#195 john-hewitt opened 1 year ago
13
Distributed Training: Data seem to be identical across processes

#194 HanGuo97 closed 1 year ago
3
State of the repo, W&B Dashboards, and Relationships with GPT-NeoX/HuggingFace

#193 HanGuo97 closed 1 year ago
4
NFS manifests for the Kubernetes tutorial are missing

#192 AntreasAntoniou opened 1 year ago
1
Replicating training / test split on models

#191 Uzay-G opened 1 year ago
4
Update CONTRIBUTING.md

#190 dlwh closed 2 years ago
0
Last batch dropped during preprocessing

#189 toizzy closed 2 years ago
1
Were the mistral models trained with dropout?

#188 ArthurConmy closed 2 years ago
2
tutorial doesn't work out of the box

#187 dlwh opened 2 years ago
3
Make a loss regression test for Mistral

#186 dlwh opened 2 years ago
0
New Pull Request template refers to a `dev` branch, which does not exist

#185 yifanmai opened 2 years ago
2
Add pyarrow>=7.0.0 to requirements

#184 yifanmai closed 2 years ago
1
pyarrow dependency should be >= 7.0.0

#183 yifanmai closed 2 years ago
0
Difference in models uploaded before and after June 20th?

#182 msclar closed 2 years ago
3
Dev --> main

#181 dlwh closed 2 years ago
0
fix pre-commit

#179 J38 closed 2 years ago
0
Multi-machine Data Processing Fails

#178 J38 opened 2 years ago
0
add mistral models

#177 J38 closed 2 years ago
0
Update main

#176 J38 closed 2 years ago
0
Finish gpt2 rename

#175 dlwh closed 2 years ago
0
Finish gpt2-* -> mistral-* rename

#174 dlwh closed 2 years ago
0
Make IndexedDataset use relative cache paths for cache files

#173 dlwh closed 2 years ago
0
Add Codalab Tutorial doc

#172 dlwh opened 2 years ago
0
Reenable the test to make sure the evaluator works...

#171 dlwh closed 2 years ago
0
ensure that dataloader_num_workers is 0

#170 dlwh closed 2 years ago
2
`debug.yaml` => `gpt2-small-short.yaml`

#169 teetone closed 2 years ago
0
Shakespeare example (for tutorial + demo)

#168 teetone closed 2 years ago
0
A passthrough tokenizer for pre-tokenized datasets

#167 jthickstun closed 2 years ago
1
make indexed dataset write local paths so it's easier to move the cache around

#166 dlwh closed 2 years ago
1
fix indexer behavior when there aren't many docs

#165 dlwh closed 2 years ago
1
Some changes to run on CodaLab

#164 teetone closed 2 years ago
0
Mistral Micro Eval Crashes With DeepSpeed

#163 J38 closed 2 years ago
2
Indexed Dataset caches contain absolute path references

#162 jthickstun closed 2 years ago
2
Freezes with Pytorch 1.12 and DeepSpeed

#161 dlwh opened 2 years ago
0
See Mistral Issues On Slack

#160 J38 closed 2 years ago
0
fix #141: Node arguments not parsed properly for torch.distributed.launch

#159 dlwh closed 2 years ago
0
fix deepspeed on build server

#158 dlwh closed 2 years ago
0
switch to using my pre-detokenized wikitext to make tests a bit faster...

#157 dlwh closed 2 years ago
1
update README

#156 J38 closed 2 years ago
0
Deepspeed tests

#155 dlwh closed 2 years ago
1
pin pytorch dependency at 1.11, future proof imports for 1.12

#154 dlwh closed 2 years ago
0
update the differences doc

#153 dlwh closed 2 years ago
0
update main

#152 J38 closed 2 years ago
0
Clean artifacts before tests run

#151 J38 closed 2 years ago
0