stanford-crfm mistral issues

stanford-crfm / mistral

Mistral: A strong, northwesterly wind: Framework for transparent and accessible large-scale language model training, built with Hugging Face 🤗 Transformers.

Apache License 2.0

562 stars, 49 forks

issues

Port NaNDetector

#150 dlwh opened 2 years ago
0
tutorials

#149 J38 closed 2 years ago
0
fix remaining pre-commit issues

#148 dlwh closed 2 years ago
2
Add some Mistral Tutorials

#147 dlwh closed 2 years ago
1
update Mistral checkpoints so main branch points to checkpoint-400000

#146 dlwh closed 2 years ago
0
Get CI testing 2 gpu config

#145 dlwh closed 2 years ago
0
Dev

#144 J38 closed 2 years ago
0
please pre-commit gods

#143 dlwh closed 2 years ago
2
Dev

#142 J38 closed 2 years ago
0
Node arguments not parsed properly for torch.distributed.launch

#141 YianZhang closed 2 years ago
1
make CI keep up with dependencies, discover tests automatically, validate configs

#140 dlwh closed 2 years ago
1
make a more efficient IndexedDataset data store for storing tokenized datasets

#139 dlwh closed 2 years ago
1
Fork Preprocessing when doing multiple gpus

#138 dlwh closed 2 years ago
4
add pre-commit action, run pre-commit

#137 dlwh closed 2 years ago
8
broaden the values we accept to disable wandb

#136 dlwh closed 2 years ago
0
fix typo: shorter --> longer

#135 dlwh closed 2 years ago
0
Make sure we don't write anything (of significant size) to /tmp

#134 dlwh closed 2 years ago
2
WIP switch from quinine to yahp

#133 dlwh opened 2 years ago
3
Local tests

#132 dlwh closed 2 years ago
0
get_auto_dataset path logic does not work properly when dataset_id is a path

#131 dtsip closed 2 years ago
8
Switch to a supported config library

#130 dlwh opened 2 years ago
1
Update README and checkpoint info json

#129 J38 closed 2 years ago
0
Update Main

#128 J38 closed 2 years ago
0
WIP Support dataset streaming

#127 dlwh closed 2 years ago
2
Streaming data for larger datasets

#126 jthickstun closed 2 years ago
7
Remove override of _maybe_log_save_evaluate

#125 dlwh closed 2 years ago
0
remove activation logging since it's a dead code path

#124 dlwh closed 2 years ago
0
update documentation on accessing new Mistral checkpoints on HF Hub

#123 dlwh closed 2 years ago
1
revisit OnlineBenchmarkTrainer

#122 dlwh closed 2 years ago
1
Eval dataset is hard coded to be "openwebtext_ppl"

#121 dlwh closed 2 years ago
0
Update/pin some dependency versions for release

#120 dlwh closed 2 years ago
0
Requirements.txt vs conda env yamls

#119 dlwh closed 2 years ago
11
Generate Model Cards for models

#118 dlwh closed 2 years ago
1
Tokenization crashes when using deepspeed

#117 dlwh closed 2 years ago
0
Streaming and Sharded Data Loading

#116 dlwh opened 2 years ago
0
Investigate loss spikes

#115 dlwh opened 2 years ago
0
remove old _get_train_sampler since the relevant changes have been upstreamed

#114 dlwh closed 2 years ago
7
Add Tutorial On How To Restart From A Checkpoint

#113 J38 closed 2 years ago
0
Make local dataset configurable

#112 teetone closed 2 years ago
1
Finalize loading local data from .jsonl files and facilitating data blends

#111 J38 closed 2 years ago
29
[Low Priority/Optional] allow export to support more than two archs

#110 dlwh closed 2 years ago
1
[RFC] Mistral v2.0 Roadmap

#109 J38 closed 2 years ago
12
move gradient_checkpointing over to training_arguments since it's now…

#108 dlwh closed 2 years ago
6
add defaults to cerberus schema to help keep wikitext-103 working.

#107 dlwh closed 2 years ago
0
Dependency Conflict: huggingface-hub=0.0.2

#106 carlini closed 2 years ago
2
Add BERT

#105 michiyasunaga closed 2 years ago
0
Some changes to support batch jobs

#104 teetone closed 2 years ago
0
[Installation] Resolving dependency chain due to the latest Transformer version

#103 sameeravithana closed 3 years ago
12
Update deepspeed version

#102 J38 closed 3 years ago
0
set train_batch_size to auto

#101 J38 closed 3 years ago
0