allenai OLMo issues - Githubissues

allenai / OLMo

Modeling, training, eval, and inference code for OLMo

https://allenai.org/olmo

Apache License 2.0

4.48k stars 449 forks

issues

Olmo tiny scripts

#628 ananyahjha93 closed 3 months ago
0
Key 'https://olmo_checkpoints' not in 'TrainConfig'

#627 jeqcho closed 2 months ago
1
Inspect training data improvements

#626 2015aroras closed 3 months ago
0
What is the true MLP ratio for OLMo 7B?

#625 jeqcho closed 3 months ago
2
Make olmo-core checkpointer more robust on weka

#624 epwalsh closed 3 months ago
0
HF dataset loading optimizations

#623 2015aroras closed 3 months ago
0
Cant use LORA

#622 bdytx5 opened 3 months ago
6
Config for Amberish experiments at 1B

#621 drschwenk opened 3 months ago
0
Running Amber experiments at 7B

#620 epwalsh closed 2 months ago
0
Add most OLMo 1.7-7B checkpoints

#619 2015aroras closed 3 months ago
0
Normal baselines

#618 AkshitaB opened 3 months ago
0
added git ref to the config keys

#617 drschwenk opened 3 months ago
0
Chameleon stability experiments

#616 AkshitaB closed 2 months ago
0
Officially add OLMo-core as a dependency

#615 epwalsh closed 3 months ago
0
Make include_instance_metadata a kwarg of build_train_dataloader

#614 2015aroras closed 3 months ago
0
Make include_instance_metadata a kwarg of build_train_dataloader

#613 2015aroras closed 3 months ago
0
adding DDP to the codebase

#612 ananyahjha93 closed 3 months ago
3
Read and use tokenizer identifier from config

#611 2015aroras closed 3 months ago
0
[HF Converter] Get tokenizer path from config as default

#610 2015aroras closed 3 months ago
0
Finetuning config file

#609 joellliu opened 3 months ago
3
How many tokens were trained for 7B model.

#608 mathfinder opened 3 months ago
1
Rewrite initialization

#607 AkshitaB closed 3 months ago
2
now accepts wandb project and entity as options

#606 drschwenk closed 4 months ago
2
Add option to record step size metrics from AdamW

#605 epwalsh closed 3 months ago
0
Adds a tool that diffs two wandb runs

#604 dirkgr closed 3 months ago
1
Unshard without passing checkpointer type

#603 2015aroras closed 4 months ago
1
fixed host-device sync at each clipping step

#602 ananyahjha93 closed 4 months ago
0
Fixes clipping

#601 ananyahjha93 closed 4 months ago
0
Remove usages of Auto* methods in hf_olmo tests

#600 2015aroras closed 4 months ago
0
Merging the train-olmo-large branch

#599 dirkgr closed 4 months ago
0
is_causal=attention_bias is None

#598 nkkbr opened 4 months ago
1
Default eos_token_id in `scripts/prepare-tulu-data.py`

#597 y0mingzhang closed 4 months ago
1
why is the total_grad_norm increasing across training?

#596 ryanyxw opened 4 months ago
5
Expose memmap_dtype in the data configuration

#595 leon-g-xu closed 4 months ago
2
Expose memmap dtype in data config

#594 leon-g-xu closed 4 months ago
2
Inspect training data without data indices

#593 2015aroras closed 4 months ago
0
training directly from object storage?

#592 joellliu closed 3 months ago
2
OLMoThreadError

#591 lecifire closed 3 months ago
2
Update README HF examples to use OLMo-1.7-7B

#590 2015aroras closed 4 months ago
0
Update docs with new HF checkpoint information

#589 2015aroras closed 4 months ago
0
Clarify that "auto" methods do not work with HF OLMo checkpoints

#588 2015aroras closed 4 months ago
0
Storage cleaner improvements

#587 2015aroras closed 4 months ago
0
Problem with HF loading from model checkpoint

#586 ryanyxw closed 4 months ago
5
Add documentation for OLMo checkpoints

#585 2015aroras closed 4 months ago
0
Problems with multi-epoch training

#584 Muennighoff opened 4 months ago
0
OLMo-1B's results seem very bad on olmo-eval

#583 Ivan-Zhou closed 4 months ago
0
Allow hybrid sharding to have multiple replicas in a node

#582 2015aroras closed 4 months ago
0
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

#581 mclanza closed 4 months ago
6
Update ignore_index parameter for flash attention

#580 2015aroras closed 4 months ago
0
Update train.py

#579 MLgdg closed 4 months ago
1
