mlfoundations/open_lm
A repository for research on medium-sized language models.
MIT License · 316 stars · 39 forks
Issues
#280 Add option to predownload data from s3 at the start of each checkpoint. (GeorgiosSmyrnis, opened 18 hours ago, 0 comments)
#279 Remote Sync FSSPEC cannot upload large checkpoints (Skylion007, opened 21 hours ago, 0 comments)
#278 re-allow an unbalanced write for tokenize-shuffle (jeffreywpli, closed 13 hours ago, 0 comments)
#277 Further improve errors. (GeorgiosSmyrnis, closed 2 days ago, 0 comments)
#276 Improve error logging / checkpointing even further. (GeorgiosSmyrnis, closed 4 days ago, 0 comments)
#275 Improve error message. (GeorgiosSmyrnis, closed 4 days ago, 0 comments)
#274 Update README.md (jmercat, closed 5 days ago, 0 comments)
#273 Improve error handling for s3 read errors. (GeorgiosSmyrnis, closed 5 days ago, 2 comments)
#272 Even better fix for long pages (jeffreywpli, closed 15 hours ago, 0 comments)
#271 make tokenize-shuffle more robust to long pages (jeffreywpli, closed 1 week ago, 0 comments)
#270 Add error message. (GeorgiosSmyrnis, closed 1 week ago, 0 comments)
#269 Add exponential backoff to remote sync. (GeorgiosSmyrnis, closed 1 week ago, 0 comments)
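#269 adds exponential backoff to remote sync. A minimal sketch of that retry pattern, with hypothetical names (`sync_fn` stands in for whatever upload call is flaky; this is not open_lm's actual code):

```python
import random
import time

def sync_with_backoff(sync_fn, max_retries=5, base_delay=1.0, max_delay=60.0):
    """Retry a flaky remote-sync call with exponential backoff and jitter.

    `sync_fn` is a hypothetical zero-argument callable that raises
    OSError on transient failure and returns a result on success.
    """
    for attempt in range(max_retries):
        try:
            return sync_fn()
        except OSError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            # Delay doubles each attempt (1s, 2s, 4s, ...) capped at max_delay,
            # plus random jitter so concurrent workers do not retry in lockstep.
            delay = min(base_delay * 2 ** attempt, max_delay)
            time.sleep(delay + random.uniform(0, delay / 2))
```

The jitter matters when many ranks sync to the same bucket: without it, all workers that failed together retry together and hit the same throttling again.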
#268 Don't set log level for downstream modules (achalddave, closed 2 weeks ago, 1 comment)
#267 xformers installation failed (stevensf1998, closed 2 weeks ago, 6 comments)
#266 use different_seed=True when args.load_pretrained_state is false (jeffreywpli, closed 2 weeks ago, 0 comments)
#265 replace moe warning with an error check (sagadre, closed 2 weeks ago, 0 comments)
#264 Updating paths in readme to reflect current codebase (arjunsesh, closed 2 weeks ago, 0 comments)
#263 Parameter input rotary-freq (jmercat, opened 2 weeks ago, 1 comment)
#262 samples_per_second_per_gpu or tokens_per_second_per_gpu? (Muennighoff, opened 2 weeks ago, 1 comment)
#261 Reduce logging when --torchcompile is passed (achalddave, closed 2 weeks ago, 0 comments)
#260 Add loss like Rho-1 (GeorgiosSmyrnis, opened 3 weeks ago, 0 comments)
#259 Fix order of loss resets. (GeorgiosSmyrnis, closed 1 week ago, 0 comments)
#258 sagemaker use reserved instances (sedrick-keh-tri, closed 3 weeks ago, 1 comment)
#257 Add dMoE (Muennighoff, opened 3 weeks ago, 0 comments)
#256 Checkpoint skipping. (GeorgiosSmyrnis, opened 4 weeks ago, 0 comments)
#255 Fix MoE (Muennighoff, closed 3 weeks ago, 1 comment)
#254 Mamba update (jmercat, opened 1 month ago, 0 comments)
#253 MoE performs worse than equivalent dense model? (Muennighoff, closed 3 weeks ago, 3 comments)
#252 Add xformers vs torch test (achalddave, closed 1 month ago, 1 comment)
#251 MoE Expert parallelism config (Muennighoff, opened 1 month ago, 0 comments)
#250 Add attention masking support for torch_attn (achalddave, closed 1 month ago, 0 comments)
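#250 concerns masking support in the torch attention path. The core idea, setting disallowed positions to -inf before the softmax so they get exactly zero weight, can be sketched without torch (numpy only; the function name and shapes here are illustrative, not open_lm's implementation):

```python
import numpy as np

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention with a boolean mask.

    q, k, v: (seq, dim) arrays. mask: (seq, seq) bool array where True
    means "query i may attend to key j". Masked scores become -inf,
    which the softmax turns into a weight of exactly zero.
    """
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    # Numerically stable softmax over the key axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

For causal language modeling the mask is lower-triangular (`np.tril`), so position 0 can only attend to itself and its output is exactly `v[0]`.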
#249 Fix torchcompile loading. (GeorgiosSmyrnis, closed 1 month ago, 0 comments)
#248 HF Integration (sedrick-keh-tri, opened 1 month ago, 1 comment)
#247 Someone is using your project to sell it as a token (yzthink, opened 1 month ago, 1 comment)
#246 Input embed (jmercat, closed 1 month ago, 2 comments)
#245 Bug fix to import Llama in OpenLM. (kushal-tri, opened 1 month ago, 0 comments)
#244 Optimize torchcompile. (GeorgiosSmyrnis, closed 1 month ago, 0 comments)
#243 adding cosine rewarmed scheduler (Tomerporian, opened 1 month ago, 0 comments)
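#243 proposes a cosine scheduler with rewarming. As background, a generic cosine-with-warmup schedule (the base pattern such a PR extends) looks like the following; the function name and defaults are illustrative, not open_lm's exact code:

```python
import math

def cosine_lr(step, warmup, total, base_lr, min_lr=0.0):
    """Generic learning-rate schedule: linear warmup, then cosine decay.

    step: current step; warmup: warmup steps; total: total steps.
    Returns the learning rate for this step.
    """
    if step < warmup:
        # Ramp linearly from base_lr/warmup up to base_lr.
        return base_lr * (step + 1) / warmup
    # Cosine decay from base_lr down to min_lr over the remaining steps.
    progress = (step - warmup) / max(1, total - warmup)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

A "rewarmed" variant would re-enter the warmup branch when training resumes mid-run, ramping back up before rejoining the cosine curve.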
#242 Presorting tokenize shuffle (useful for in-context-learning) (revbucket, closed 3 weeks ago, 1 comment)
#241 minor bugfix: Don't access losses_avg_m if not initialized (achalddave, closed 1 month ago, 0 comments)
#240 Remove steps from const_lr (achalddave, closed 1 month ago, 0 comments)
#239 Change GeGLU and add MQA. (GeorgiosSmyrnis, opened 1 month ago, 0 comments)
#238 Adding averaging of iterates (Tomerporian, closed 1 month ago, 0 comments)
#237 Reduce prints in tokenize shuffle (achalddave, closed 1 month ago, 0 comments)
#236 v2 small-scale configs (sagadre, closed 1 month ago, 1 comment)
#235 Generation fixes (achalddave, closed 2 months ago, 1 comment)
#234 main.py: set_grad_checkpointing before DDP or FSDP wrap (iejMac, closed 2 months ago, 1 comment)
#233 Add unit test for source mixing + Fix naming within tars. (GeorgiosSmyrnis, opened 2 months ago, 1 comment)
#232 Version bump. (GeorgiosSmyrnis, opened 2 months ago, 0 comments)
#231 Revert "Add the ability to preset world size." (GeorgiosSmyrnis, closed 2 months ago, 1 comment)