mlfoundations/open_lm
A repository for research on medium-sized language models.
MIT License · 316 stars · 39 forks
Issues
#280 Add option to predownload data from s3 at the start of each checkpoint. (GeorgiosSmyrnis, opened 18 hours ago, 0 comments)
#279 Remote Sync FSSPEC cannot upload large checkpoints (Skylion007, opened 21 hours ago, 0 comments)
#278 re-allow an unbalanced write for tokenize-shuffle (jeffreywpli, closed 13 hours ago, 0 comments)
#277 Further improve errors. (GeorgiosSmyrnis, closed 2 days ago, 0 comments)
#276 Improve error logging / checkpointing even further. (GeorgiosSmyrnis, closed 4 days ago, 0 comments)
#275 Improve error message. (GeorgiosSmyrnis, closed 4 days ago, 0 comments)
#274 Update README.md (jmercat, closed 5 days ago, 0 comments)
#273 Improve error handling for s3 read errors. (GeorgiosSmyrnis, closed 5 days ago, 2 comments)
#272 Even better fix for long pages (jeffreywpli, closed 15 hours ago, 0 comments)
#271 make tokenize-shuffle more robust to long pages (jeffreywpli, closed 1 week ago, 0 comments)
#270 Add error message. (GeorgiosSmyrnis, closed 1 week ago, 0 comments)
#269 Add exponential backoff to remote sync. (GeorgiosSmyrnis, closed 1 week ago, 0 comments)
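#269 adds exponential backoff to remote sync. A minimal sketch of that retry pattern, with hypothetical names (`sync_fn` stands in for whatever upload call is flaky; this is not open_lm's actual code):

```python
import random
import time

def sync_with_backoff(sync_fn, max_retries=5, base_delay=1.0, max_delay=60.0):
    """Retry a flaky remote-sync call with exponential backoff and jitter.

    `sync_fn` is a hypothetical zero-argument callable that raises
    OSError on transient failure and returns a result on success.
    """
    for attempt in range(max_retries):
        try:
            return sync_fn()
        except OSError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            # Delay doubles each attempt (1s, 2s, 4s, ...) capped at max_delay,
            # plus random jitter so concurrent workers do not retry in lockstep.
            delay = min(base_delay * 2 ** attempt, max_delay)
            time.sleep(delay + random.uniform(0, delay / 2))
```

The jitter matters when many ranks sync to the same bucket: without it, all workers that failed together retry together and hit the same throttling again.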
#268 Don't set log level for downstream modules (achalddave, closed 2 weeks ago, 1 comment)
#267 xformers installation failed (stevensf1998, closed 2 weeks ago, 6 comments)
#266 use different_seed=True when args.load_pretrained_state is false (jeffreywpli, closed 2 weeks ago, 0 comments)
#265 replace moe warning with an error check (sagadre, closed 2 weeks ago, 0 comments)
#264 Updating paths in readme to reflect current codebase (arjunsesh, closed 2 weeks ago, 0 comments)
#263 Parameter input rotary-freq (jmercat, opened 2 weeks ago, 1 comment)
#262 samples_per_second_per_gpu or tokens_per_second_per_gpu? (Muennighoff, opened 2 weeks ago, 1 comment)
#261 Reduce logging when --torchcompile is passed (achalddave, closed 2 weeks ago, 0 comments)
#260 Add loss like Rho-1 (GeorgiosSmyrnis, opened 3 weeks ago, 0 comments)
#259 Fix order of loss resets. (GeorgiosSmyrnis, closed 1 week ago, 0 comments)
#258 sagemaker use reserved instances (sedrick-keh-tri, closed 3 weeks ago, 1 comment)
#257 Add dMoE (Muennighoff, opened 3 weeks ago, 0 comments)
#256 Checkpoint skipping. (GeorgiosSmyrnis, opened 4 weeks ago, 0 comments)
#255 Fix MoE (Muennighoff, closed 3 weeks ago, 1 comment)
#254 Mamba update (jmercat, opened 1 month ago, 0 comments)
#253 MoE performs worse than equivalent dense model? (Muennighoff, closed 3 weeks ago, 3 comments)
#252 Add xformers vs torch test (achalddave, closed 1 month ago, 1 comment)
#251 MoE Expert parallelism config (Muennighoff, opened 1 month ago, 0 comments)
#250 Add attention masking support for torch_attn (achalddave, closed 1 month ago, 0 comments)
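#250 concerns masking support in the torch attention path. The core idea, setting disallowed positions to -inf before the softmax so they get exactly zero weight, can be sketched without torch (numpy only; the function name and shapes here are illustrative, not open_lm's implementation):

```python
import numpy as np

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention with a boolean mask.

    q, k, v: (seq, dim) arrays. mask: (seq, seq) bool array where True
    means "query i may attend to key j". Masked scores become -inf,
    which the softmax turns into a weight of exactly zero.
    """
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    # Numerically stable softmax over the key axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

For causal language modeling the mask is lower-triangular (`np.tril`), so position 0 can only attend to itself and its output is exactly `v[0]`.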
#249 Fix torchcompile loading. (GeorgiosSmyrnis, closed 1 month ago, 0 comments)
#248 HF Integration (sedrick-keh-tri, opened 1 month ago, 1 comment)
#247 Someone is using your project to sell it as a token (yzthink, opened 1 month ago, 1 comment)
#246 Input embed (jmercat, closed 1 month ago, 2 comments)
#245 Bug fix to import Llama in OpenLM. (kushal-tri, opened 1 month ago, 0 comments)
#244 Optimize torchcompile. (GeorgiosSmyrnis, closed 1 month ago, 0 comments)
#243 adding cosine rewarmed scheduler (Tomerporian, opened 1 month ago, 0 comments)
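#243 proposes a cosine scheduler with rewarming. As background, a generic cosine-with-warmup schedule (the base pattern such a PR extends) looks like the following; the function name and defaults are illustrative, not open_lm's exact code:

```python
import math

def cosine_lr(step, warmup, total, base_lr, min_lr=0.0):
    """Generic learning-rate schedule: linear warmup, then cosine decay.

    step: current step; warmup: warmup steps; total: total steps.
    Returns the learning rate for this step.
    """
    if step < warmup:
        # Ramp linearly from base_lr/warmup up to base_lr.
        return base_lr * (step + 1) / warmup
    # Cosine decay from base_lr down to min_lr over the remaining steps.
    progress = (step - warmup) / max(1, total - warmup)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

A "rewarmed" variant would re-enter the warmup branch when training resumes mid-run, ramping back up before rejoining the cosine curve.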
#242 Presorting tokenize shuffle (useful for in-context-learning) (revbucket, closed 3 weeks ago, 1 comment)
#241 minor bugfix: Don't access losses_avg_m if not initialized (achalddave, closed 1 month ago, 0 comments)
#240 Remove steps from const_lr (achalddave, closed 1 month ago, 0 comments)
#239 Change GeGLU and add MQA. (GeorgiosSmyrnis, opened 1 month ago, 0 comments)
#238 Adding averaging of iterates (Tomerporian, closed 1 month ago, 0 comments)
#237 Reduce prints in tokenize shuffle (achalddave, closed 1 month ago, 0 comments)
#236 v2 small-scale configs (sagadre, closed 1 month ago, 1 comment)
#235 Generation fixes (achalddave, closed 2 months ago, 1 comment)
#234 main.py: set_grad_checkpointing before DDP or FSDP wrap (iejMac, closed 2 months ago, 1 comment)
#233 Add unit test for source mixing + Fix naming within tars. (GeorgiosSmyrnis, opened 2 months ago, 1 comment)
#232 Version bump. (GeorgiosSmyrnis, opened 2 months ago, 0 comments)
#231 Revert "Add the ability to preset world size." (GeorgiosSmyrnis, closed 2 months ago, 1 comment)