epfml/llm-baselines
NanoGPT-like codebase for LLM training
MIT License
70 stars
21 forks
issues
#23 Merge from SOAP (Andron00e, closed, 1 week ago, 0 comments)
#22 Refactoring + reproducing AdEMAMix (mpagli, closed, 1 week ago, 0 comments)
#21 A bunch of new optimizers and schedules (Andron00e, opened, 2 weeks ago, 0 comments)
#20 Displaying grad-norm + support wandb with teams (mpagli, closed, 2 weeks ago, 0 comments)
#19 Eval on a fix subset + better lr decay (mpagli, closed, 2 weeks ago, 0 comments)
#18 add methods (Andron00e, opened, 2 weeks ago, 13 comments)
#17 Modified (implicitfaith, opened, 2 months ago, 0 comments)
#16 np.memmap memory leak and correct val sampling (haeggee, opened, 4 months ago, 0 comments)
#15 add fineweb dataset (martinjaggi, opened, 4 months ago, 1 comment)
#14 Memory requirements + baseline configs (fabian-sp, opened, 4 months ago, 0 comments)
#13 Checkpointing and retrieval (NicolasRR, closed, 6 months ago, 4 comments)
#12 license (fakerybakery, closed, 6 months ago, 0 comments)
#11 Create LICENSE (haeggee, closed, 6 months ago, 2 comments)
#10 WikiText Data (thorinf, closed, 6 months ago, 0 comments)
#9 implement torch dataloader (haeggee, closed, 7 months ago, 2 comments)
#8 Update utils.py - Fixed save_checkpoint with scheduler set to None (peacefulotter, opened, 11 months ago, 0 comments)
#7 try with pytorch compile for speedup? (martinjaggi, closed, 1 year ago, 1 comment)
#6 add openwebtext2 support (mpagli, closed, 1 year ago, 0 comments)
#5 add_wandb_key (Olivia-fsm, closed, 1 year ago, 0 comments)
#4 Modifying few things from mkrima PR (mpagli, closed, 1 year ago, 0 comments)
#3 Added datasets, tokenizers, and minor fixes (AleHD, closed, 1 year ago, 0 comments)
#2 separate config from main (mkrima, closed, 1 year ago, 0 comments)
#1 implement data-parallel distributed backend (mkrima, closed, 1 year ago, 0 comments)