SeanNaren / min-LLM
Minimal code to train a Large Language Model (LLM).
MIT License · 164 stars · 8 forks
Issues
All issues below were filed by SeanNaren.

#20 Feat/extras (closed 2 years ago, 0 comments)
#19 Add performance section to README (opened 2 years ago, 0 comments)
#18 Feat/extras (closed 2 years ago, 0 comments)
#17 Use the OSCAR dataset/GPT-2 tokenizer (closed 2 years ago, 0 comments)
#16 Improve DeepSpeed Stage 3 Throughput (opened 2 years ago, 0 comments)
#15 Fix model initialisation (opened 2 years ago, 0 comments)
#14 Fuse MLP in attention mechanism (opened 2 years ago, 1 comment)
#13 Improving Throughput with DeepSpeed (closed 2 years ago, 2 comments)
#12 DeepSpeed Support (closed 2 years ago, 0 comments)
#11 Using FSDP (opened 2 years ago, 4 comments)
#10 Fixing DeepSpeed + BFloat16 (closed 2 years ago, 6 comments)
#9 Finding Optimal Throughput for our Model :scientist: (closed 2 years ago, 3 comments)
#8 Investigating tinycudann MLP layer (closed 2 years ago, 6 comments)
#7 Training Data :mag_right: (closed 2 years ago, 7 comments)
#6 High Level Plan for the Journey! (opened 2 years ago, 0 comments)
#5 Deciding the model to train (and the base code to profile!) (closed 2 years ago, 19 comments)
#4 Take our modified xformers microGPT script and add Profiling (closed 2 years ago, 1 comment)
#2 README :speech_balloon: (closed 2 years ago, 0 comments)