karpathy/llm.c
LLM training in simple, raw C/CUDA
MIT License
20.76k stars, 2.22k forks
issues (sorted by: Newest)
Number | Title | Author | State | Time | Comments
#591 | Fused Forward GELU (again) | ademeure | opened | 5 hours ago | 0
#590 | consolidate memory | karpathy | closed | 5 hours ago | 0
#589 | Feature/overlap backward reduce | karpathy | closed | 7 hours ago | 0
#586 | Add a debugging tip to README | gordicaleksa | closed | 12 hours ago | 3
#585 | Fix MFU printing | gordicaleksa | closed | 7 hours ago | 2
#584 | Bugfix eval dataloader out of bound file read and crash | chinthysl | opened | 20 hours ago | 0
#583 | Replaced hard-coded max float with FLT_MAX | vyom1611 | opened | 1 day ago | 0
#582 | Benchmark modal script fixed - profiling and cuDNN (Issue #504 and PR #510 fixes) | vyom1611 | closed | 6 hours ago | 0
#581 | Add link for SYCL runtime | abhilash1910 | opened | 1 day ago | 0
#579 | [Feature Request] Intel SYCL runtime support for llm.c | abhilash1910 | closed | 1 day ago | 1
#578 | Remove cublaslt from fp32cu versions | ngc92 | closed | 1 day ago | 1
#577 | Adds dev container to speed up project onboarding | nkkko | opened | 2 days ago | 2
#576 | Improve the performance of `softmax_forward8` in `dev/cuda` | KarhouTam | opened | 4 days ago | 4
#575 | RMSNorm kernels | AndreSlavescu | opened | 4 days ago | 2
#574 | Overlap computation and communication V2 | ngc92 | closed | 7 hours ago | 0
#573 | Dataloader - introducing randomness | gordicaleksa | opened | 6 days ago | 0
#571 | Model Export to Hugging Face format and optionally upload | rhys101 | opened | 6 days ago | 4
#570 | Adding GPU CI workflow file | rosslwheeler | closed | 5 days ago | 0
#569 | is max_seq_len configurable or hardcoded parameter? | morphpiece | opened | 6 days ago | 1
#568 | Improve logger, add grad norm + learning rate | gordicaleksa | closed | 6 days ago | 1
#567 | Minor fix - better handling of out log dir | gordicaleksa | closed | 6 days ago | 0
#566 | OSError: Memory mapping file failed: Cannot allocate memory | antonkratz | closed | 6 days ago | 2
#565 | Consolidate memory | ngc92 | closed | 5 hours ago | 3
#564 | layernorm_forward kernel#1 with copy compute overlap is 3x faster than kernel#5 | josh-ramer | closed | 6 days ago | 5
#563 | Running `quick start on CPU` on Macbook Pro M2 | full-stack-ai | closed | 6 days ago | 7
#562 | LLM.c in google colab | Eliah7 | opened | 1 week ago | 1
#561 | Fix the compiler warnings and errors | lancerts | opened | 1 week ago | 1
#560 | Fix mem leaks, reduce memory, refactor | gordicaleksa | closed | 1 week ago | 0
#559 | set compile flag and add ci check | ngc92 | closed | 1 week ago | 1
#558 | Fix issue 555, support c++11 and fix compiler warnings | lancerts | closed | 1 week ago | 2
#557 | I can not understand the `cublasGemmStridedBatchedEx` call in the `attention_forward` | huoyushequ | closed | 1 week ago | 0
#556 | Utilities for cuda streams + disk IO | ngc92 | opened | 1 week ago | 0
#555 | apparent compatibility issues with earlier c++ versions after recent pushes | hafezmg48 | closed | 1 week ago | 3
#554 | added reading checkpoint files | morphpiece | opened | 1 week ago | 0
#553 | Refactor trimat | gordicaleksa | closed | 1 week ago | 1
#552 | Feature/streams | karpathy | closed | 5 days ago | 2
#551 | Fix PyTorch DDP loss bug | gordicaleksa | closed | 1 week ago | 1
#550 | Re-introduce cuda streams | ngc92 | closed | 6 days ago | 1
#549 | Trigger CI on new branch creation | rosslwheeler | closed | 1 week ago | 1
#548 | Fix zero grads bug | gordicaleksa | closed | 1 week ago | 0
#547 | made implicit includes of standard headers explicit | ngc92 | closed | 1 week ago | 1
#546 | Fix periodic inference during training | kmyusk | closed | 6 days ago | 2
#545 | [attension.cuh] Move assert outside of attn kernel to launcher | lancerts | opened | 1 week ago | 0
#544 | [layernorm.cuh] Minor fix with 32 replaced by WARP_SIZE | lancerts | opened | 1 week ago | 0
#543 | Make clean fix and Windows cuDNN build fix | rosslwheeler | closed | 1 week ago | 0
#542 | Adds simplified way to download the already tokenized tinyshakespeare dataset and gpt2 weights | ChrisDryden | opened | 1 week ago | 0
#541 | move matmul | karpathy | closed | 1 week ago | 0
#540 | move fused classifier | karpathy | closed | 1 week ago | 0
#539 | move attention | karpathy | closed | 1 week ago | 0
#538 | Removing dependencies that are seemingly not needed to compile | ChrisDryden | closed | 1 week ago | 1