karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
MIT License
37.49k stars
5.97k forks
issues
Newest
Merge for comprehension when filtering parameters without grad
#574
tsdeng
opened
1 day ago
0
Oren/amd mess
#573
OrenLeung
closed
1 week ago
0
Oren/config
#572
OrenLeung
closed
1 week ago
0
cancel
#571
Zhao-Yuting
closed
1 week ago
0
NaniGpt
#570
ashokkumar272
opened
3 weeks ago
0
added fix to type comparison to enable fused AdamW
#569
seanjudelyons
opened
3 weeks ago
0
Spring cleaning
#568
ckgresla
closed
3 weeks ago
0
How best to implement a differential transformer?
#567
Wilsontomass
opened
1 month ago
2
the things
#566
drisspg
closed
1 month ago
0
Normalized gpt
#565
santiagoakle
closed
4 weeks ago
1
Ddp do not sync when not needed
#564
OrenLeung
closed
1 month ago
0
Refactor to stop inductor mess
#563
OrenLeung
closed
1 month ago
0
Moe
#562
hellozmz
closed
1 month ago
0
Clean
#561
simran-arora
closed
1 month ago
0
Windows 11: FileExistsError: [WinError 183] Cannot create a file when that file already exists
#560
VyBui
opened
1 month ago
1
Update README.md
#559
eshwarram
closed
1 month ago
0
Updated README.md to include table of contents, why this project is useful, and how to contribute, and added an output for one command
#558
arhaque09
opened
1 month ago
0
Updated README.md to include table of contents, why this project is useful, and how to contribute
#557
arhaque09
closed
1 month ago
0
Updated README.md to include table of contents, why this project is useful, and how to contribute
#556
arhaque09
closed
1 month ago
0
Adding NVIDIA hardware performance detection
#555
fparisio
opened
2 months ago
0
Pretraining loss explosion
#554
mattgorb
opened
2 months ago
1
Add fire finetuning
#553
gkielian
opened
2 months ago
0
why is the warmup_iters set 2000?
#552
luxunxiansheng
opened
2 months ago
0
The Positional Encoding is not using sin / cos?
#551
mw66
opened
2 months ago
1
Remove flashattention from model.py
#550
chughtapan
closed
2 months ago
0
Implement muP and add code for mup guide blog
#549
ndey96
closed
2 months ago
0
Perplexity
#548
Precola
opened
2 months ago
0
Progressive training?
#547
immartian
opened
3 months ago
4
Add support for 0 temperature
#546
jmccrosky
opened
3 months ago
0
torchrun on L40S Error:torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
#545
Precola
closed
3 months ago
1
Rocm support?
#544
ilovethensa
opened
3 months ago
0
Calculation of Batch Size
#543
Precola
closed
3 months ago
1
configuration for Macs(apple silicon)
#542
bawsi99
opened
3 months ago
0
Adding gpt2 training experiment
#541
NewtonSander
closed
3 months ago
0
Use weights_only for loading
#540
kit1980
opened
3 months ago
0
What to change for training on two T4 GPUs ?
#539
noorchauhan
opened
3 months ago
1
Update train.py for more efficiency
#538
Jesseonmi
opened
4 months ago
0
Simple Use Case Demonstration with Old School Runescape Terminology
#537
Omarch47
opened
4 months ago
0
Solution to Exercise 1 from Youtube Lecture (Batching the heads) - Why does it work?
#536
Andrew-Luo1
closed
4 months ago
1
Nano GPT
#535
phanee123
opened
4 months ago
0
ddp on macbook CPU
#534
langong347
closed
4 months ago
0
free up state_dict variable memory after loading checkpoint
#533
adistomar
opened
4 months ago
0
FileNotFoundError: [Errno 2] No such file or directory: 'data/openwebtext/train.bin'
#532
HarikrishnanK9
opened
4 months ago
1
About the get_batch
#531
leo-young
opened
4 months ago
1
Add automatic detection of number of CPU cores
#530
Jakobovski
opened
4 months ago
1
Data cleaning for openwebtext
#529
zzkzzkjsw
opened
4 months ago
0
fix val dataset size code comment
#528
vhmth
opened
5 months ago
0
fix(train.py): mfu estimation to respect CPU-GPU sync point
#527
JasonLiJT
opened
5 months ago
0
code gpt v1
#526
shatrugna
closed
5 months ago
0
"RuntimeError: Internal Triton PTX codegen error" is raised when I train shakespeare_char with a GPU
#525
shenbb
opened
5 months ago
5