rasbt/LLMs-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
https://www.amazon.com/Build-Large-Language-Model-Scratch/dp/1633437167
Other
34.12k stars | 4.18k forks
issues
#451 | 04_optional-aws-sagemaker-notebook | rvaneijk | opened 3 days ago | 1 comment
#447 | Missing line of code if not mistaken | AldawsariNLP | closed 1 week ago | 3 comments
#444 | Bug in Exercise 5.6? | SeriousJ55 | closed 1 week ago | 3 comments
#443 | Improvement idea: MHA `d_out` | d-kleine | closed 1 week ago | 1 comment
#441 | [minor] typo & comments | casinca | closed 1 week ago | 1 comment
#439 | Fixed command for row 16 additional experiment | d-kleine | closed 2 weeks ago | 0 comments
#438 | Add flexible padding bonus experiment | rasbt | closed 2 weeks ago | 0 comments
#437 | Add utility to prevent double execution of certain cells | rasbt | closed 2 weeks ago | 1 comment
#436 | Add missing device transfer in optional gpt_generate.py code | rasbt | closed 2 weeks ago | 0 comments
#435 | RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! in ch05/01_main-chapter-code/gpt_generate.py | xfpg21421 | closed 2 weeks ago | 1 comment
#433 | Small issue in notebook ch06.ipynb | zia-hasan | closed 2 weeks ago | 1 comment
#432 | Add "What's next" section | rasbt | closed 3 weeks ago | 1 comment
#431 | Incorrect explanation of the scaling of the non-dropped values in the mask | chsharma27 | closed 3 weeks ago | 1 comment
#428 | Dropout - activated value | d-kleine | closed 4 weeks ago | 2 comments
#427 | potential little fixes `appendix-D4 .ipynb` | casinca | closed 4 weeks ago | 2 comments
#424 | [typo] APPENDIX D.1 Learning rate warmup | casinca | closed 1 month ago | 2 comments
#423 | updated RoPE statement | d-kleine | closed 1 month ago | 1 comment
#421 | Fix incorrect argument name in LlamaTokenizer constructor | rohanwinsor | closed 1 month ago | 1 comment
#420 | minor fixes: Llama 3.2 standalone | d-kleine | closed 1 month ago | 7 comments
#419 | RoPE theta rescaling | rasbt | closed 1 month ago | 1 comment
#418 | Llama 3.2 standalone | d-kleine | closed 1 month ago | 6 comments
#415 | Toy example: Train on a dataset | Iosifts | closed 1 month ago | 0 comments
#414 | fixed typos | d-kleine | closed 1 month ago | 5 comments
#413 | Updated Llama 2 to 3 paths | d-kleine | closed 1 month ago | 3 comments
#412 | RoPE updates | rasbt | closed 1 month ago | 5 comments
#411 | RoPE - compute_rope mismatch of tensor dimensions | rkinas | closed 1 month ago | 3 comments
#410 | RoPE `inv_freq` code | d-kleine | closed 1 month ago | 8 comments
#408 | updates for PyTorch 2.5 | d-kleine | closed 1 month ago | 1 comment
#407 | RoPE increase | rasbt | closed 1 month ago | 1 comment
#406 | Add mean pooling experiment to classifier bonus experiments | rasbt | closed 1 month ago | 0 comments
#405 | Test PyTorch 2.5 | rasbt | closed 1 month ago | 0 comments
#404 | Note about SSL certificates | rasbt | closed 1 month ago | 1 comment
#403 | Update ch02.ipynb | krouser | closed 1 month ago | 4 comments
#402 | Best practices for memory efficient weight loading tutorial | mikaylagawarecki | closed 1 month ago | 1 comment
#401 | Memory efficient weight loading | rasbt | closed 1 month ago | 1 comment
#400 | Update bonus section formatting | rasbt | closed 1 month ago | 0 comments
#398 | commit on ziqi_main_scratch chap2 | ziqiyang107 | closed 1 month ago | 0 comments
#397 | Ziqi scratch | ziqiyang107 | closed 1 month ago | 0 comments
#396 | Ziqi scratch | ziqiyang107 | closed 1 month ago | 0 comments
#395 | Add MFU formula as reference material | rasbt | closed 1 month ago | 1 comment
#393 | first commit on chap1 readme | ziqiyang107 | closed 1 month ago | 0 comments
#392 | When I tried to fine-tune the llama3-8b into a classifier, there was a problem | YinSonglin1997 | closed 1 month ago | 7 comments
#391 | Add Llama 3.2 RoPE to CI | rasbt | closed 1 month ago | 0 comments
#390 | help or not??????????? | DanielRojas20 | closed 1 month ago | 1 comment
#389 | Introduce buffers to improve Llama 3.2 efficiency | rasbt | closed 1 month ago | 1 comment
#388 | fixed Llama 2 to 3.2 NBs | d-kleine | closed 1 month ago | 4 comments
#387 | LLama 3.2 1B model | d-kleine | closed 1 month ago | 3 comments
#386 | Add a note about weight tying in Llama 3.2 | rasbt | closed 1 month ago | 1 comment
#384 | Llama 3 | rasbt | closed 1 month ago | 1 comment
#383 | Implement Llama 3.2 | rasbt | closed 1 month ago | 1 comment