rasbt LLMs-from-scratch issues

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

https://www.amazon.com/Build-Large-Language-Model-Scratch/dp/1633437167

Other

34.12k stars, 4.18k forks

issues

04_optional-aws-sagemaker-notebook

#451 rvaneijk opened 3 days ago
1
Missing line of code if not mistaken

#447 AldawsariNLP closed 1 week ago
3
Bug in Exercise 5.6?

#444 SeriousJ55 closed 1 week ago
3
Improvement idea: MHA `d_out`

#443 d-kleine closed 1 week ago
1
[minor] typo & comments

#441 casinca closed 1 week ago
1
Fixed command for row 16 additional experiment

#439 d-kleine closed 2 weeks ago
0
Add flexible padding bonus experiment

#438 rasbt closed 2 weeks ago
0
Add utility to prevent double execution of certain cells

#437 rasbt closed 2 weeks ago
1
Add missing device transfer in optional gpt_generate.py code

#436 rasbt closed 2 weeks ago
0
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! in ch05/01_main-chapter-code/gpt_generate.py

#435 xfpg21421 closed 2 weeks ago
1
Small issue in notebook ch06.ipynb

#433 zia-hasan closed 2 weeks ago
1
Add "What's next" section

#432 rasbt closed 3 weeks ago
1
Incorrect explanation of the scaling of the non-dropped values in the mask

#431 chsharma27 closed 3 weeks ago
1
Dropout - activated value

#428 d-kleine closed 4 weeks ago
2
potential little fixes `appendix-D4 .ipynb`

#427 casinca closed 4 weeks ago
2
[typo] APPENDIX D.1 Learning rate warmup

#424 casinca closed 1 month ago
2
updated RoPE statement

#423 d-kleine closed 1 month ago
1
Fix incorrect argument name in LlamaTokenizer constructor

#421 rohanwinsor closed 1 month ago
1
minor fixes: Llama 3.2 standalone

#420 d-kleine closed 1 month ago
7
RoPE theta rescaling

#419 rasbt closed 1 month ago
1
Llama 3.2 standalone

#418 d-kleine closed 1 month ago
6
Toy example: Train on a dataset

#415 Iosifts closed 1 month ago
0
fixed typos

#414 d-kleine closed 1 month ago
5
Updated Llama 2 to 3 paths

#413 d-kleine closed 1 month ago
3
RoPE updates

#412 rasbt closed 1 month ago
5
RoPE - compute_rope mismatch of tensor dimensions

#411 rkinas closed 1 month ago
3
RoPE `inv_freq` code

#410 d-kleine closed 1 month ago
8
updates for PyTorch 2.5

#408 d-kleine closed 1 month ago
1
RoPE increase

#407 rasbt closed 1 month ago
1
Add mean pooling experiment to classifier bonus experiments

#406 rasbt closed 1 month ago
0
Test PyTorch 2.5

#405 rasbt closed 1 month ago
0
Note about SSL certificates

#404 rasbt closed 1 month ago
1
Update ch02.ipynb

#403 krouser closed 1 month ago
4
Best practices for memory efficient weight loading tutorial

#402 mikaylagawarecki closed 1 month ago
1
Memory efficient weight loading

#401 rasbt closed 1 month ago
1
Update bonus section formatting

#400 rasbt closed 1 month ago
0
commit on ziqi_main_scratch chap2

#398 ziqiyang107 closed 1 month ago
0
Ziqi scratch

#397 ziqiyang107 closed 1 month ago
0
Ziqi scratch

#396 ziqiyang107 closed 1 month ago
0
Add MFU formula as reference material

#395 rasbt closed 1 month ago
1
first commit on chap1 readme

#393 ziqiyang107 closed 1 month ago
0
When I tried to fine-tune the llama3-8b into a classifier, there was a problem

#392 YinSonglin1997 closed 1 month ago
7
Add Llama 3.2 RoPE to CI

#391 rasbt closed 1 month ago
0
help or not???????????

#390 DanielRojas20 closed 1 month ago
1
Introduce buffers to improve Llama 3.2 efficiency

#389 rasbt closed 1 month ago
1
fixed Llama 2 to 3.2 NBs

#388 d-kleine closed 1 month ago
4
Llama 3.2 1B model

#387 d-kleine closed 1 month ago
3
Add a note about weight tying in Llama 3.2

#386 rasbt closed 1 month ago
1
Llama 3

#384 rasbt closed 1 month ago
1
Implement Llama 3.2

#383 rasbt closed 1 month ago
1