bclarkson-code / Tricycle

Autograd to GPT-2 completely from scratch

Fully functional training script #48

Closed bclarkson-code closed 4 months ago

bclarkson-code commented 4 months ago

Brace yourself, this is a big one. This is the first PR that contains a fully functional script for training a model on the Shakespeare dataset. It should definitely have been split into multiple smaller PRs, but I kind of got carried away. Changes include: