jsbaan / transformer-from-scratch

Well-documented, unit-tested, type-checked, and formatted implementation of a vanilla transformer, for educational purposes.

Implementing A Transformer From Scratch

To get intimately familiar with the nuts and bolts of transformers, I decided to implement the original architecture from Attention Is All You Need.

This repo accompanies the blog post Implementing a Transformer From Scratch: 7 surprising things you might not know about the Transformer. I wrote the post to highlight the things I learned in the process that I found particularly surprising or insightful.

Structure

Each Python file contains one or more classes related to the transformer. At the bottom of each file you will find the unit tests for those classes. The tests are executed simply by running the file (e.g. python transformer.py), and they run on every push to this repo using GitHub Actions. They serve two purposes. First, they are sanity checks that verify each class does what it should. Second, they are examples of how to use each class.
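As a rough illustration of this layout, here is a minimal sketch of a file that keeps its unit tests at the bottom and runs them when executed directly. The function name `sinusoidal_positional_encoding` and its test class are hypothetical and are not taken from this repo. The sketch uses plain Python lists rather than PyTorch tensors so that it runs without extra dependencies.

```python
import math
import unittest
from typing import List


def sinusoidal_positional_encoding(max_len: int, d_model: int) -> List[List[float]]:
    """Return a max_len x d_model table of sinusoidal position encodings.

    Hypothetical helper for illustration only; not part of this repo.
    """
    table = []
    for pos in range(max_len):
        row = []
        for i in range(d_model):
            # Dimensions 2k and 2k+1 share the frequency 1 / 10000^(2k / d_model).
            angle = pos / (10000 ** ((i - i % 2) / d_model))
            # Even dimensions use sine, odd dimensions use cosine.
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


class TestSinusoidalPositionalEncoding(unittest.TestCase):
    def test_shape(self) -> None:
        table = sinusoidal_positional_encoding(max_len=5, d_model=4)
        self.assertEqual(len(table), 5)
        self.assertTrue(all(len(row) == 4 for row in table))

    def test_position_zero(self) -> None:
        # At position 0 every angle is 0, so sine gives 0 and cosine gives 1.
        first_row = sinusoidal_positional_encoding(max_len=1, d_model=4)[0]
        self.assertEqual(first_row, [0.0, 1.0, 0.0, 1.0])


if __name__ == "__main__":
    # Running the file directly executes the tests defined above.
    unittest.main()
```

Saved as, say, positional_encoding.py (an assumed file name), running python positional_encoding.py would execute both tests. Reading the test methods also shows how the function is meant to be called.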

In practice, of course, please do use the official PyTorch implementation. This repo is by no means meant as an alternative: it is meant to help me (and hopefully you) better understand how transformers are actually implemented.

Features:

This repo contains the following files and features: