This repository facilitates my research by enforcing modularity and housing various Transformer implementations.
Experiments are conducted in their own, separate repositories.
This implementation follows the Attention Is All You Need paper.