bclarkson-code / Tricycle

Autograd to GPT-2 completely from scratch

static allocations #53

Closed bclarkson-code closed 3 months ago

bclarkson-code commented 3 months ago

Arrays are now statically allocated. This has enabled numerous optimisations that lead to a dramatic speedup: training smolGPT to a loss of 0.5 takes ~1000 s, which is bordering on PyTorch speeds (a proper comparison to come later). Additionally, memory usage has dropped considerably, and training now requires around 5 GB.
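
To illustrate the idea (this is a minimal sketch, not Tricycle's actual implementation, and it assumes a NumPy-style backend): instead of letting every operation allocate a fresh array on each training step, all intermediate buffers are allocated once up front and reused, e.g. via `out=` arguments.

```python
import numpy as np

batch_size, n_in, n_out = 32, 256, 128
rng = np.random.default_rng(0)

weights = rng.normal(size=(n_in, n_out)).astype(np.float32)

# Statically allocated buffers, created once before the training loop.
activations = np.empty((batch_size, n_out), dtype=np.float32)
grad_weights = np.empty_like(weights)

for step in range(100):
    inputs = rng.normal(size=(batch_size, n_in)).astype(np.float32)
    grad_out = rng.normal(size=(batch_size, n_out)).astype(np.float32)

    # Writing into preallocated buffers with `out=` avoids a new
    # allocation (and the associated allocator/GC pressure) per step,
    # and keeps peak memory fixed across the whole run.
    np.matmul(inputs, weights, out=activations)
    np.matmul(inputs.T, grad_out, out=grad_weights)
```

Because every buffer is created once, peak memory is known before training starts, which is consistent with the fixed ~5 GB figure above.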