google-deepmind / optax

Optax is a gradient processing and optimization library for JAX.
https://optax.readthedocs.io
Apache License 2.0
1.65k stars 181 forks source link

[DOC] Add to the gallery an example on a small language model #866

Closed copybara-service[bot] closed 6 months ago

copybara-service[bot] commented 6 months ago

[DOC] Add to the gallery an example on a small language model

This example demonstrates how to train a small-scale transformer-based language model (inspired by NanoGPT) on the Tiny Shakespeare dataset. The core idea is to train a model that can predict the next character in a sequence of text based on the characters that came before it.