Project for training small LMs. Designed for training on SimpleStories, an extension of TinyStories.
From the root of the repository, run one of
make install-dev # To install the package, dev requirements and pre-commit hooks
make install # To just install the package (runs `pip install -e .`)
Suggested extensions and settings for VSCode are provided in .vscode/
. To use the suggested
settings, copy .vscode/settings-example.json
to .vscode/settings.json
.
There are various make
commands that may be helpful
make check # Run pre-commit on all files (i.e. pyright, ruff linter, and ruff formatter)
make type # Run pyright on all files
make format # Run ruff linter and formatter on all files
make test # Run tests that aren't marked `slow`
make test-all # Run all tests
Training a simple model:
python simple_stories_train/train_llama.py --model d2 --sequence_length 1024 --total_batch_size=4096
You may be asked to enter your wandb API key. You can find it in your wandb account settings. Alternatively, to avoid entering your API key on program execution, you can set the environment variable WANDB_API_KEY
to your API key, or put it in a
.env
file under the root of the repository.