alan-turing-institute / python-project-template


Template for Python projects

This setuptools-based template is designed to help you get started with a new Python project, or to migrate an existing codebase.

Based on the Scientific Python project template.


Setting up a new project

To use, install copier in your Python environment:

python -m pip install copier

Then, run the following command to start the template configuration (but replace my-package-name with the name of your package):

copier copy gh:alan-turing-institute/python-project-template my-package-name

The output will be written to a folder called my-package-name, which will be created if it doesn't already exist.

You will be prompted for the following information:

Great! Copier will now have created a new project in the directory you specified (in place of my-package-name), customized based on the information you provided.

Your new project will have been set up with the following structure:

my-package-name/
├── .github/
│   └── workflows/
│       ├── ci.yml
│       └── cd.yml
├── .gitignore
├── .pre-commit-config.yaml
├── CONTRIBUTING.md
├── CODE_OF_CONDUCT.md
├── LICENSE
├── README.md
├── pyproject.toml
├── src/
│   └── my_package_name/
│       └── __init__.py
└── tests/
    └── __init__.py

Here's a brief overview of the files and directories that have been created:

Migrating an existing project

If you're taking code you've already written and want to use this template, you'll need to perform the following steps:

Once you've set up your new project, you can start developing your package. There are some guidelines for development included in the CONTRIBUTING.md file generated in your project, but the main things are repeated below.

Python environment management

Note: I will not be covering Conda environments in this section. Conda is great when your project has dependencies outside of Python packages that you want to manage! If you're using Conda, you can still use this template – these are just recommendations for managing Python environments that don't affect the package itself.

Every project should have a Python environment set up to manage dependencies. You can use the built-in Python tool venv to create a new environment in the root of your project. To create an environment called .venv, run the following command in your terminal:

python -m venv .venv

Then, activate the environment:
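On macOS or Linux, the standard activation command is (on Windows, the script lives at .venv\Scripts\activate instead):

```shell
# Activate the environment created in .venv (macOS/Linux)
source .venv/bin/activate
```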

Your python command should now point to the executable in the .venv directory. You can check this by running which python, which should return a path inside .venv (e.g. my-package-name/.venv/bin/python).

Before installing your package to work on, I'd also recommend upgrading pip and setuptools in your environment to the latest versions, which can help avoid issues when installing dependencies later on. With the environment activated, run:

pip install --upgrade pip setuptools

Aside: Using uv for environment management

There's also the environment manager uv, which has gained a lot of traction by being really fast, and which includes some fun extras (e.g. one that could save your old projects that no longer work with newer package versions).

To install uv, follow the install instructions on the GitHub page, then run the following command in your terminal from the root of your project:

uv venv --seed

This will create a new virtual environment in the .venv directory, with the --seed option upgrading pip and setuptools to the latest versions automatically.

Activating the environment is the same as with venv:
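That is, on macOS/Linux (assuming uv created the environment in .venv as above):

```shell
# Same activation step as with a plain venv (macOS/Linux)
source .venv/bin/activate
```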

Then, to benefit from uv's speed, install your dependencies with:

uv pip install ...  # as you would with pip!

Installing your package in editable mode

After your environment is set up, you can then install your project in editable mode (so that changes you make to the code are reflected in the installed package) by running:

pip install -e .  # or uv pip install -e . if you're using uv

You can now import your package in other Python code with import my_package_name and use it like any other installed package. The editable part means that changes you make to the code in the src directory are reflected in the installed package immediately, so they're available to any code that uses your package while you're working on it.

Writing code and running tests

You're now ready to start developing your package! Add code to the src directory, tests to the tests directory, and run your tests with the pytest command to make sure everything is working as expected. Settings for pytest are included in the pyproject.toml file.
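As a minimal sketch (assuming pytest is installed in your environment; the test file here is illustrative):

```shell
# A trivial test file under tests/, run with pytest
mkdir -p tests
cat > tests/test_example.py <<'EOF'
def test_addition():
    assert 1 + 1 == 2
EOF

pytest tests/            # run everything under tests/
pytest -k "addition"     # or select tests by name
```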

Additionally, the automated CI pipeline will run the tests for you, but it's a good idea to run them locally as well to catch any issues before you push your code.

Formatting and checking your code

The tools for formatting and linting your code for errors are all bundled with pre-commit. Included are:

To have pre-commit check your files before you commit them, you can run the following command:

pre-commit install

This will set up pre-commit to run the checks automatically on your files before you commit them. It's possible that pre-commit will make changes to your files when it runs the checks, so you should add those changes to your commit before you commit your code. A typical workflow would look like this:

git add -u
git commit -m "My commit message"
# pre-commit will run the checks here; if it makes changes, you'll need to add them to your commit
git add -u
git commit -m "My commit message"
# changes should have all been made by now and the commit should pass if there are no other issues
# if your commit fails again here, you have to fix the issues manually (not everything can be fixed automatically).

It's also worth knowing how to lint your files outside the context of a commit. You can run the checks manually with:

pre-commit run --all-files

This will run the checks on all files in your git project, regardless of whether they're staged for commit or not.

Publishing your package

If you're ready to publish your package to PyPI (i.e. you want to be able to run pip install my-package-name from anywhere), the template includes a GitHub Actions workflow that will automatically publish your package to PyPI when you create a new release on GitHub. The workflow is defined in the .github/workflows/cd.yml file within the project_template folder.

First, if you don't already have a PyPI account, we'll follow these first steps from the Python Packaging User Guide:

The first thing you’ll need to do is register an account on TestPyPI, which is a separate instance of the package index intended for testing and experimentation. It’s great for things like this tutorial where we don’t necessarily want to upload to the real index. To register an account, go to https://test.pypi.org/account/register/ and complete the steps on that page. You will also need to verify your email address before you’re able to upload any packages. For more details, see Using TestPyPI.

Notice how the instructions mention TestPyPI, which is a testing environment for PyPI. This is a good place to start, since you can test publishing your package without affecting the real PyPI index. Once you're ready to publish to the real PyPI, you can follow the same steps again — just replace https://test.pypi.org with https://pypi.org.

We'll then need to set up trusted publishing, which allows us to publish to PyPI without the need for a username/password or API token. This template uses the PyPA GitHub action for publishing to PyPI, which gives us the following instructions:

This action supports PyPI's trusted publishing implementation, which allows authentication to PyPI without a manually configured API token or username/password combination. To perform trusted publishing with this action, your project's publisher must already be configured on PyPI.

After following these steps, we're basically there! You can now create a new release on GitHub using the "Releases" tab on the right-hand side, and the GitHub Actions workflow will automatically publish your package to PyPI using the commit associated with the release. Note: you'll need to update the version number in the pyproject.toml file before creating a new release, since PyPI won't allow you to publish the same version twice!

Make sure to follow semantic versioning when updating the version number. If it's your first version of the package, I'd recommend starting at 0.1.0, with 1.0.0 marking the production-ready release. While on 0.x, use minor versions (0.X.0) for new features and breaking changes, and patch versions (0.x.X) for backwards-compatible bug fixes; from 1.0.0 onwards, breaking changes bump the major version.
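The version lives in pyproject.toml, and can be bumped by hand or with a quick command. A hedged sketch (the 0.1.0 → 0.1.1 values are illustrative, and this uses GNU sed; on macOS, use sed -i ''):

```shell
# Bump the patch version in pyproject.toml before tagging a release
sed -i 's/^version = "0.1.0"/version = "0.1.1"/' pyproject.toml
grep '^version' pyproject.toml   # confirm the new version
```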

If this was your first time publishing, you will then be able to install your package from PyPI with pip install my-package-name. Yay!

Updating your project when the template changes

Copier has instructions on how to update a project to the latest version of its template, which I'll repeat here for completeness.

If you want to update your project with the latest version of this template, run the following command (make sure everything in your project is committed first, since the update will overwrite some files!):

copier update

Note that this is the purpose of the .copier-answers.yml file in the root of your project. This file is used by Copier to keep track of the answers you gave when you first created the project, allowing it to update the project correctly when you run copier update.

Inspiration

This template heavily draws upon the Scientific Python template to motivate choices for default options for linters, backends etc. If you haven't worked with any of the tools in this repo (e.g. pre-commit, ruff, mypy, pytest) or want to know more about Python project setup in general, then you'll benefit from reading the fantastic Scientific Python Development Guidelines.

It's worth noting that the original template also has support for many more backends, including the use of compiled extensions (e.g. in Rust or C++). The only reason this is not a fork is that I wanted to both simplify the options available, and make the hard switch to copier instead of cookiecutter as a templating engine, since it lets you natively update your project in-place to the latest template, even after you've worked on it for a while.