Open alextremblay opened 3 weeks ago
You may ask yourself: how do other projects accomplish these things? Answer: they don't. Python monorepos are quite rare, simply because it's very difficult to accomplish even half of these requirements in a single repo, let alone all of them.
As far as I am aware, we are breaking new ground here. I'm not aware of any existing python monorepo that accomplishes as many of these goals as we do. We may very well be on the cutting edge here, for better or for worse 🫠
I think we can get a better outcome for #1-4 by switching from `rye` to `uv`.

We need to make the switch anyway, as `rye` is being deprecated and `rye`'s developer is recommending users move to `uv`. And as of the latest release, `uv` has support for workspaces, which fits our monorepo multi-package requirements quite well.

Also, `uv` has a dependency override mechanism which would be a huge help to us. Given the massive size of our monorepo's dependency graph, it's not uncommon for two of our packages to pull in transitive dependencies with conflicting version constraints. When that happens, `rye lock` completely crashes, and we're forced to deep-dive and figure out how to untangle the mess, sometimes even forking a sub-sub-sub-dependency just to update its version constraints on the transitive dependency which caused the problem. It's a mess, and `uv`'s dependency override mechanism may be the solution we've been looking for.
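As a rough sketch (not tested config — the member globs and the overridden pin are hypothetical examples), the root `pyproject.toml` could declare both features like so:

```toml
# Hypothetical sketch of uv workspace + override config at the monorepo root.
[tool.uv.workspace]
members = ["projects/*", "custom-forks/*"]

# Force a single version of a transitive dependency whose constraints conflict
# across packages; uv applies overrides unconditionally during resolution.
[tool.uv]
override-dependencies = ["some-transitive-dep==2.1.0"]
```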
The tools repository is a monorepo containing a lot of uoft projects, all of which are pip-installable packages: some are python libraries meant to be imported, some are python CLIs meant to be installed by end users and executed, many are both, and many form a dependency graph (i.e. nautobot depends on aruba+ssh+bluecat, scripts depends on bluecat+ssh+librenms, and all of them depend on core).
For the monorepo and our python packaging / dependency management tooling, we have the following goals:
1. We should be able to create custom forks of external projects we depend on (to fix bugs or add features) and be able to reliably reference those forks until the changes we make get upstreamed. We should be able to track these custom forks in separate git repos (forks of the originals) and add those forked repos as submodules of our monorepo
2. Any developer wanting to contribute should be able to pull down the repo and run a command to set up a venv and install all of our packages as well as all their dependencies, pinned to specific versions (through use of a lock file)
3. Any end user who wants to install any of our projects should be able to pull down the repo, run `pipx install projects/<name>`, and have it just work:
    1. when one of our projects depends on another (for example, aruba depends on `uoft_core`), pip/pipx should install `uoft_core` from the path `../core` relative to the aruba project folder, instead of installing `uoft_core` from PyPI
    2. when one of our projects depends on one of our custom forks (for example `nautobot`), pip/pipx should interpret that dependency as `nautobot @ git+https://github.com/utsc-networking/nautobot`
4. Users should be able to install any of our projects from any branch of our git repo by calling `pipx install git+https://github.com/uoft-networking/tools@branch_name#subdirectory=projects/<name>` and have it behave the same way as #3
5. We should be able to build python wheels suitable for publishing to PyPI, which reference dependencies by bare name instead of by relative path, and which reference custom-fork dependencies by git URL (as described in #3.2 above)
6. (optional bonus) I would love to be able to restructure our monorepo so that all code for all projects lives in a single source tree and gets automatically broken up into PEP420-style namespaced packages (i.e. instead of having a package called `uoft_core` whose code lives in `projects/core/uoft_core` and a package called `uoft_aruba` whose code lives in `projects/aruba/uoft_aruba`, I'd love to have a package called `uoft.core` whose code lives in `src/uoft/core` and a package called `uoft.aruba` whose code lives in `src/uoft/aruba`)
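To make goal #6 concrete, here's a tiny self-contained demonstration of the PEP420 implicit-namespace mechanism it relies on (throwaway temp dirs, not code or paths from the repo): two packages living in completely separate directory trees, with no `__init__.py` at the `uoft` level, still import as one `uoft.*` namespace.

```python
import sys
import tempfile
from pathlib import Path

# Simulate two separately-installed distributions, each contributing one
# sub-package to the shared "uoft" namespace.
root = Path(tempfile.mkdtemp())
for part in ("core", "aruba"):
    pkg = root / part / "uoft" / part  # e.g. <tmp>/core/uoft/core/
    pkg.mkdir(parents=True)
    (pkg / "__init__.py").write_text(f"NAME = 'uoft.{part}'")
    # Note: no __init__.py in <tmp>/<part>/uoft/ — that makes "uoft" a
    # PEP 420 namespace package whose __path__ spans both sys.path entries.
    sys.path.insert(0, str(root / part))

import uoft.core
import uoft.aruba

print(uoft.core.NAME, uoft.aruba.NAME)  # prints: uoft.core uoft.aruba
```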
Each of these goals is easy to accomplish on its own, but accomplishing ALL of them together is extremely hard.
To accomplish #1 and #2, we use `rye`, which automatically installs all projects in `projects/*` and `custom-forks/*` in editable mode. The downside to this is that every developer on our monorepo must install every project and every dependency in their venv, even if they only want to work on one small project.

To accomplish #3 and #4, we've structured all of our projects as PEP517-compliant python packages, with project metadata defined in a `pyproject.toml` that lives alongside each project. Each of these packages uses a PEP517 build backend called `hatchling` to tell pip/pipx how to install the package, and each project has a `pyproject.py` hooks file that gets called by hatchling, allowing us to automatically convert bare dependencies on monorepo projects into relative-path dependencies. For example, the aruba project's `pyproject.toml` declares that `uoft_aruba` depends on `uoft_core`. When pip/pipx installs `uoft_aruba`, `projects/aruba/pyproject.py` is triggered, and it automatically rewrites that `uoft_core` dependency into `uoft_core @ ../core`.
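As a rough illustration of the rewrite these hooks perform (hypothetical standalone code, not the actual `pyproject.py` contents — the real hooks plug into hatchling; the package names and paths here are examples):

```python
from pathlib import Path

# Illustrative map of monorepo package names to their location relative
# to a given project folder (hypothetical subset).
MONOREPO_PACKAGES = {
    "uoft_core": "../core",
    "uoft_ssh": "../ssh",
}

def rewrite_dependency(dep: str, project_dir: str) -> str:
    """Rewrite a bare dependency on a monorepo package into a PEP 508
    direct reference pointing at its on-disk location; leave external
    dependencies untouched."""
    # crude name extraction: strip extras/version specifiers
    name = dep.split(" ")[0].split("==")[0].split(">=")[0]
    if name in MONOREPO_PACKAGES:
        target = (Path(project_dir) / MONOREPO_PACKAGES[name]).resolve()
        return f"{name} @ {target.as_uri()}"
    return dep

# e.g. for projects/aruba, "uoft_core" becomes "uoft_core @ file:///.../core",
# while "requests>=2.0" passes through unchanged.
```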
To accomplish #5, I've tried to add logic to the `pyproject.py` hook files to skip the dependency rewriting when building wheels, but it does not work and is difficult to debug, so more work is needed there.

\#6 WOULD be possible to accomplish, but not in a way that's compatible with #1, #2, #3, or #4. The only way I can think of to accomplish it compatibly would be to replace hatchling, the `pyproject.py` files, and the per-project `pyproject.toml` files with a custom in-repo PEP517 build backend. The idea is complicated and still not fully formed in my mind, but it's there.
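For what it's worth, PEP517 does standardize a hook for exactly this: `backend-path` lets a build frontend load an in-tree build backend. Hypothetically (the module name is invented, and this assumes the single-source-tree layout from #6 with one root `pyproject.toml`), it could look something like:

```toml
# Hypothetical root pyproject.toml using PEP 517's in-tree backend support.
[build-system]
requires = ["hatchling"]        # the custom backend could delegate to hatchling
build-backend = "uoft_build"    # invented name for the in-repo backend module
backend-path = ["_build"]       # in-tree directory containing uoft_build.py
```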