compserv / hknweb

The new HKN website (using Django, hopefully at hkn.eecs.berkeley.edu soon)
https://dev-hkn.eecs.berkeley.edu
MIT License

Modernizing tooling for future development #554

Closed oliver-ni closed 11 months ago

oliver-ni commented 12 months ago

Modernizing tooling for future development

relevant slack thread

Replacing conda + requirements.txt with Poetry

Currently, conda is used to manage requirements, and it installs requirements listed in a requirements.txt file. Because conda is mostly an environment manager and not a dependency manager, it doesn't work well for this use case — for one, it's very susceptible to breakage because dependencies aren't locked.

Furthermore, I couldn't even get the project running on my machine without a lot of effort spent wrangling the environment.

Poetry is one of the de facto standard tools for Python dependency management these days. It locks dependencies to ensure reproducible builds, makes adding and updating dependencies easy, and does proper dependency resolution to ensure we have compatible packages, so I've switched us over to it.

It's pretty simple to use — just poetry install to, unsurprisingly, install dependencies, and then poetry shell drops you into the project's virtualenv. It doesn't conflict with your Python installation and/or outside-of-Python stuff (as conda does due to its nature as an environment manager).

Disclaimer that you can use Poetry with conda if you want. Really, they're designed to do different things. You can use conda to manage your Python versions / global environments and Poetry to manage project dependencies.

Bringing dependencies up-to-date

Since I changed the system used for dependency management, I decided to go through and update all our dependencies too (well, some of them weren't pinned properly in the first place, so I didn't even know which versions those were running).

Django is now at version 4.2 LTS. I also updated bleach, gunicorn, django-autocomplete-light, and djangorestframework. I removed the dependencies on whitenoise and pillow, which were not being used. Additionally, I noticed that someone seemed to have started implementing social auth but never finished it, so I stripped that code and removed the dependency on social-auth-app-django. If we want to do that, we can start again in the future.

I've gone through all the release notes and updated our code where libraries had breaking changes. With some preliminary testing, I'm pretty sure everything works, but it would be nice if someone else could test a bit too.

This isn't just a nice touch to keep the site secure and up to date: a lot of the versions we were on were simply not compatible with my system no matter what I tried (especially since I am on Apple Silicon). I want to get the repository to a clean, workable state so it's easy for us to work on the site.

Updating the deployment/CI pipeline

Replacing conda affects deployment as well. I've installed Python 3.9 and Poetry outside of conda on OCF apphost, and updated the deployment scripts, run scripts, and CI workflow to use Poetry instead.

The commands are still the same: fab deploy with the HKNWEB_MODE environment variable set to either dev or prod to deploy to a local server or the production OCF server respectively.
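
To illustrate the HKNWEB_MODE convention described above, here is a minimal sketch of how a script could validate the variable and pick a target. This is not the repository's actual fabfile: the `DEPLOY_TARGETS` mapping and the `resolve_deploy_target` function are hypothetical names made up for this example.

```python
import os

# Hypothetical mapping for illustration only; the real fabfile may
# name or structure its targets differently.
DEPLOY_TARGETS = {
    "dev": "local server",
    "prod": "production OCF server",
}


def resolve_deploy_target(environ=os.environ):
    """Return a description of the deploy target selected by HKNWEB_MODE."""
    mode = environ.get("HKNWEB_MODE")
    if mode not in DEPLOY_TARGETS:
        # Fail loudly instead of silently deploying somewhere unexpected.
        raise ValueError(
            f"HKNWEB_MODE must be one of {sorted(DEPLOY_TARGETS)}, got {mode!r}"
        )
    return DEPLOY_TARGETS[mode]
```

Rejecting an unset or misspelled mode up front means a typo can't quietly fall through to a default target.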

Additionally, I reworked the GitHub Actions workflow to automatically deploy after running tests. This should make it a lot easier to update the website as every push to master will automatically deploy without needing to manually run the script. You can also see the status of the deployment on GitHub due to the new-ish environments feature.

Adding an optional Nix flake

Nix is a cool build system that puts an incredible emphasis on declarative, reproducible builds as well as declarative configuration. I've added an optional flake in the root of the repository. The flake reads the Poetry configuration and builds a development shell with all the dependencies (including native dependencies!) needed to run the project automatically.

I'm not expecting everyone to adopt Nix, which is why it's not required to run the project. The way I set it up, the Nix config is derived from the Poetry config, so you can just use Poetry directly. It's just a convenience for those who use Nix.

But Nix is cool! You should check it out :) :)


For reviewers

Hi! If you're able to take the time to review, let's try setting up a dev environment with the new tooling :)

  1. Install Python 3.9 (yes, this project requires this specific version :angry:) and Poetry
  2. Clone this repository and switch to the modernization branch
  3. Run poetry install[1]
  4. Run poetry shell to drop into the project's virtualenv
  5. Run python manage.py migrate
  6. Run python manage.py runserver and you should have a working server at http://localhost:8000!

If this doesn't work for you, please let me know and let's iron out the issues :)

I still need to update the Wiki but I don't think that's tracked in Git, so I'll update it when the PR is merged.

[1] This should usually work. However, if Poetry can't find the right version of Python, you may need to point Poetry at it with poetry env use /path/to/python3.9.
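
For anyone who would rather script steps 3 to 6, here is a rough sketch that builds the same commands in order. It is not part of the PR: `setup_commands` and `run_setup` are hypothetical helpers, and the sketch uses `poetry run` in place of an interactive `poetry shell` so that each command runs inside the project's virtualenv without a nested shell.

```python
import subprocess


def setup_commands(python_path=None):
    """Build the reviewer setup commands, in the order they should run."""
    commands = []
    if python_path is not None:
        # Footnote [1]: tell Poetry which interpreter to use if it
        # can't find Python 3.9 on its own.
        commands.append(["poetry", "env", "use", python_path])
    commands.append(["poetry", "install"])
    # `poetry run` executes a command inside the project's virtualenv.
    commands.append(["poetry", "run", "python", "manage.py", "migrate"])
    commands.append(["poetry", "run", "python", "manage.py", "runserver"])
    return commands


def run_setup(python_path=None):
    """Run each setup command, stopping at the first failure."""
    for argv in setup_commands(python_path):
        subprocess.run(argv, check=True)
```

Calling `run_setup()` from the repository root would install dependencies, apply migrations, and start the dev server. Pass the path to your Python 3.9 binary only if Poetry picks the wrong interpreter.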

ochan1 commented 12 months ago

Thanks, Oliver

Some history and how we use Conda: We use Conda since we couldn't install Python 3.9 on the OCF server and Conda allowed us to have as much of a separate environment as possible without a VM (which was not reliable a few years later).

Now that OCF has Python 3.9, the question is whether we still want Conda. From the PR description and a skim of the code, it sounds like Conda would no longer be used. Consider that a few years ago people could have said, "Yeah, we don't need to be able to upgrade easily, it won't matter much down the line," until it was too late. When we wanted to update to Python 3.9, Conda gave us that flexibility without going through OCF. Maybe five years from now we will need to upgrade to Python 3.11 and end up waiting and waiting for OCF, whereas with Conda we could upgrade ourselves right away.

In summary for Conda:

At least you were able to try it on OCF, which is a good thing too.

I'll give this a try later if the officers decide to go with it.

ochan1 commented 12 months ago

Additionally, in regards to locked dependencies, it is locked in the pip file with "==" for many of the dependencies

If they aren't, they should be.

oliver-ni commented 12 months ago

Thanks @ochan1. Some comments:

Some history and how we use Conda: We use Conda since we couldn't install Python 3.9 on the OCF server and Conda allowed us to have as much of a separate environment as possible without a VM (which was not reliable a few years later).

That makes sense as to the history of conda. If that's the case, we can continue to use conda to provide the right Python version, but use Poetry for dependencies. Or, we can just build Python from source (that's what I did & what it's running on right now). It's only a little more finicky.

Conda would then be an environment detail on apphost only and, hopefully, not mentioned anywhere in the codebase.

From the OCF side, we're working on upgrading apphost or coming up with a new container-based apphost solution soon, though, so maybe this won't matter in a few months.

Also, Poetry takes care of standardizing the dev environments — like conda, it manages a venv for each project that contains isolated dependencies from the rest of your system. It's actually more reproducible since everything's properly locked now.

Additionally, in regards to locked dependencies, it is locked in the pip file with "==" for many of the dependencies

This is true for direct dependencies, but Poetry takes care of locking transitive dependencies (of which there are like 30) to specific versions as well, in the poetry.lock file. With requirements.txt, if you want a truly reproducible build, you have to list out those as well, which we weren't doing. Poetry makes it effortless.
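
To make the distinction concrete, the sketch below flags entries in a requirements.txt that are not pinned with "==". It can only see what the file lists, so transitive dependencies stay invisible to it, and that is the gap a lock file fills. `unpinned_requirements` is a hypothetical helper, and the sample versions other than Django 4.2 are made up for illustration.

```python
import re


def unpinned_requirements(requirements_text):
    """Return names of listed requirements that lack an exact '==' pin."""
    unpinned = []
    for raw_line in requirements_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            # The package name ends at the first specifier, extra,
            # marker, or whitespace character.
            name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]
            unpinned.append(name)
    return unpinned


# Illustrative file contents, not the project's real requirements.txt.
sample = """
django==4.2
bleach>=6.0   # lower bound only, not an exact pin
gunicorn
"""
print(unpinned_requirements(sample))
```

Even if this check comes back empty, the packages that Django, bleach, and the rest pull in are still free to change between installs.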

Also, Poetry resolves dependencies in a more sophisticated way than pip alone (it makes sure that everything is compatible and looks around for combinations that work when they're not).
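
As a toy illustration of "looking around for combinations that work", here is a brute-force sketch over a made-up package index. It is not how Poetry's resolver is implemented (a real resolver is far smarter than trying every combination), and the package names, versions, and `resolve` function are all hypothetical.

```python
from itertools import product

# Hypothetical index: package -> {version: {dependency: allowed versions}}
INDEX = {
    "app-lib": {
        2: {"shared": {2}},
        1: {"shared": {1, 2}},
    },
    "other-lib": {
        1: {"shared": {1}},
    },
    "shared": {1: {}, 2: {}},
}


def resolve(index):
    """Return the first compatible version assignment, trying newer versions first."""
    names = sorted(index)
    # For each package, list candidate versions from newest to oldest.
    candidates = [sorted(index[name], reverse=True) for name in names]
    for combo in product(*candidates):
        chosen = dict(zip(names, combo))
        # A combination works if every chosen package's requirements
        # are satisfied by the versions chosen for its dependencies.
        if all(
            chosen[dep] in allowed
            for name, version in chosen.items()
            for dep, allowed in index[name][version].items()
        ):
            return chosen
    return None  # No compatible combination exists.


print(resolve(INDEX))
```

Here the newest app-lib needs shared 2, but other-lib only works with shared 1, so the search backs off to app-lib 1 and settles on version 1 of everything.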

Overall, it's just convenient. There will always be pain points with dependencies, but IME Poetry makes it suck the least.