freedomofpress / securedrop-builder

Packaging logic for building SecureDrop-related Debian packages
GNU General Public License v3.0
7 stars 11 forks source link

Support SDW components that manage dependencies using poetry #459

Closed legoktm closed 1 year ago

legoktm commented 1 year ago

We parse requirements/requirements.txt in a few places, those need to be updated to parse poetry.lock. This should be easier since it's a TOML file, but also that means we need a TOML library for bullseye since it's only in the standard library with bookworm.

Places needing updates (may not be complete):

eloquence commented 1 year ago

I'd be game to take a stab the necessary scripting changes identified here, unless you or another team member prefer to work on it.

To recap my understanding:

I would suggest implementing this, for now, in a backwards-compatible manner, so we don't have to migrate all repositories to Poetry at once. We could maybe try moving a small repo like securedrop-proxy over, allowing us to test the tooling changes. Once all components use Poetry, we can remove the backwards-compatibility code.

eloquence commented 1 year ago

I've got a poetry branch in securedrop-proxy. Next up I'll see if I can successfully build Debian packages from the poetry.lock requirements definition, building on your wip/poetry branch in this repo.

One thing I'm wondering is if we should migrate from setuptools to Poetry as a build system as part of switching to Poetry, but we can discuss that separately, for now I'm leaving the setuptools bits in place.

legoktm commented 1 year ago

One thing I'm wondering is if we should migrate from setuptools to Poetry as a build system as part of switching to Poetry, but we can discuss that separately, for now I'm leaving the setuptools bits in place.

Whatever ends up being simpler I think. Normally I'm in favor of jettisoning extra dependencies, but setuptools is key enough to the entire Python ecosystem that we're going to end up depending on it regardless. Also setuptools has a long history of being stable and reasonably well maintained.

build-sync-wheel - passes requirements.txt over to pip download

I was thinking we could just do pip download foo==0.0.0 bar==0.0.0 etc. but we've had issues with pip download in the past (e.g. https://github.com/freedomofpress/securedrop-builder/issues/457), so long-term we could also just download the tarballs ourselves? pypi URLs aren't easy to auto download from, but Debian has a redirector set up, so stuff like wget https://pypi.debian.net/setuptools/setuptools-68.1.2.tar.gz works.

eloquence commented 1 year ago

but also that means we need a TOML library for bullseye since it's only in the standard library with bookworm.

Turns out we're already installing the older toml library via build in the bootstrap logic: https://github.com/freedomofpress/securedrop-builder/blob/main/workstation-bootstrap/requirements.txt#L91

I've tried parsing securedrop-proxy's poetry.lock file with both toml, tomli and tomllib and the result is always the same, byte for byte. That said, I'd suggest we use tomli as a transitional library (until Bookworm/Python 3.11 gives us tomllib), since toml is still a developer release and does not seem to have fully implemented the TOML 1.0 spec (https://github.com/uiri/toml/issues/300). build itself appears to have switched to tomli.

eloquence commented 1 year ago

poetry-support has my current WIP; I've focused on the verify-hashes script for now since it's needed to get any build working. With those changes, and the only-poetry branch in securedrop-proxy, I'm able to build a securedrop-proxy package. I'll focus on the other scripts next, building on your existing WIP work.

I ended up just sticking with toml on Python 3.9 since it doesn't really seem worth adding an extra dependency.

As I go I'm updating some parts of the documentation and code comments that seem outdated or potentially confusing (e.g., I'm avoiding references to a "PyPI mirror" since I don't think that's the best terminology to describe our use of a local wheels directory without an index).

eloquence commented 1 year ago

Progress update:

Side quest:

eloquence commented 1 year ago

OK, I'm happier with utils.py now. Next up: build-sync-wheels, then I'll take a pass through the tests.

I was thinking we could just do pip download foo==0.0.0 bar==0.0.0

I'll try that for now.

eloquence commented 1 year ago

I took the easy way out for the source download step: poetry export seems to work just fine to generate the requirements.txt format. By default this excludes dev dependencies. The approach doesn't seem unreasonable to me; we're just asking poetry to speak pip's language when we use pip. In contrast, where we interpret the requirements ourselves, we parse Poetry's format directly.

This means we'll want poetry in the boostrap as well, but we'll probably need that sooner or later anyway. That's not in the branch yet.

From reading about pip download's surprising behavior (https://github.com/pypa/pip/issues/1884), I agree entirely that we should switch to direct download and verification. Having the download step execute package-specific code (!) seems unnecessary and likely to bite us again in future in the form of random errors. I would suggest we track this separately.

CI is now fully green on poetry-support but I want to expand the tests a bit to cover the new behavior before, finally, opening a couple of PRs. In the meantime, please don't hesitate to let me know if the approach in the poetry-support branch raises any concerns or if I've missed any major functionality.

eloquence commented 1 year ago

This means we'll want poetry in the boostrap as well,

.. he said foolishly, taunting the dependency demons. #464 looks like it'll be a good chunk of work and we may want to have "the conversation" first, of whether we really need to localwheel our localwheel-building environment (:xzibit:). In any event, I just added a smol utility function for now that should do the trick.

legoktm commented 1 year ago

poetry has a lot of dependencies (comparatively) as it doesn't vendor them like pip does, which means adding it into the bootstrap is going to be a lot of work. I would suggest we avoid needing to bootstrap poetry by just installing it from Debian packages.

The only problem with doing so is it only shipped in bookworm, so we either wait for us to drop bullseye support or implement #460 in a way that lets us run different steps of the process with different containers/OSes. So maybe in the short term we just try to avoid running poetry? :/

eloquence commented 1 year ago

Yeah, that's the conclusion I landed on as well - poetry-support branch now does the pip download step without using poetry. I agree, if we need it down the line in this repo, using the Bookworm version makes sense to me.