Improve Python linting situation (pin versions)

allisonkarlitskaya commented 9 months ago

We currently run ruff out of:

the unit-tests container
the tasks container

and recently had to deal with a pain in the neck of an upgrade which changed a bunch of options (adding lint to section names). The change was mutually-incompatible because the complaints to stderr got interpreted by test/static-code as a failure.

We ended up needing to upgrade the tasks container in a hurry, which itself broke a bunch of stuff, including linting in all of the other projects.

We need a better approach here. For the node-based linters, we get them as part of our version-locked-in node_modules, and have direct support over upgrading those. For Python we could do something similar via requirements.txt or PEP 621 dependencies, but we don't.

I've done a couple of rounds of thinking up an idea workflow for "how do we create a venv with a bunch of dependencies installed in it in order to run a test command" before realizing that I don't need to invent tox, because it already exists.

Here's some other info I've discovered from research so far, so we don't lose it:

dependabot supports updates of requirements.txt and PEP 621 dependencies
requirements.txt has a mechanism for specifying permitted hash values of packages and this mechanism is supported by dependabot, but as far as I can tell, PEP 621 doesn't have anything for this.
ideally we'd specify the exact versions of our linters via PEP 621 optional-dependencies but both tox and pip are missing support for installing just the dependencies, without installing the package itself.
- pip has this issue: https://github.com/pypa/pip/issues/11440
- tox has this issue: https://github.com/tox-dev/tox/issues/2301
  - someone went ahead and implemented a plugin: https://github.com/sclabs/tox-extras/blob/main/tox_extras.py
  - I tried dropping this into our source tree as toxfile.py and it more or less worked.
all of this is causing me to lean towards using something like requirements-lint.txt or so as the easier solution at this point in time
tox is happy to install from a requirements.txt-type file: you just list -rfile.txt as one of your dependencies. I haven't tested trying to enable "secure install mode" yet, although the docs suggest that one half of it comes automatically as soon as pip sees even one --hash in the requirements file.

martinpitt commented 9 months ago

My two cents: We mostly had/have this problem with cockpit.git, as ruff runs in the unit-tests container which gets updated at a pace independently from cockpit/tasks. It should never break spontaneously though, as unit-tests-refresh runs all the unit tests and is supposed to stop the container update if anything fails. I wasn't here when that thing happened, but if it did happen spontaneously, then the tests in -refresh are broken.

That said, I'd much prefer a single source of truth. All other projects run ruff (and test/static-code) in our cockpit/tasks container, so we could just do the same for our unit tests. cockpit/tasks even has cockpit's build deps installed (for the very reason that we want to build it both in CI and also as developers in toolbox).

So the smallest possible move would be to run all x86 unit tests in cockpit/tasks instead of our custom unit-tests container (done in PR #20005), and figuring out a robust environment for tox instead of pip installing it all the time.

That may also be the final nail in the coffin for testing on i386 -- frankly, it's almost useless these days. New stuff gets written in Python, nobody cares about i386 any more, and we run the unit tests during rpm build in packit, so we already have i386/aarch64/s390x coverage there. update: done in PR #20005

I must say I don't like the requirements.txt/tox/venv overhead at all. Too many moving parts, introducing yet more stuff to download all the time (aside from cluttering up $HOME). If fixing our projects for new ruff versions is that much of a problem (it hasn't been in my experience), we could install multiple versions into the tasks container as well (it's just a single static binary after all), and let projects choose (i.e. test/static-code could call a particular ruff-$VERSION and read it from test/ruff-version or so) -- but honestly it feels like overkill to me.

martinpitt commented 9 months ago

Another thing: So far our maintenance and update process for the tasks container sucks. It'd be much better to auto-refresh it weekly, push to a :staging tag, and then create a couple of "test new container" PRs (with a staging label) against our projects to sort out the failures without the "OMG we need to unbreak everything now". We already have this problem due to new browser versions and other things (chrome-devtools-protocol versions etc.), so adding new ruff into the mix doesn't make the problem fundamentally worse.

Building this "staging" infra isn't terribly hard: We need a new PR label, tests-scan to look at it and push stuff into staging-* AMQP queues, and bots to listen to the queues. To not make this inefficient we should finally move to launching one single-shot container per task, instead of iterating a loop inside of it. We have also wanted this forever for e.g. this problem, but it never bubbled up to the top of the list.

martinpitt commented 9 months ago

we should finally move to launching one single-shot container per task

Some notes:

Setting up a "queue controller" on the host shouln't take much. Mostly python (which we already need for Ansible), and python-pika for connecting to AMQP, everything else is included batteries.
If pika is a problem (CoreOS? But there we could still do pip install), we don't have to introduce assumptions on the host: we can run the "queue controller" in a tasks container, and give it access to the host's podman with
```
podman run -it --rm -v /run/podman/podman.sock:/run/podman/podman.sock:Z -u root quay.io/cockpit/tasks bash
```
Inside, dnf install --setopt=install_weak_deps=False podman, then you can do podman --remote images or any other podman command (I tested launching a container).
With that we can run a main loop in the queue controller, let it run up to TEST_JOBS parallel containers. Most of cockpit-tasks main loop code is cleanup, all of which can disappear with one container per job. It's essentially just calling run-queue.
We could give run-queue some --exec-prefix option to wrap the queue command into a podman [--remote] run ... command with the appropriate options. Let's maintain that script in cockpituous. It can also take the code from cockpit-tasks for initializing the secrets in the container.

martinpitt commented 8 months ago

We've spent a long time building the new CI infrastructure, but note that our recent encounter with this in #20149 was completely unrelated to our infra -- it started happening in the "tox" workflow, which dynamically installs the latest pip version from scratch. We need a solution for that, too, i.e. preferably run that in the tasks container too, and make it part of the unit tests?

allisonkarlitskaya commented 8 months ago

Yes. Since we follow this logic that the container is our single authority, we should read the container file from git and run tox in that container.

allisonkarlitskaya commented 8 months ago

https://github.com/cockpit-project/cockpit/pull/20154 went a long way towards addressing this. The last thing left here is to make sure we run our unit-tests workflows on github inside of the version of the container we find in .cockpit-ci/container.

cockpit-project / cockpit

Improve Python linting situation (pin versions) #19986