cockpit-project / cockpit

Cockpit is a web-based graphical interface for servers.
http://www.cockpit-project.org/
GNU Lesser General Public License v2.1
11.34k stars 1.12k forks source link

Improve Python linting situation (pin versions) #19986

Closed allisonkarlitskaya closed 8 months ago

allisonkarlitskaya commented 9 months ago

We currently run ruff out of:

and recently had to deal with a pain in the neck of an upgrade which changed a bunch of options (adding lint to section names). The change was mutually-incompatible because the complaints to stderr got interpreted by test/static-code as a failure.

We ended up needing to upgrade the tasks container in a hurry, which itself broke a bunch of stuff, including linting in all of the other projects.

We need a better approach here. For the node-based linters, we get them as part of our version-locked-in node_modules, and have direct support over upgrading those. For Python we could do something similar via requirements.txt or PEP 621 dependencies, but we don't.

I've done a couple of rounds of thinking up an idea workflow for "how do we create a venv with a bunch of dependencies installed in it in order to run a test command" before realizing that I don't need to invent tox, because it already exists.

Here's some other info I've discovered from research so far, so we don't lose it:

martinpitt commented 9 months ago

My two cents: We mostly had/have this problem with cockpit.git, as ruff runs in the unit-tests container which gets updated at a pace independently from cockpit/tasks. It should never break spontaneously though, as unit-tests-refresh runs all the unit tests and is supposed to stop the container update if anything fails. I wasn't here when that thing happened, but if it did happen spontaneously, then the tests in -refresh are broken.

That said, I'd much prefer a single source of truth. All other projects run ruff (and test/static-code) in our cockpit/tasks container, so we could just do the same for our unit tests. cockpit/tasks even has cockpit's build deps installed (for the very reason that we want to build it both in CI and also as developers in toolbox).

So the smallest possible move would be to run all x86 unit tests in cockpit/tasks instead of our custom unit-tests container (done in PR #20005), and figuring out a robust environment for tox instead of pip installing it all the time.

That may also be the final nail in the coffin for testing on i386 -- frankly, it's almost useless these days. New stuff gets written in Python, nobody cares about i386 any more, and we run the unit tests during rpm build in packit, so we already have i386/aarch64/s390x coverage there. update: done in PR #20005

I must say I don't like the requirements.txt/tox/venv overhead at all. Too many moving parts, introducing yet more stuff to download all the time (aside from cluttering up $HOME). If fixing our projects for new ruff versions is that much of a problem (it hasn't been in my experience), we could install multiple versions into the tasks container as well (it's just a single static binary after all), and let projects choose (i.e. test/static-code could call a particular ruff-$VERSION and read it from test/ruff-version or so) -- but honestly it feels like overkill to me.

martinpitt commented 9 months ago

Another thing: So far our maintenance and update process for the tasks container sucks. It'd be much better to auto-refresh it weekly, push to a :staging tag, and then create a couple of "test new container" PRs (with a staging label) against our projects to sort out the failures without the "OMG we need to unbreak everything now". We already have this problem due to new browser versions and other things (chrome-devtools-protocol versions etc.), so adding new ruff into the mix doesn't make the problem fundamentally worse.

Building this "staging" infra isn't terribly hard: We need a new PR label, tests-scan to look at it and push stuff into staging-* AMQP queues, and bots to listen to the queues. To not make this inefficient we should finally move to launching one single-shot container per task, instead of iterating a loop inside of it. We have also wanted this forever for e.g. this problem, but it never bubbled up to the top of the list.

martinpitt commented 9 months ago

we should finally move to launching one single-shot container per task

Some notes:

martinpitt commented 8 months ago

We've spent a long time building the new CI infrastructure, but note that our recent encounter with this in #20149 was completely unrelated to our infra -- it started happening in the "tox" workflow, which dynamically installs the latest pip version from scratch. We need a solution for that, too, i.e. preferably run that in the tasks container too, and make it part of the unit tests?

allisonkarlitskaya commented 8 months ago

Yes. Since we follow this logic that the container is our single authority, we should read the container file from git and run tox in that container.

allisonkarlitskaya commented 8 months ago

https://github.com/cockpit-project/cockpit/pull/20154 went a long way towards addressing this. The last thing left here is to make sure we run our unit-tests workflows on github inside of the version of the container we find in .cockpit-ci/container.