jobs: add `bodhi-trigger` job

jlebon commented 1 year ago

This job is a step towards better integrating our CI into the rest of Fedora QA.

The job triggers on Fedora messages from Bodhi whenever an update is created or edited. It then resolves the Bodhi update to a relevant FCOS stream and triggers the test-override job with the right parameters.

The implementation details here are a bit messy. Triggering on "a new Bodhi update" isn't as simple as I would've liked and mapping back to streams is a bit hairy, but it works!

There are of course a lot of cleanups we could do here (many are marked in the code), though I'm hoping we can get this in to gain some early experience and work on cleanups in parallel with the larger task of adding reporting and socializing it.

jlebon commented 1 year ago

I've tested this successfully in CoreOS CI.

AdamWill commented 1 year ago

on the whole, looks good for a first shot.

there is an awkward period of the release which this won't handle, though: the period between the "branch point" and the "updates-testing activation" point. That is, this period. At that time, the release has branched - so it's not Rawhide any more - but it still behaves almost exactly like Rawhide in Bodhi. Updates go straight from pending to stable (so long as they pass gating tests), there is no updates-testing. So to trigger tests on those updates you have to use the update.status.testing.koji-build-group.build.complete message, just like for Rawhide updates.

That will be rather difficult to handle with the approach you're using, I think :/ the approach openQA uses is to just always trigger on either message, but trigger in such a way that it only runs jobs if they haven't already been run. I don't know if that's something your scheduler can do.

jlebon commented 1 year ago

> on the whole, looks good for a first shot.
>
> there is an awkward period of the release which this won't handle, though: the period between the "branch point" and the "updates-testing activation" point. That is, this period. At that time, the release has branched - so it's not Rawhide any more - but it still behaves almost exactly like Rawhide in Bodhi. Updates go straight from pending to stable (so long as they pass gating tests), there is no updates-testing. So to trigger tests on those updates you have to use the update.status.testing.koji-build-group.build.complete message, just like for Rawhide updates.
>
> That will be rather difficult to handle with the approach you're using, I think :/ the approach openQA uses is to just always trigger on either message, but trigger in such a way that it only runs jobs if they haven't already been run. I don't know if that's something your scheduler can do.

Ahh thanks, this is a good insight. We have the branched stream in FCOS once Rawhide is branched (e.g. https://github.com/coreos/fedora-coreos-pipeline/pull/904). So one hacky approach is to key off of whether the branched stream is enabled and extend the regex of the first trigger to be (rawhide|f$N) if so.
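A minimal sketch of the regex idea above, written in Python purely for illustration. The helper name `release_pattern` and the assumption that the pattern is matched against a release label such as `rawhide` or `f40` are hypothetical; the actual trigger in the job may be expressed quite differently.

```python
import re


def release_pattern(branched_version=None):
    """Build the pattern for releases that behave like Rawhide in Bodhi.

    `branched_version` is the Fedora number of the branched stream (the
    `$N` in `f$N`), or None when no branched stream is enabled.
    Hypothetical helper, not taken from the PR.
    """
    names = ["rawhide"]
    if branched_version is not None:
        names.append(f"f{int(branched_version)}")
    return re.compile("(" + "|".join(names) + ")")


# With the branched stream enabled for Fedora 40, both labels match.
pattern = release_pattern(40)
print(bool(pattern.fullmatch("rawhide")), bool(pattern.fullmatch("f40")))
```

Passing `None` keeps the Rawhide-only behaviour, so the same code path could serve both before and after branching.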

AdamWill commented 1 year ago

well sure, but then the question becomes, when do you stop?

once the branched release reaches the "updates-testing activation" stage, it behaves like a stable release - updates go via updates-testing and you'd want to trigger on update.request.testing because you won't get an update.status.testing.koji-build-group.build.complete immediately any more.

jlebon commented 1 year ago

Gotcha. What's the source of truth for whether updates-testing was activated? A config file somewhere for Bodhi? We could query it.

> the approach openQA uses is to just always trigger on either message, but trigger in such a way that it only runs jobs if they haven't already been run. I don't know if that's something your scheduler can do.

We could do something like this too (e.g. query the jobs that were triggered). I was hoping to avoid this kind of state management but if the above doesn't pan out, we can certainly do this.
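For reference, the "trigger on either message, but only run once" idea could be sketched roughly as below. This is an in-memory illustration only: the class name `TriggerLedger`, the choice to key on the update ID plus its set of builds, and the example update ID are all assumptions. A real scheduler would need persistent state, or would query already-triggered jobs as suggested above.

```python
class TriggerLedger:
    """Remember which (update, builds) combinations were already triggered.

    Hypothetical helper: state lives in memory, so it is lost on restart.
    """

    def __init__(self):
        self._seen = set()

    def should_trigger(self, update_id, builds):
        # Key on the builds too, so an edited update with different
        # builds is tested again while a repeated message is ignored.
        key = (update_id, frozenset(builds))
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


ledger = TriggerLedger()
# Made-up update ID and build NVRs, for illustration only.
first = ledger.should_trigger("FEDORA-EXAMPLE-1", ["foo-1.0-1.fc40"])
again = ledger.should_trigger("FEDORA-EXAMPLE-1", ["foo-1.0-1.fc40"])
edited = ledger.should_trigger("FEDORA-EXAMPLE-1", ["foo-1.0-2.fc40"])
print(first, again, edited)
```

With this shape, both message types can feed the same handler and the second message for an unchanged update becomes a no-op.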

AdamWill commented 1 year ago

> What's the source of truth for whether updates-testing was activated?

oh, good question. I think there may be a property in the Bodhi release data you can use - composed_by_bodhi. I think that pretty much follows this process - it's false for things that are acting like Rawhide (no updates-testing, direct push to 'stable' when tests pass) and true for things that are acting like stable releases (updates-testing active). I'm not 100% sure, but I think that's how it goes.
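If that reading of `composed_by_bodhi` holds (it is explicitly unconfirmed above), picking the message to trigger on could look something like this sketch. The function name `trigger_topic` is hypothetical, the release is assumed to be a dict carrying a `composed_by_bodhi` key, and the topics are written in the shortened form used in this thread; the real message topics carry a longer prefix that is omitted here.

```python
# Shortened topic names as used in this discussion (prefix omitted).
RAWHIDE_LIKE_TOPIC = "update.status.testing.koji-build-group.build.complete"
STABLE_LIKE_TOPIC = "update.request.testing"


def trigger_topic(release):
    """Pick the topic to trigger on for one Bodhi release record.

    Assumes `composed_by_bodhi` is False while the release behaves like
    Rawhide (no updates-testing) and True once updates-testing is active.
    """
    if release["composed_by_bodhi"]:
        return STABLE_LIKE_TOPIC
    return RAWHIDE_LIKE_TOPIC


# Illustrative records only; the values are not real Bodhi data.
print(trigger_topic({"version": "branched", "composed_by_bodhi": False}))
print(trigger_topic({"version": "stable", "composed_by_bodhi": True}))
```

This would also answer the "when do you stop?" question without extra state: once the flag flips, the same release is simply routed to the other topic.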

dustymabe commented 1 year ago

Thank you for working on this!

AdamWill commented 1 year ago

Sorry, I think maybe I didn't mention Bodhi has an API you can use to check that property. Get https://bodhi.fedoraproject.org/releases/?rows_per_page=500 with content-type set to JSON and you get data on each release Bodhi knows about (until we have 500 of them...right now we have 65, so that won't be for a while). Check this simple test script:

```python
#!/usr/bin/python

import requests

rels = requests.get("https://bodhi.fedoraproject.org/releases/?rows_per_page=500").json()["releases"]
rels = [rel for rel in rels if rel["id_prefix"] == "FEDORA" and rel["state"] in ("current", "pending", "frozen")]
for rel in rels:
    print(rel["version"])
    print(rel["composed_by_bodhi"])
```

output:

```
eln
False
39
True
40
False
37
True
38
True
```

(edit: whoops, test script had a bug)

jlebon commented 1 year ago

OK, updated this now to handle https://github.com/coreos/coreos-ci/pull/49#issuecomment-1769679589!
