taskcluster / taskgraph

Generates task dependency graphs for Taskcluster CI
Mozilla Public License 2.0
16 stars 42 forks source link

Move `image_builder` docker image into Taskgraph #155

Open ahal opened 1 year ago

ahal commented 1 year ago

Currently we use a special Docker image to build all the other Docker image tasks. But for historical reasons, this image lives in mozilla-central. Now that this repo is the repo of record, and since image tasks are a core feature of Taskgraph, we should migrate the image generation over to here.

ahal commented 5 months ago

@jcristau and I were having a bit of debate over this issue. I'll paste our transcript from Matrix here:

[jcristauJ 02:38 argh sorry i meant to comment on that PR 02:39 i'm skeptical that moving it out of central is a good idea. i think image_builder living in-tree is valuable. 08:27 happy to chat and try to explain, or i can write down some thoughts on the pr. but basically in m-c people other than releng can make changes and have them reflected automatically, there's a lot more users (both in terms of devs and downstream docker images) there than anywhere else, and it can get review from build peers that are at least as much of a stakeholder as us. 08:29 but also i'm not sure what having it in taskgraph buys us in practice

[ahal] 09:13 my plan for Taskgraph is to have a suite of images that all get automatically built + pushed to docker hub on release. We already have the decision and run-task images, my plan was to add image-builder and index to the mix. Ideally I'd also like to refactor it so that dependabot can handle dependency upgrades. Basically I want to make sure it never gets this outdated again.

I guess from a philosophical point of view, your argument would apply to Taskgraph itself. Maybe you feel we should have moved Taskgraph back into gecko_taskgraph rather than the other way around, which is fair.. but also that ship kind of already sailed 😛

So at this point, I think it makes the most sense for image_builder to live beside its sole* consumer

* once nss switches to Taskgraph and gecko_taskgraph.docker is merged into upstream

09:20 oh, also I was planning to make Taskgraph pin to the same version of the image as itself. So consumers would upgrade image_builder by upgrading Taskgraph (I haven't worked through the mechanics of that release process wise though)

[jcristauJ 09:22 to some extent it does apply to taskgraph, yes. but at least with taskgraph it's possible for gecko_taskgraph to do its own thing where it needs to. i fear that moving image_builder out of m-c would make it harder to confidently make changes to it, or would mean duplication. right now image_builder itself is built with the docker hub version, but all downstream images in m-c are built off the in-tree image_builder.

ahal commented 5 months ago

Those are good points. So for moving image_builder to Taskgraph we have:

Pros:

Cons:

ahal commented 5 months ago

Looking at the commit history glandium is the only person still at Mozilla who's made changes here

His last commit was also 4 years ago. Tbh, to me this is a sign that it's current location is not great

Tasks currently get the image_builder image from a build-docker-image artifact, this move would complicate that

Gecko could still define an in-tree Docker image that uses FROM mozillareleases/taskgraph-image-bulder:v8.2.0. This way tasks can continue to use an in-tree image for it and Gecko could pin to an image independent of the Taskgraph version (if we wanted to).

glandium commented 4 months ago

His last commit was also 4 years ago. Tbh, to me this is a sign that it's current location is not great

Last commit from 4 years ago is a reflection of the fact that we haven't needed to touch it because it just works. Every time we touch is spins everything, so it's better to avoid it unless really necessary. There hasn't been a case where we needed to change anything in the last 4 years.

But for historical reasons, this image lives in mozilla-central.

Well, yes an no. The full history is that the image definition was always in mozilla-central, and uploaded do docker hub. But back when we needed to make changes to the image, it was too painful to deal with the docker hub updates (also, it's not self-serve, and can't be tried on try). This is when we ended up with the current approach, which is to use the latest version that had been uploaded to docker hub to create the in-tree one. And all the other images use that one as a builder.

Needless to say, with no update in 4 years, it means the current image could have been uploaded to docker hub and used by m-c, and the current setup undone.

(I'm not even sure the image that is currently on docker hub uses kaniko)

BUT. Had we had done that, bug 1901033 would have been much harder (although arguably if the taskgraph repo has a setup to build the image on PRs, it wouldn't be that bad, but back when this setup was put in place, I'm not even sure taskcluster/github integration was a thing).

ahal commented 4 months ago

Thanks for the context. And yeah, totally understandable why things are the way they are. I suppose there's room for more automation here regardless of where the image lives.

This isn't something I have a ton of motivation to push on atm, so if we're not all in agreement I'll probably just shelve the PR until this comes up again.