Proposal for new u-c-debian automation

iskunk commented 5 months ago

Hello all. At long last, I've had an opportunity to read up on / experiment with OBS, and have put together a design proposal for a release-automation system to replace the current implementation and practice in this repo.

One important difference is that the new approach will be fully automated, requiring no manual intervention except when occasional patch/dependency conflicts arise. Another is that it uses the Debian adaptation of the Chromium source as the starting point, rather than the original Google upstream. (This relieves us of the burden of adapting Chromium to Debian, which the Debian package maintainers already do.)

The entire process begins with an e-mail notification from Debian, indicating that a new version of their chromium package has been released, and ends with corresponding binary u-c packages in the OBS download areas. Here is an overview:

tracker.debian.org sends out a chromium source-upload notification to an e-mail address that we have previously subscribed on the site. The message looks like this.
Because (as I understand) we cannot initiate a GitHub Action with an arbitrary e-mail message, we have the message sent to a service like ProxiedMail where we can make it trigger a webhook.

(This can also be done with GMail, but I think everyone will agree that would not be a good idea.)
The webhook kicks off the GitHub Action here.
The GitHub Action downloads the newly-minted chromium source package from the Debian repos, either incoming.debian.org (for unstable uploads) or security.debian.org (stable security updates). This consists of three files: an ~800 MB .orig.tar.xz original source tarball, a .debian.tar.xz "debianization" tarball, and a .dsc source control file. The source package PGP signature and checksums are verified.
The GitHub Action downloads the latest u-c-debian source-conversion framework, and the u-c repo proper. If the latter does not have a commit matching the Debian package's upstream-source version, then the process stops---to be re-started later at this step when u-c is updated.

(There would also be entry points at this step to allow for fixes to u-c and/or u-c-debian, as needed.)
The GitHub Action unpacks the sources, applies the u-c source conversion, and builds a new source package from the result. (Note that the .orig.tar.xz file remains exactly the same; all the necessary changes are in the other two files.)

The newly-created source package should be signed, although how the signing key should be managed in the secrets framework remains to be determined.
(Optional) The GitHub Action performs a partial test build of the package. This is no more than initiating a build, and letting it run for e.g. twenty minutes. If the build is still going at the end of that time, then it is stopped and the test passes. This is intended to catch the significant portion of build issues that occur early on in the process.
The GitHub Action posts the newly-created .debian.tar.xz and .dsc files as a release. (The .orig.tar.xz file is not posted, not only because it's huge, but also because we didn't modify it.)
The GitHub Action prepares and pushes a new _service file to OBS. This file invokes the download_url and verify_file services for the three source package files: the .orig.tar.xz file is downloaded from Debian, and the other two are downloaded from the GitHub release area. SHA-256 hashes are provided and verified for all three.
Once OBS has the three source-package files, it begins the build. If and when the build completes successfully, OBS will post the new binary packages for user download.

I wanted to put together a prototype of this, but unfortunately, my fork of this repo does not appear to have access to a GitHub runner. I do have a test build up on OBS, which was helpful in verifying how some aspects of the service work. And I have a similar process to the above already up and running for three browser packages on the XtraDeb PPA repo: regular chromium, firefox, and ungoogled-chromium.

Note that Debian has at least two separate tracks for chromium uploads: unstable/sid, and stable (currently bookworm). Until recently, oldstable (bullseye) was also supported. I think we should provide builds for all tracks supported by Debian, as different users may have one over the other installed.

This would mean, however, that it isn't enough for the ungoogled-chromium OBS project to have just an ungoogled-chromium-debian area; we would need e.g. ungoogled-chromium-debian-bookworm and ungoogled-chromium-debian-unstable. (Because the OBS project already provides the "ungoogled-chromium" context, I would also advise dropping that prefix from the names of the OBS package areas.)

Also, because the Raspbian package sources are identical to Debian's, adding builds for that distro from the same source is basically a freebie. (Debian itself has a half-dozen other architectures that could be enabled as well, though that should probably be driven by user demand. I'm not clear on the exact differences between Debian's armv7l support versus Rasbian's, but I figure most RaspberryPi users will want the latter.)

One additional wrinkle: It will be necessary to set up a subproject providing some build dependencies. At this time, the chromium package for stable lists rustc-web as a build-dep, yet that package does not appear in the Debian package indexes. And the reason for that is explained here: the package is currently being served from stable-proposed-updates, likely until the next point release.

Normally, a requirement like this should be covered by an additional repository path in the OBS Debian_12 download-on-demand area. But OBS currently offers only three options there: standard, update, and backports. None of which provide rustc-web at present.

I e-mailed the listed project maintainers about this, but have received no response. So our only alternative is to download and build rustc-web ourselves, as I did here. (If anyone knows of a better way to bring this issue to SUSE's attention, please let me know.)

Lastly, please note that I have not forgotten about Ubuntu support. It is trickier to address, however, because it involves a non-trivial Debian-to-Ubuntu conversion step that is outside the scope of the u-c project. So I will bring it up once the Debian side is sorted out.

Thoughts? @networkException @Eloston @PF4Public

PF4Public commented 5 months ago

@iskunk Your proposal is rather convoluted. Would it really be easy on the maintenance side? As far as I remember Debian was always behind in terms of updating to the latest chromium, perhaps version numbers might not align perfectly. All this might bring maintenance nightmares :( Have you attempted a kind of toy-implementation yourself?

iskunk commented 5 months ago

Hi @PF4Public,

Note that I have all of this already running since about February. The differences are that (1) it's running on my own server, (2) it handles more than just Chromium, and (3) it provides packages (including u-c) for XtraDeb. The point of this proposal is to set up a similar process for u-c here on a public runner, feeding the OBS builds.

Debian's releases do lag Google's by a bit, but that's unavoidable. I have not seen any issue with version misalignments... Debian obviously packages the Linux release rather than the Windows one, but u-c has always put out a tag matching the Linux release, so there hasn't been a problem. I do notice that sometimes, the Debian package comes out before the u-c tag, and more often it's the other way around, so the automation obviously needs to handle both scenarios. But that's not hard to deal with, when you know to expect it.

As for maintenance work, >90% of it has been in a step not even covered here: the Debian-to-Ubuntu conversion. That's non-trivial, and it often breaks with new major versions. But that's not in scope of the basic automation proposal here, because it's targeting Debian alone. Ubuntu support can easily be added later.

That aside, there hasn't been much. Sometimes, the u-c-debian/convert/ framework needs to be updated because of a conflict between the Debian and u-c patches, but that has come up very occasionally (last times were for 121.0.6167.85 and 119.0.6045.105, and we're now at 123.0.6312.105). Sometimes, the Debian tracker notification comes in corrupted, and fails the PGP check, but I have an infrequent polling check to cover any sort of e-mail failure. Sometimes, there's a bug in my automation scripts, but as time goes on, the number of those converges asymptotically to zero.

If you feel the above is more complicated than it needs to be, I'm open to counter-proposals, of course. But two important qualities that I feel should be retained are

That the system be fully automated, so that in the common/ideal case, u-c releases can go out the door without any human intervention. This not only reduces the maintainer burden, but allows new versions to be provided with as little turnaround time as possible. (This naturally implies having strong authentication in the early steps, to prevent a bad actor from fooling the automation into serving their ends. The security must be no worse than Debian's.)
Polling should be used as sparingly as possible. Not only does polling for a new release imply a worst-case latency up to the polling period, it uses the resources of a public project in a manner that can be avoided, and is thus not a friendly behavior if done frequently.

Also, separately from the discussion of how this should be implemented, I currently don't have a way of prototyping it. Is there a way that my fork of the repo can get access to a GitHub runner? (The GH docs indicate that self-hosted runners should only be used with private repos, since otherwise a malicious actor who creates a fork can run bad stuff on the self-hosted runner.)

PF4Public commented 5 months ago

If you feel the above is more complicated than it needs to be, I'm open to counter-proposals, of course.

Sorry if I sounded that way, I didn't mean it. Given that we don't have someone who permanently maintains Debian, your automation looks promising. You did a great job describing your idea, one thing was not very clear for me. What does it mean for ungoogled-chromium in terms of actions needed. Could you please give a quick overview of what needs to be done (in this repository in particular) step-by-step in order to implement your idea? I'm not very familiar with OBS, perhaps this could be the reason for your idea not being obvious for me.

iskunk commented 5 months ago

Thanks, I'm not a fan of making things any more complex than they need to be. And it's especially important here, since transparency is ultimately the whole point.

I was told that GitHub-hosted runners are freely available, and I checked the "Actions" section of my fork of the repo again. Maybe it wasn't there before, maybe I overlooked it, but I'm now seeing a "GitHub-hosted runners" tab. So that much unblocks my work for now; I can actually start putting together a prototype of this.

The other thing I need for now is just feedback on the proposal. In the absence of that, I can only presume that everyone's OK with the approach described above. This is the time for any objections and/or course corrections at the design level.

Once the prototype is complete, it should be a matter of (1) submitting the new scripts as a PR, and (2) having the u-c principals create and configure their own ProxiedMail account (or whatever service ends up getting used). Once both are in, everything should (in theory) be armed and ready for automated uploads.

PF4Public commented 5 months ago

I can actually start putting together a prototype of this.

That'd be great!

The other thing I need for now is just feedback on the proposal. In the absence of that, I can only presume that everyone's OK with the approach described above. This is the time for any objections and/or course corrections at the design level.

Exactly with that in mind it would be helpful to see an overview of your supposed workflow. It would ease identifying all the moving parts.

having the u-c principals create and configure their own ProxiedMail account (or whatever service ends up getting used)

I have my concerns here. Is there any other way for triggering apart from using mail? It would be much easier if your workflow could be adapted to be run in the existing infrastructure: GitHub Actions, OBS without involving anything extra. Perhaps a polling GitHub action could be an acceptable substitute for a mail service?

iskunk commented 5 months ago

Exactly with that in mind it would be helpful to see an overview of your supposed workflow. It would ease identifying all the moving parts.

The overview is up there; the implementation is just going to be an elaboration of that.

I have my concerns here. Is there any other way for triggering apart from using mail?

Debian does not provide any means of notification of new package uploads (that I'm aware of) apart from their tracker. It would be nice if they offered e.g. a webhook service.

Perhaps a polling GitHub action could be an acceptable substitute for a mail service?

I have polling as a fallback mechanism, for when the e-mail notification fails. But that would be something like every four hours. I could poll Debian more frequently, but it would be abusive to poll them, say, every five minutes. Anywhere in that range, you have a tension between (1) checking more frequently, reducing average turnaround time on updates, but using more resources, and (2) checking less frequently, being more "polite" and parsimonious, but updates (including urgent security updates) take longer to go out.

Is it better to have to deal with that tradeoff, than a new element of infrastructure?

Bear in mind, this has worked remarkably well in my production setup. For example, Debian recently released version 123.0.6312.105-2 of their sid/unstable Chromium package. I received the source-accepted e-mail from their tracker this past Sunday at 18:47 UTC. My automation kicked in, uploaded the Ubuntu-converted package to Launchpad, and I got the e-mail notice of Ubuntu's package acceptance (which comes in well after the upload is complete) at 19:27 UTC.

Note that, for non-security uploads, Debian's builds start after the acceptance notice goes out (I download the source package from their incoming server that is used by the build nodes, not the regular package repos). Meaning, my package started building while Debian's builds were in progress. That's the kind of rapid response that I aspired to, and that I am advocating here.

(It doesn't always happen, of course, but that's generally down to the Debian-to-Ubuntu conversion breaking. Then I have to go in, and fix it, and sometimes that takes more effort/time than others. It's a contigency for which no automated solution exists.)

PF4Public commented 5 months ago

Is it better to have to deal with that tradeoff, than a new element of infrastructure?

I'd say "yes". We already poll googleapis in main repo every hour without significant hurdles: https://github.com/ungoogled-software/ungoogled-chromium/blob/master/.github/workflows/new_version_check.yml

That's the kind of rapid response that I aspired to

That's understandable, but much longer delay would be also acceptable here.

iskunk commented 5 months ago

Polling Google is a whole different ballgame. Every five minutes would be fine then. It's an issue with Debian because they're a shoestring operation.

I can leave out the e-mail portion, but I don't understand the reluctance to making such an easy optimization.

PF4Public commented 4 months ago

I don't understand the reluctance to making such an easy optimization.

Mainly the bus factor that arises from the need to register at some third-party service.

iskunk commented 4 months ago

Mainly the bus factor that arises from the need to register at some third-party service.

Nothing requires that one person hang onto the login credentials. That could be stored as a secret, or maybe some kind of private area accessible only to org members. (I'm not clear on what GitHub provides in this vein.)

Anyway, I've been digging into the GitHub Workflows docs, and experimenting. (This is something I need to learn anyway for work, so it's a useful exercise.) There are a few problems with the polling approach particular to GitHub:

From their doc: "In a public repository, scheduled workflows are automatically disabled when no repository activity has occurred in 60 days."

This repo can easily go that long without being touched, given how the automation is being designed to minimize intervention. GitHub is clearly not amenable to the "set and forget" kind of cron job.
Speaking of cron, that is apparently the only syntax GitHub supports for scheduling the workflow. I'd want this to run (say) every three hours, and I don't care when within that three-hour window it runs, as long as over time the interval averages out. But because of the cron syntax, I can't express that margin of flexibility; I have to choose a specific minute when to run.

Many folks seem to have the problem of their scheduled workflows starting well after the specified time (or at all), and this is often a result of scheduling their workflows at the top of the hour, without realizing that it's a peak load period for exactly that reason. I could choose e.g. 17 minutes past the hour, and that should be an improvement. But is that really when the load is lowest? I have to guess, and if I guess wrong, the workflow might not run when we need it to.
Even though the poll operation is very small, it still requires spinning up a full-size runner that could do much more. There is no ubuntu-micro option or the like. Which only exacerbates the previous point.
And even though the poll operation is quick, taking only a couple seconds at most, GitHub rounds up the runner usage to the nearest minute. Those two seconds get "billed" as one minute. So that makes this approach artificially expensive, and as the linked issue shows, GitHub hasn't exactly been falling over themselves to address that.

iskunk commented 4 months ago

After further investigation, I've encountered some obstacles with the approach I had in mind:

The ProxiedMail service makes it possible for a received e-mail message to trigger a webhook. But the webhook payload format is fixed, intended for use with a endpoint customized to consume that format. (They provide sample code to integrate into "your Web application.") If you want to hook into an existing endpoint with its own defined format, like GitHub's repository_dispatch event, you're out of luck.
Zapier is an automation integration service, similar to IFTTT. You can put together an e-mail-to-webhook setup with them. But the webhook piece is considered premium functionality, not available in their free tier. The next tier up is US$20 per month.
IFTTT itself provides only a single global address for e-mail automation. Which automation to invoke is identified by the sender, not the recipient address. So this service cannot be driven by a third-party e-mail source.
I have not found any other online service that can provide the necessary e-mail-to-webhook functionality.

So it seems that a different approach is going to be needed.

The traditional solution to this problem is a Unix shell account that can receive mail (running in an always-on environment, obviously). Set up procmail(1) or a more modern equivalent, a cron job, and a bit of scripting, and that'll take care of this completely. I can provide the whole setup.

The question, then, is this: Do any core team members have a Linux/Unix shell account on which they are willing to host this monitoring function?

If no one does, then there are a few places online that will provide such accounts for free. On option I came across is freeshell.de. From their About page: "Our primary focus is on fostering anonymity, the free flow of information, and the promotion of freedom of speech." And obtaining an account only requires sending in a postcard.

(Note that it's not a big deal if someone who owns the operative shell account leaves the project for whatever reason. As long as someone else steps in, the GitHub endpoint can be triggered from a different source. We can always revoke the original auth token and issue a new one.)

Finally, as a remaining option, I could just hook in the new automation to my own Debian-monitoring system. It defeats the purpose of having an independent setup on the u-c side, but maybe that's not as big a deal as I think. (This automation would allow for a manual invocation in any event, so that will remain as a lowest-common-denominator option, even though in practice no one's going to want to do things that way.)

networkException commented 4 months ago

Perhaps a polling GitHub action could be an acceptable substitute for a mail service?

I would like to point out that we already use this in the main repository for notifying us about a new release, notably only every six hours.

If being up to date is really a concern it can run every 10 minutes for all I care, but generally (also for security releases) I refer to the exceptions Google sets in their own release blog "Stable channel will [...] roll out over the coming days/weeks"

From their doc: "In a public repository, scheduled workflows are automatically disabled when no repository activity has occurred in 60 days."

I assume the action will self commit version bumps? That should be enough

Even though the poll operation is very small, it still requires spinning up a full-size runner that could do much more. There is no ubuntu-micro option or the like. Which only exacerbates the previous point.

Yes this is the whole "We rewrote our backend to be a monolithic stack and it outperformed the previous ludicrous AWS pipelines / lambda functions setup" situation but honestly for GitHub us spawning another VM in their Azure cluster is just a rounding error.

Those two seconds get "billed" as one minute. So that makes this approach artificially expensive

Perhaps I'm missing something but to my knowledge is the policy has been (for a long while) and still is that open source projects (just public ones - source available I guess) don't pay for CI. Otherwise we couldn't afford any of this, there's nobody paying for the project (and accepting money would be dangerous given the current project name).

On the one hand that means that builds are obnoxiously slow, on the other hand spawning a VM and pulling some OCI image over their prebuilt ubuntu ISO doesn't hurt either.

I might have missed something crucial here (its 2:30am) but please consider using this approach. I have no issue with hosting long running processes on my personal infra (unless its chromium builds because I don't have an EPIC in a colocation) but not unless it can be avoided by using easily traceable publicly configured CI

iskunk commented 4 months ago

If being up to date is really a concern it can run every 10 minutes for all I care, but generally (also for security releases) I refer to the exceptions Google sets in their own release blog "Stable channel will [...] roll out over the coming days/weeks"

Choosing a polling period is often more of an art than a science, given the many considerations involved. But the way GitHub "bills" the time certainly pushes it to longer intervals. Which aren't ideal from a security standpoint.

I assume the action will self commit version bumps? That should be enough

There's nothing to commit for normal version updates. Nothing in the repo changes that is specific to a Chromium release (unless it's the occasional update to fix breakage, as I've already been doing).

but honestly for GitHub us spawning another VM in their Azure cluster is just a rounding error.

It's an issue if there's a lot of contention, hence the "street knowledge" of not running scheduled jobs at the top of the hour. While I don't like using more resources than necessary on principle, this can be a practical consequence of that.

Perhaps I'm missing something but to my knowledge is the policy has been (for a long while) and still is that open source projects (just public ones - source available I guess) don't pay for CI.

The CI is free, but it's not unlimited. (Huh, I thought I'd seen 6K minutes per month, but now I see 2K. That sucks.)

I have no issue with hosting long running processes on my personal infra (unless its chromium builds because I don't have an EPIC in a colocation) but not unless it can be avoided by using easily traceable publicly configured CI

Keep in mind the role that this portion of the automation plays in the big picture. All this does is start the GitHub workflow, at such a time when a new version is available. It's not providing any content that will affect the output of the build. So in the sense of tracing/auditing, there isn't really much there to look at. Now, obviously, the GitHub workflow proper that prepares the source is a different story. But this? This is equivalent to a team member monitoring the Debian notifications, and manually invoking the workflow when appropriate. What's actually going on behind the scenes doesn't matter, only the time that the call is received.

I can push the remote-Unix-side automation to a separate area of my branch, if you like. Note that all this would do is (1) receive e-mail from Debian's tracker, (2) run an infrequent cron job, (3) make intermittent HTTP requests to Debian's repos, and (4) make a REST call to GitHub with an auth token to kick off the workflow. Nothing compute- nor I/O-intensive is involved; rather, the critical bit is that this is running on a 24/7 system that can reliably receive mail (so no creaky old Linux box running in the corner of someone's flat).

networkException commented 4 months ago

The CI is free and minutes are not limited, all the pricing that pages lists is for private repos: "GitHub Actions usage is free for standard GitHub-hosted runners in public repositories, and for self-hosted runners"

Only the job (6h) / workflow (72h) limits apply, you can spawn as many workflows that run for just a minute as you want

iskunk commented 4 months ago

The docs don't do a good job of delineating between "free" and "unlimited," but I found a reference on the GitHub blog that clarifies "GitHub Actions are unlimited for public repositories and the Free plan also provides 2,000 minutes of Actions each month for use with your private projects." That explains how it's possible to build u-c on GitHub, even if it can't be done all in one go.

That makes the polling approach more feasible, but the other issues remain. Fundamentally, the problem is that a GitHub Action is just an awkward fit for this (monitoring). Ideally, you want a lightweight, always-on process, rather than a heavyweight, intermittent one.

I can have polling as a fallback measure, something like every 4-6 hours, so that updates will run even if external infrastructure goes offline. But in the normal course of things, it would be best for this repo's workflows to be started by external events, which GitHub doesn't natively support. Can we aim for that kind of hybrid approach, rather than polling alone?

networkException commented 4 months ago

Can we aim for that kind of hybrid approach, rather than polling alone?

sure, I don't want to block the automation on this discussion

SugaryHull commented 1 month ago

tracker.debian.org sends out a chromium source-upload notification to an e-mail address that we have previously subscribed on the site

@iskunk Would it be possible to subscribe to the debian-security mailing list as a means of checking if stable has a new release? I'm not sure how different that would be in terms of mail volume from the main Debian tracker.

iskunk commented 1 month ago

@iskunk Would it be possible to subscribe to the debian-security mailing list as a means of checking if stable has a new release? I'm not sure how different that would be in terms of mail volume from the main Debian tracker.

There's no need---the tracker subscription tells you about new uploads to stable-security as well as unstable. Here is the acceptance message for 127.0.6533.88-1~deb12u1 from Wednesday, for example.

If it's stable release builds you want, you'll get 'em. This automation will cover at least stable and unstable, as well as oldstable once chromium is being maintained for it again.

A word on progress: I have something working almost all the way, despite all the obstacles GitHub and OBS impose. The lack of a straightforward persistent disk volume on GitHub has been especially time-consuming, since Caches are a poor fit for the job and rewriting everything to use them added a lot of gratuitous complexity.

PF4Public commented 1 month ago

persistent disk volume on GitHub

Could you somehow use workflow artifacts for that?

iskunk commented 1 month ago

Could you somehow use workflow artifacts for that?

They're an even worse fit. Artifacts are great for when your workflow has some specifically associated output, like a compiled binary or log file. (I'm using them for the .dsc and .debian.tar.xz output files.) That's what the mechanism was made for. But here, what I need for this job is some state that persists across workflow runs, specifically

A small text file recording which package versions have been processed already;
A file-download cache, including the >900 MB orig source tarballs (since we really don't want to be downloading those more than once).

With artifacts or caches, the first thing you have to do is determine the specific artifact/cache that is the latest one (you don't care about older versions, after all). With caches, you have at least an easy way of doing this, using partial key matching; with artifacts, you have to run a query through the workflow runs. (Also, artifacts being directly associated with specific workflow runs is not helpful.)

Then you have to download the artifact/cache, read/write to your files as needed, then upload a new version of the cache or save a new artifact. Oh, and then you have to expire old versions of these artifacts/caches, lest you run over your project disk quota.

Compare all this song and dance to just having your project disk space available in some persistent volume mounted on the runner. You read, you write, you're done. The problem is fundamentally that artifacts and caches were designed for specific scenarios, and what I need to do is a different scenario that they don't address. And making my usage fit their model is incredibly awkward.

ungoogled-software / ungoogled-chromium-debian

Proposal for new u-c-debian automation #343