WebAssembly / spec

WebAssembly specification, reference interpreter, and test suite.
https://webassembly.github.io/spec/
Other
3.13k stars 445 forks source link

[build] Update makefiles and CI config to enable W3C autopublishing #1730

Closed sideshowbarker closed 7 months ago

sideshowbarker commented 7 months ago

Short answer:

Longer answer > To reduce turnaround time and save resources, wouldn't it be better to move the new build steps to the existing build targets, so that it reuses their work? So the problem there is: Bikeshed needs to be run twice on each of the specs anyway — at least if we want the https://webassembly.github.io/spec/ Bikeshed versions to retain the “Editor’s Draft” formatting they have now, rather than instead having the formatting of those change to being the same “Working Draft” formatting that the https://www.w3.org/TR versions have. If the editors and the groups chairs are were OK with instead changing the https://webassembly.github.io/spec/ versions to have the same “Working Draft” formatting that the https://www.w3.org/TR/ versions have, then we’d need to run Bikeshed only once on each spec. I’d personally be fine with the https://webassembly.github.io/spec/ versions having the “Working Draft” formatting — and as far as I’m aware, W3C policy doesn’t explicitly prohibit us from doing so. That said, I think that in general, W3C management would prefer that the “Working Draft” formatting only be used for the versions published at https://www.w3.org/TR/ URLs. But I think we could still try it if we wanted to, and see if anybody notices and asks us to stop doing it (in which case, we’d just flip it back). But otherwise, to retain the “Editor’s Draft” formatting of the https://webassembly.github.io/spec/ versions means that we have to first run `bikeshed` once as it’s already being run — with `Status: ED` set in the `.bs` sources for each spec — and then to get the “Working Draft” formatting for the https://www.w3.org/TR/ versions, we need to run `bikeshed` again, but with the `Status` property instead set to `WD`. > My only concern is that the new publish-to-w3c action redoes everything from scratch. Installing the prerequisite software and doing the Sphinx build already takes significant time on CI. As far as the time requirements, one thing to note is: the `publish-to-w3c-TR` CI job this PR adds is run in parallel with the existing CI jobs that do the builds and publishing for the existing https://webassembly.github.io/spec/ versions. And from watching all the jobs run for this PR branch, I can see that `publish-to-w3c-TR` job finishes are almost exactly the same time — roughly 6 minutes and 30 seconds — as the `build-core-spec`+`publish-spec` combination. In other words, the existing CI takes about 6.5 minutes to complete, and the patch in this PR doesn’t increase that. All that said, as far as resource usage, it’s true that the current patch in this PR as currently written does basically doubles the resource usage — and I guess there are few changes that could be made to decrease that. As far as exactly what changes could be made: The opportunities to make changes to the makefiles to reduce the resource usage of the build itself are very few — because: - to get W3C autopublishing working, we’ll currently need to run `bikeshed` twice for each spec, regardless - the `bikeshed` target in the `core/Makefile` is what consumes most of the time and resources, and we must run it twice However, I guess we could factor the Sphinx part of that `bikeshed` target into its own target — in which case, we could make the build run that part just once, only before the first time we run the Bikeshed part — and not re-run it a second time. That’d probably cut 30 seconds out of the 6 minutes and 30 seconds the `publish-to-w3c-TR` job takes. So I’ll try it. But the bigger savings would probably come from making changes to the `workflows/ci-spec.yml` file, to factor out the installing-the-prerequisite-software steps — `Checkout repo`, `Setup Node.js`, `Setup Bikeshed`, and `Setup Sphinx` —steps into their own `prep` job. And then we make the `build-core-spec`, `build-js-api-spec`, `build-web-api-spec`, and `publish-to-w3c-TR` jobs all require/depend on that `prep` job, so that it runs first, just once. So I’ll try that first — both because it’s quicker and easier to try then refactoring the `bikeshed` makefile target is, and also because it should give a more-significant reduction in the overall resource usage (though not in the overall build time).
sideshowbarker commented 7 months ago

But the bigger savings would probably come from making changes to the workflows/ci-spec.yml file, to factor out the installing-the-prerequisite-software steps — Checkout repo, Setup Node.js, Setup Bikeshed, and Setup Sphinx —steps into their own prep job. And then we make the build-core-spec, build-js-api-spec, build-web-api-spec, and publish-to-w3c-TR jobs all require/depend on that prep job, so that it runs first, just once.

OK, I tried that in https://github.com/WebAssembly/spec/pull/1730/commits/ddb1751eab662de0e555a9a97e3ead6b4f28cdaa but quickly discovered it doesn’t work — and won’t work. And for the record here, the reason is: each job runs on a different, new machine/runner. So there’s no way to preserve shared state/environment among jobs.

Apparently this is a fundamental characteristic of GitHub Actions that people generally know or are expected to know…

sideshowbarker commented 7 months ago

I guess we could factor the Sphinx part of that bikeshed target into its own target — in which case, we could make the build run that part just once, only before the first time we run the Bikeshed part — and not re-run it a second time.

So, given that for CI we can’t share any environment among jobs, that makefile change wouldn’t actually buy us anything useful for reducing CI resource usage. It’d instead just help someone locally running the W3C-echidna target. And in practice I’m the only person who’d ever being doing that. And so, seems to be no real value to doing that refactoring.

sideshowbarker commented 7 months ago

Yes, the biggest time & resource sink in CI is setting up the prerequisite software. But you can't share that across actions, you can only share artefacts via "upload".

Yeah, I’d vaguely recalled that being the case, but it didn’t sink back in until today when I tried the ci-spec.yml refactoring.

That's why I suggested just moving the new build steps into the existing build jobs and upload the results for the publication job, like with publish-spec.

But don't worry about it, I can have a stab at refactoring that later.

OK. When you try it, at https://github.com/w3c/echidna/wiki/How-to-use-Echidna#tar-file you can find the current documentation on how the W3C publishing HTTP API works. But essentially it’s just going to be this:

curl 'https://labs.w3.org/echidna/api/request' -F "tar=@WD.tar" -F "token=<token>" -F "decision=<URL>"

…where:

So, I assume what the refactoring you have in mind would need to amount to is: the CI config would be changed to construct the necessary tar files (WD.tar or whatever) from the same spec artifacts already being downloaded by the publish-spec job, and then the calls to the API endpoint get made with those tarred-from-downloaded-artifacts tar files.

FWIW, I'd actually be fine just having the WD version on the github.io page.

OK — I guess the first step for that would just be changing the existing .bs files to have Status: WD.

In fact, that's probably less confusing to users than seeing two differently looking documents on both sites and having to wonder how they relate to each other.

Yeah, that’s how it seems to me as well.

Btw, please never force-push to a branch under review

OK

rossberg commented 7 months ago

@sideshowbarker, is there anything missing for merging this?

sideshowbarker commented 7 months ago

@sideshowbarker, is there anything missing for merging this?

Nope, nothing missing — it’s ready to be merged

rossberg commented 7 months ago

@sideshowbarker, today CI failed on the publish-to-w3c target for an unrelated PR:

https://github.com/WebAssembly/spec/actions/runs/8133065435/job/22233472624?pr=1737

Do you have an idea what the problem could be?

sideshowbarker commented 7 months ago

https://github.com/WebAssembly/spec/actions/runs/8133065435/job/22233472624?pr=1737#step:7:3005 says:

Must provide W3C_ECHIDNA_TOKEN_CORE and DECISION_URL environment variables

…but at https://github.com/WebAssembly/spec/blob/main/.github/workflows/ci-spec.yml#L124 , the CI build sets W3C_ECHIDNA_TOKEN_CORE — and in the makefile here: https://github.com/WebAssembly/spec/blob/89f7ced5e149cbdc3a6f2d661b0e5f7da0b45925/document/core/Makefile#L12

DECISION_URL is hardcoded.

And this was working on the PR branch previously, so it still should be.

But I’ll investigate it further, and see what could be causing it to not find what it needs.

sideshowbarker commented 7 months ago

OK, first clue: Looking at https://pipelinesghubeus4.actions.githubusercontent.com/k4KMVVeRi9EEw2vuQiOaD2iDjWSjf8j5KDSnrjfdb62mXBfQf1/_apis/pipelines/1/runs/1014/signedlogcontent/2?urlExpires=2024-03-04T12%3A23%3A05.0263402Z&urlSigningMethod=HMACV1&urlSignature=UGaJewwi8Aq1yCtlo9EA1LTL5QbS246pUMl61Bjn2RI%3D raw log for the https://github.com/WebAssembly/spec/pull/1730 PR branch before we merged, I see:

2024-02-19T06:32:52.0833801Z env:
2024-02-19T06:32:52.0834120Z   STATUS: --md-status=WD
2024-02-19T06:32:52.0834795Z   W3C_ECHIDNA_TOKEN_CORE: ***
2024-02-19T06:32:52.0835312Z   W3C_ECHIDNA_TOKEN_JSAPI: ***
2024-02-19T06:32:52.0835787Z   W3C_ECHIDNA_TOKEN_WEBAPI: ***
2024-02-19T06:32:52.0836301Z   YARN_ENABLE_IMMUTABLE_INSTALLS: false

…that is, W3C_ECHIDNA_TOKEN_CORE: *** — three asterisks for the (secret) value.

But looking at the https://productionresultssa7.blob.core.windows.net/actions-results/30af3ce5-e11e-4563-9de1-5e3353d34795/workflow-job-run-fc640d00-b954-5c7a-fd14-93d46735bd61/logs/job/job-logs.txt?rsct=text%2Fplain&se=2024-03-04T12%3A35%3A20Z&sig=ukrXt2zBkNU6erLQB9OnzbzDiyLKl9l1APnloNKODeo%3D&sp=r&spr=https&sr=b&st=2024-03-04T12%3A25%3A15Z&sv=2021-12-02 raw log for the https://github.com/WebAssembly/spec/pull/1737 PR branch, I instead see:

2024-03-04T06:56:22.6245453Z env:
2024-03-04T06:56:22.6245790Z   STATUS: --md-status=WD
2024-03-04T06:56:22.6246276Z   W3C_ECHIDNA_TOKEN_CORE: 
2024-03-04T06:56:22.6246642Z   W3C_ECHIDNA_TOKEN_JSAPI: 
2024-03-04T06:56:22.6247056Z   W3C_ECHIDNA_TOKEN_WEBAPI: 
2024-03-04T06:56:22.6247539Z   YARN_ENABLE_IMMUTABLE_INSTALLS: false

…that is, no asterisks.

So, apparently GH Actions isn’t retrieving the W3C_ECHIDNA_TOKEN_CORE value as expected. But looking at the Repository secrets section of https://github.com/WebAssembly/spec/settings/secrets/actions, I can see that the W3C_ECHIDNA_TOKEN_CORE secret is still set. So it’s unclear why the CI build’s not finding it.

sideshowbarker commented 7 months ago

OK, looking at https://github.com/WebAssembly/spec/settings/secrets/actions, I see these statements:

Anyone with collaborator access to this repository can use these secrets and variables for actions. They are not passed to workflows that are triggered by a pull request from a fork.

So, because the https://github.com/WebAssembly/spec/pull/1737 PR is from a fork, I guess it’s expected behavior that GH Actions for that PR won’t have access to the repository secrets.

If so, I think there’s no way to expose repo secrets to GH Actions for PRs from forks — it seems to be prevented by design.

Given that, I guess it means that we could only auto-publish to W3C TR for PRs from branches in the repo itself — which requires the authors to have commit/push perms for this repo — and for merges to the main branch.

rossberg commented 7 months ago

Thanks for looking into it!

Now that you mention it, in fact PRs should probably never cause a push to the w3c repo, only actual pushes to the main branch should. That probably means that we'll have to split the publish-to-w3c job into a separate workflow file without the on: pull_request trigger. Would that fix the problem as well?

sideshowbarker commented 7 months ago

in fact PRs should probably never cause a push to the w3c repo, only actual pushes to the main branch should.

OK

That probably means that we'll have to split the publish-to-w3c job into a separate workflow file without the on: pull_request trigger. Would that fix the problem as well?

Yeah, it seems like it would

sideshowbarker commented 7 months ago

split the publish-to-w3c job into a separate workflow file without the on: pull_request trigger.

OK, opened https://github.com/WebAssembly/spec/pull/1738 for that