hyperledger / besu

An enterprise-grade Java-based, Apache 2.0 licensed Ethereum client https://wiki.hyperledger.org/display/besu
https://www.hyperledger.org/projects/besu
Apache License 2.0

CICD Improvements #5791

Open jflo opened 1 year ago

jflo commented 1 year ago

In light of the errors in the build process with the 23.7.1 release, I would like to revisit our CICD process and propose some improvements.

If these changes are adopted, we should also review the default PR comment included for new PRs, so that it guides users toward crafting a good PR that is unlikely to fail these later checks.

garyschulte commented 1 year ago

Do we need CI to produce SNAPSHOT versions of jar artifacts? As far as I know these are not re-used elsewhere in build process, and are regularly overwritten by whatever PR was most recently built. Is there enough value here to rebuild and republish these to jfrog as frequently as we are? I suspect not. Same question with regard to docker images/manifests.

As recently as today, Linea has used snapshot builds, both from jfrog maven and from docker, to end-run around release issues and/or delays. I think there is value in continuing to build and publish those images, even if they are volatile.

jflo commented 1 year ago

As recently as today, Linea has used snapshot builds both from jfrog maven and from docker to end-run around release issues and/or delays. I think there is value in continuing to build and publish those images, even if they are volatile

Would a nightly build have been sufficient, or did they need more recent changes than that?

jflo commented 1 year ago

Would love to hear thoughts from anyone who has developed an opinion on the new(ish) Merge Queue mechanic in GitHub, and how they see it possibly fitting into a new CICD/PR review flow.

siladu commented 1 year ago

Draft PR workflow

I like the idea of a draft PR workflow, but I'd want the option to run the tests before converting to ready for review. I'd say skipping tests is the exceptional case, though: I often use drafts to run the ATs as a convenience, so a manual way to skip tests might be better.

I'm not a fan of using a comment for this; I'd prefer to handle it via GitHub if possible.
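One GitHub-native way to get a manual skip is a PR label checked in a job-level `if:` condition. This is only a rough sketch: the workflow name, the job name, the `skip-tests` label, and the test command are all hypothetical and not taken from the Besu repository.

```yaml
# Hypothetical workflow; every name here is an assumption for illustration.
name: pr-acceptance-tests
on:
  pull_request:
    # labeled/unlabeled are included so adding or removing the label
    # re-evaluates the condition below.
    types: [opened, synchronize, reopened, ready_for_review, labeled, unlabeled]

jobs:
  acceptance-tests:
    # Tests run by default, including on draft PRs; they are skipped only
    # when the PR carries the (hypothetical) skip-tests label.
    if: ${{ !contains(github.event.pull_request.labels.*.name, 'skip-tests') }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "run the acceptance tests here"   # placeholder step
```

With this shape, running the tests stays the default and skipping them is an explicit, visible action on the PR rather than a comment.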

Would a nightly build have been sufficient, or did they need more recent changes than that?

I think a nightly publish would be sufficient. There's definitely value in having our nightly canaries running with the latest changes. When testing PRs on infra, we also need a way to get that code onto the infra. Maybe a manual publish step from the PR could work. Failing that, there's always the option of using git/gradle directly on the box, but doing it via CI would be nice :)
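As a rough illustration, a nightly publish and a manual publish can share one GitHub Actions workflow by combining the `schedule` and `workflow_dispatch` triggers. The workflow name, the cron time, and the publish step below are assumptions, not the project's actual configuration.

```yaml
# Hypothetical workflow; name, schedule, and publish step are assumptions.
name: snapshot-publish
on:
  schedule:
    - cron: "0 2 * * *"    # once a day at 02:00 UTC (time chosen arbitrarily)
  workflow_dispatch:        # manual run from the Actions tab, on a chosen branch

jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "build and publish snapshot artifacts here"   # placeholder step
```

Scheduled runs use the default branch, while a manual run can target another branch in the repository, which would cover the case of getting a specific PR's code onto infra.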

Would love to hear thoughts from anyone that has developed an opinion on the new(ish) Merge Queue mechanic in github, and how they see it possibly fitting into a new CICD/PR review flow.

Re: Merge Queue, there were a couple of missing features that were essentially blockers for us, IIRC, e.g. not being able to edit the description. It doesn't look like that has been added yet, but it might be worth another look since it's GA now: https://github.com/orgs/community/discussions/46757

I haven't thought about it much, but how do you foresee the Queue helping with the CI problems you've mentioned?

Note, similar discussions are being had here: https://wiki.hyperledger.org/display/BESU/Proposal%3A+Quarterly+releases+from+main+by+default?focusedCommentId=98731846#comment-98731846

macfarla commented 1 year ago

Merge queue - I really like the idea in theory - there are times when it would be really handy - but in practice, yes, it was still a bit green when we tried it.

I have seen with GHA that when jobs queue up, there can be multiple duplicate tasks for the same PR (e.g. before and after a merge-from-main update), but I'm assuming GHA can be configured so that newer job requests cancel older ones.
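GitHub Actions does support this through the `concurrency` key with `cancel-in-progress`. A minimal sketch follows, assuming a hypothetical PR workflow; the workflow name, job name, and build command are illustrative only.

```yaml
# Hypothetical workflow; names and the build command are assumptions.
name: pr-checks
on:
  pull_request:

# Runs are grouped per workflow and per PR ref. When a newer run starts in
# the same group, any run still in progress in that group is cancelled.
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./gradlew build   # assumed build command
```

Because the group includes the ref, pushes to one PR only cancel older runs of that same PR and leave other PRs' runs alone.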

macfarla commented 1 year ago

Do we need CI to produce SNAPSHOT versions of jar artifacts? As far as I know these are not re-used elsewhere in build process, and are regularly overwritten by whatever PR was most recently built. Is there enough value here to rebuild and republish these to jfrog as frequently as we are? I suspect not. Same question with regard to docker images/manifests.

As recently as today, Linea has used snapshot builds both from jfrog maven and from docker to end-run around release issues and/or delays. I think there is value in continuing to build and publish those images, even if they are volatile

We also use these builds to spin up on-demand nodes for performance testing etc. on specific PRs, but that's not needed for every single PR.

macfarla commented 1 year ago

At a higher level, what's the main driver for these improvements? E.g. do we want to optimise for developer/reviewer time, CI spend, or CI time? Something else, or some combination?