ci(performance): speed up CI - dynamic diff analysis

petermetz commented 1 year ago

Description

As a maintainer/contributor I want to have the CI finish in half an hour or less so that I'm not waiting 3 hours (or in some cases days) for the CI to finish running on my pull request.

Alternative solutions considered:

Auto-scaling self-hosted runners as a service (e.g. BuildJet) are expensive.
Hosting our own (static, non-autoscaled) self-hosted runners has its own issues: according to Ry, the runners get stuck or run out of memory (OOM) and need manual hand-holding all the time.
Writing and deploying our own auto-scaling self-hosted runner service sounds like a fun project, but it is definitely out of scope (too much work/time/risk).

Acceptance Criteria

Implement a custom script (./tools/...) that populates the GitHub CI workflow YAML context with data about the diff, which can then be leveraged via well-crafted if conditions within the YAML files such that:

If the diff only contains changes to a leaf package that no other package depends on, then only that package is tested by the CI; everything else can be skipped.
If the changes are documentation only, no test execution happens at all.
Supports the TypeScript/Node.js packages
Supports the Container image builds
Supports the newly added asset exchange tests
The speedup is such that a documentation change should have the CI finished in 5 minutes
The speedup is such that a code change in a leaf package should have the CI finished in about 15 minutes or less
If a top-level package (such as the common or core-api packages) is changed, then the CI will still run for a long time, because a change to such a package triggers the test execution for all other packages that depend on it, and this cascades all the way down to the leaf packages.
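
The selection rules above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the script from ./tools/...: the type names, the documentation patterns, the package directories and the "run everything when a non-documentation file belongs to no package" fallback are all hypothetical choices made for the example.

```typescript
// Hypothetical types; none of these names come from the actual code base.
// Maps a package name to the names of the packages it depends on.
type DependencyGraph = Record<string, string[]>;

interface Package {
  name: string;
  dir: string; // directory prefix of the package, e.g. "packages/common/"
}

// Assumption: what counts as "documentation only" is decided by file path.
const DOC_PATTERNS: RegExp[] = [/\.md$/i, /^docs\//];

function isDocFile(file: string): boolean {
  return DOC_PATTERNS.some((pattern) => pattern.test(file));
}

// Returns the sorted names of the packages whose tests should run for a diff.
function packagesToTest(
  changedFiles: string[],
  packages: Package[],
  deps: DependencyGraph,
): string[] {
  const affected = new Set<string>();

  for (const file of changedFiles) {
    if (isDocFile(file)) continue; // documentation never triggers tests
    const owner = packages.find((pkg) => file.startsWith(pkg.dir));
    if (owner === undefined) {
      // Assumed conservative fallback: a non-doc file outside every known
      // package (e.g. root-level config) triggers the tests of all packages.
      return packages.map((pkg) => pkg.name).sort();
    }
    affected.add(owner.name);
  }

  // Invert the graph: package -> packages that depend on it directly.
  const dependents = new Map<string, string[]>();
  for (const pkgName of Object.keys(deps)) {
    for (const dep of deps[pkgName]) {
      const list = dependents.get(dep) ?? [];
      list.push(pkgName);
      dependents.set(dep, list);
    }
  }

  // Breadth-first walk so that a change cascades down to the leaf packages.
  const queue = Array.from(affected);
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const dependent of dependents.get(current) ?? []) {
      if (!affected.has(dependent)) {
        affected.add(dependent);
        queue.push(dependent);
      }
    }
  }

  return Array.from(affected).sort();
}
```

With a graph in which core-api depends on common and two hypothetical connectors depend on core-api, a diff touching only one connector yields just that connector, a Markdown-only diff yields an empty list, and a diff touching common yields every package.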

jagpreetsinghsasan commented 1 year ago

Currently I am using the js-dependency-extractor to generate the dependency graph. I also had to hard-code the test-tooling -> ghcr job mapping, as there is no naming pattern on either side.

jagpreetsinghsasan commented 1 year ago

@petermetz's suggestion on this:

Yeah, what we need to do is attach some parseable metadata to the job definitions themselves to decouple the job name from the package name, so that they don't have to match. The test-tooling job can then be called anything, as long as the YAML object that defines it has some key like x-pkg-name that we parse to associate the job with the correct package no matter what. This will be needed when we further optimize the CI later on and I want to break some packages up into multiple jobs to parallelize the test execution. Good examples of this are the fabric and the corda connectors. Their test cases take almost an hour to run, so we have to have multiple jobs for them to make the test cases run much faster, but then we won't be able to use the package name as the job name for these jobs because each job name has to be unique.
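
As a rough sketch of that idea, the function below groups job IDs by an x-pkg-name key read from an already parsed workflow object. The Workflow and JobDefinition shapes and the function name are hypothetical, the YAML parsing step is left out, and whether GitHub's workflow validation tolerates such a custom key on a job is not something this sketch verifies.

```typescript
// Hypothetical shape of a workflow file after it was parsed from YAML.
interface JobDefinition {
  "x-pkg-name"?: string; // assumed metadata key, as suggested above
  [key: string]: unknown; // everything else in the job definition
}

interface Workflow {
  jobs: Record<string, JobDefinition>;
}

// Groups job IDs by the package they declare, so that job names and
// package names no longer have to match and one package can own many jobs.
function jobsByPackage(workflow: Workflow): Map<string, string[]> {
  const result = new Map<string, string[]>();
  for (const jobId of Object.keys(workflow.jobs)) {
    const pkgName = workflow.jobs[jobId]["x-pkg-name"];
    if (typeof pkgName !== "string") continue; // job has no package metadata
    const jobIds = result.get(pkgName) ?? [];
    jobIds.push(jobId);
    result.set(pkgName, jobIds);
  }
  return result;
}
```

Combined with a list of affected packages, the resulting map would tell the CI script which jobs to enable, including several uniquely named jobs that all belong to the same package.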

petermetz commented 10 months ago

Note to self: There's also this released by GitHub in the meantime(?) => https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#onpushpull_requestpull_request_targetpathspaths-ignore
