Open overlookmotel opened 7 months ago
Hey @overlookmotel, thanks for the kind words!
To reduce the build time for everyone, we will definitely look at what @Boshen did in https://github.com/oxc-project/oxc/pull/2343. This seems like something that could easily be implemented directly in this repo, so that everyone gets lower build time.
As far as sharding the benchmarks across multiple CI jobs, this is something that we will have to look into. At the moment CodSpeed only supports a single upload per run, so we would need to add some kind of sharded uploads per run or find a way to generate reports in multiple CI workers, then in a final job aggregate the reports and make a single upload.
The former solution would require a lot of rework on our app and take a lot of time. The later might be more hackable, since the changes would only need to be done on our open-source https://github.com/CodSpeedHQ/runner and https://github.com/CodSpeedHQ/action repositories
We've implemented sharding via a rather hacky method on https://github.com/oxc-project/oxc/pull/2751.
Now that can see it works, I'd like to "do it properly" and submit PRs against https://github.com/CodSpeedHQ/runner and https://github.com/CodSpeedHQ/action so everyone can benefit.
A few questions:
.out
filesThe .out
files are named <pid>.out
. Do the file names matter at your end?
Because each shard runs on a different machine, it's possible for 2 jobs to have same PID, and therefore to end up with 2 x 1234.pid
files. I had to add some logic for spotting duplicates and renaming them:
But maybe the filenames have no semantic meaning to your back end, so could more easily get unique filenames by prepending a shard ID on the start of each e.g. shard1-1234.out
, shard2-1234.out
?
Are the .log
files which are included in the profile archive required / useful to you? If they are, they'd need to be merged somehow, or included as runner_shard1.log
, runner_shard2.log
etc. Would your back end accept that?
Presumably the metadata JSON which is sent to CodSpeed's API can be generated on any job. For Github Actions at least, it looks like it only contains info which relates to the PR/branch, not the GH Actions job. Is that correct?
Implementation I had in mind:
Runner:
profile-directory
CLI switch to specify where the profile data gets written to. The action can read the .out
files from this directory.upload-only
CLI switch to skip running anything and just upload profile data from profile-directory
.Action:
shard
which receives a unique ID for a shard. If provided, the action runs codspeed-runner --profile-directory <whatever> --no-upload
and then saves content of the profile directory as an artefact.shard-finish
which downloads artefacts, combines them, and runs codspeed-runner --profile-directory <whatever> --upload-only
Does this sound OK to you?
Hey @overlookmotel, sorry for the delay in my response.
We talked about it internally and we want to directly support multiple uploads in our backend. This will mask a lot of the sharding logic in the action and the runner, and allow for easier integration with other providers.
Thanks a lot for the work you already did on this 🙏 and I am sorry that we do not continue down the single upload way. We are using your work as inspiration for the new specs.
I will update you when we have them, and if you are still willing, it will be a pleasure to have you contribute to the runner/action!
OK cool. That does sound like an ideal solution.
If I can add a couple of suggestions:
Some benchmarks (shards) will may take much less time to run than others (that's certainly the case with Oxc). It'd be ideal if results from completed shards were visible on CodSpeed as soon as they're uploaded, and the rest show as "Running..." - as opposed to have to wait for all the shards to finish running before you can see the results for any of them.
Oxc has multiple crates. Currently the benchmarks for crate X run even if the only changes in the PR are to crate Y (and X doesn't depend on Y). This is wasteful of CI resources, and I'd like to run only the benchmarks for the crates which could be affected by the code changes in the PR.
However, if you just prevent some benchmarks from running, then they show on Codspeed as "Missing" and it makes everything look red and bad!
To work around this, I have hacked together something to cache benchmarks results from a previous run and re-upload them (without actually running the benchmark again):
This isn't the right way to do it of course, but it'd be nice if CodSpeed could somehow handle this scenario where not all benchmarks run, but that is intentional rather than a problem.
1. Results on completion
Some benchmarks (shards) will may take much less time to run than others (that's certainly the case with Oxc). It'd be ideal if results from completed shards were visible on CodSpeed as soon as they're uploaded, and the rest show as "Running..." - as opposed to have to wait for all the shards to finish running before you can see the results for any of them.
Definitely, we plan to do incremented report updates. This is one of the upsides of allowing multiple uploads for a single run.
2. Handle missing benchmarks
Oxc has multiple crates. Currently the benchmarks for crate X run even if the only changes in the PR are to crate Y (and X doesn't depend on Y). This is wasteful of CI resources, and I'd like to run only the benchmarks for the crates which could be affected by the code changes in the PR.
We have thought about it in the past, but we could not find a way to implement this at that moment. How can we know if the non-uploaded benchmarks are dropped or deliberately not run, in a similar use case as you outlined? Handling missing benchmarks will require us to find a way to differentiate between the two cases.
How can we know if the non-uploaded benchmarks are dropped or deliberately not run, in a similar use case as you outlined?
Yes, that's true - the ambiguity is a problem. Maybe if the action allowed you to input that information. e.g. skipped: ["bench1", "bench2"]
?
I guess you will have this problem in some form regardless of my request. How can CodSpeed backend know when the run is done and it's time to generate the report comment? e.g.:
I suspect one of these will be required:
Either way, that'd be a good place to add a message "benchX is skipped, not dropped".
Hey just wanted to check if there's any updates to supporting sharded benchmarks? We just integrated CodSpeed into Daft: https://github.com/Eventual-Inc/Daft/pull/2696 and I think it'll be really useful!
Thanks for the interest @colin-ho, glad to have Daft using CodSpeed!
Sharding is still on our roadmap, we will update this issue when it is relevant.
OXC Javascript compiler is using CodSpeed for measuring performance. For a performance-focused project like OXC, CodSpeed is absolutely amazing.
However, running the benchmarks is currently by far the project's slowest CI job.
Would it be feasible to enable sharding the benchmarks across multiple CI jobs, to shorten both the build and execution time of each, and then a final job to pull the results together and upload to CodSpeed?