thomaseizinger commented 1 year ago

Describe your use case

We are using this in CI and it contributes about ~1-2min to the CI run of each crate despite:

cargo semver-checks being installed
cargo doc being cached

See https://github.com/thomaseizinger/rust-libp2p/actions/runs/3653859145/jobs/6173748557 for an example.

Is there a way for how this can be improved? When I run it locally, it is a lot faster (after the first invocation) so I am guessing there is a cache somewhere? Which files do I need to cache?

Describe the solution you'd like

cargo semver-checks to finish within seconds, not minutes.

Alternatives, if applicable

No response

Additional Context

No response

obi1kenobi commented 1 year ago

I'm not familiar with your setup for caching cargo doc — does it work for only the current state of the package, or does it also cache the cargo doc run for the baseline (crates.io) version as well? Based on the log, it seems like the actual checking of the rustdoc files finished in 0.044s, so the slow parts are building one or both of the rustdoc JSON files.

208 should merge very soon, and might make it easier to not have to cache the `cargo install` step.

Since the baseline version isn't changing all that often (especially not with every PR), it should be possible to cache the baseline rustdoc JSON file across runs. This is how the GitHub Action builds that file, and in principle it could be improved to cache the file as well. cargo-semver-checks can be directed to use a pre-generated baseline rustdoc JSON file with the --baseline-rustdoc <file> option.

After running cargo check --all-features or cargo test --all-features, building the rustdoc JSON for the current version should be quite fast since everything is already mostly checked and built. So running cargo-semver-checks in the same runner where the tests already ran would probably make it much faster.

In general, I believe your cargo-semver-checks setup is the most sophisticated one in existence (at least that I know of). If you're interested, I'd be thrilled to work with you on building all these capabilities into the cargo-semver-checks GitHub Action itself rather than each of us maintaining a separate workflow.

thomaseizinger commented 1 year ago

My workflow doesn't build it for the baseline version, no. I thought that one was downloaded from docs.rs or something?

Does it build the baseline version also locally?

I am happy to collaborate on an/the action. Fitting that one out with some pre-configured caching would be really nice actually :)

208 is nice to see!

obi1kenobi commented 1 year ago

docs.rs doesn't yet host the JSON, so for now the baseline is built locally (based on the registry, a gitrev, or a filesystem path) unless the baseline JSON is explicitly provided via the --baseline-rustdoc flag.

This is what the action does at the moment: https://github.com/obi1kenobi/cargo-semver-checks-action/blob/main/action.yml#L29

In a v2 of the action, we'd want to remove the version-selection logic from the action (since it incorrectly selects pre-releases, unlike the logic already present in cargo-semver-checks) and make cargo-semver-checks aware of the possibility of the baseline rustdoc JSON already being generated so it only rebuilds when necessary.

Perhaps something like a --baseline-cache <PATH> option, where cargo-semver-checks will move built baseline JSON files into that directory, or use existing files if they are there?

What do you think? Also, looping in @epage in case he has ideas.

thomaseizinger commented 1 year ago

make cargo-semver-checks aware of the possibility of the baseline rustdoc JSON already being generated so it only rebuilds when necessary.

I think this is the MVP. As long as it is somewhere in the target directory, CI caching should pick that up. Configuration options to control the path can come later once someone asks for it!

thomaseizinger commented 1 year ago

I just saw that there is already target/semver-checks but that directory is 21GB after a cargo clean and running cargo semver-checks check-release --workspace. We can't cache that :sweat_smile:

thomaseizinger commented 1 year ago

Perhaps something like a --baseline-cache <PATH> option, where cargo-semver-checks will move built baseline JSON files into that directory, or use existing files if they are there?

Just separating them from the current compilation output would already be good. That would allow us to clean that target directory of artifacts and cache it. These should never need to change so unless someone messes with the target directory, I don't think there should be any bugs.

obi1kenobi commented 1 year ago

I just saw that there is already target/semver-checks but that directory is 21GB after a cargo clean and running cargo semver-checks check-release --workspace. We can't cache that 😅

I'm a bit disk-space-limited on the machine I have with me right now, so if you don't mind, I'd love to ask you to check something in that target/semver-checks directory:

Ignore (or delete) the target/semver-checks/target directory entirely.
All the other directories in target/semver-checks should be prefixed with registry, right? Those are registry (crates.io) based baseline builds, which each contain both the usual compilation artifacts (debug directory) and the rustdoc JSON doc directory.
How big are all the JSON files from those doc directories? That's the only part we absolutely need, and we can avoid caching everything else.

The cache would be compressed with zstd in GitHub Actions, so it would probably be reduced to around one-third or one-fourth the raw JSON files' size.

Just separating them from the current compilation output would already be good. That would allow us to clean that target directory of artifacts and cache it.

We can put the cache somewhere like target/semver-checks/cache. How does that sound?

thomaseizinger commented 1 year ago

How big are all the JSON files from those doc directories? That's the only part we absolutely need, and we can avoid caching everything else.

> stat -c %s target/semver-checks/**/*.json | paste -sd+ | bc
110568535

That is a 110MB!

thomaseizinger commented 1 year ago

Just separating them from the current compilation output would already be good. That would allow us to clean that target directory of artifacts and cache it.

We can put the cache somewhere like target/semver-checks/cache. How does that sound?

Sounds good to me. I guess to make this really effective, cargo semver-checks would have to move the JSON files into that cache directory itself and clean the target directory afterwards otherwise we have a manual step again of cleaning those target directories.

obi1kenobi commented 1 year ago

That is a 110MB!

Whew, I'm glad it's that little :) It's also JSON so it should be highly compressible with zstd which GitHub Actions will use by default.

to make this really effective, cargo semver-checks would have to move the JSON files into that cache directory itself and clean the target directory afterwards

I think cargo-semver-checks should move the JSON files into the cache directory, but I think we should leave the cleaning to its GitHub Action I don't think that cargo-semver-checks itself should clean the target directory, since that might be undesirable for local use.

Let me know if you have any concerns!

thomaseizinger commented 1 year ago

That is a 110MB!

Whew, I'm glad it's that little :) It's also JSON so it should be highly compressible with zstd which GitHub Actions will use by default.

I'll test it locally!

to make this really effective, cargo semver-checks would have to move the JSON files into that cache directory itself and clean the target directory afterwards

I think cargo-semver-checks should move the JSON files into the cache directory, but I think we should leave the cleaning to its GitHub Action I don't think that cargo-semver-checks itself should clean the target directory, since that might be undesirable for local use.

Let me know if you have any concerns!

Hmm, I think that is inconsistent. We are creating those JSON files locally because we can't download from docs.rs.

The fact that we need to build them is an implementation detail IMO. Using as little disk space as possible is also desirable locally, especially when we are talking as much as 22GB for a single repository!

Alternatively, we can also say that the program is completely ignorant of where the data is stored and the action moves around the JSON files and passes them in as baseline etc

thomaseizinger commented 1 year ago

That is a 110MB!

Whew, I'm glad it's that little :) It's also JSON so it should be highly compressible with zstd which GitHub Actions will use by default.

I'll test it locally!

Default zstd yields an archive 13 MB in size :)

thomaseizinger commented 1 year ago

to make this really effective, cargo semver-checks would have to move the JSON files into that cache directory itself and clean the target directory afterwards

I think cargo-semver-checks should move the JSON files into the cache directory, but I think we should leave the cleaning to its GitHub Action I don't think that cargo-semver-checks itself should clean the target directory, since that might be undesirable for local use. Let me know if you have any concerns!

Hmm, I think that is inconsistent. We are creating those JSON files locally because we can't download from docs.rs.

The fact that we need to build them is an implementation detail IMO. Using as little disk space as possible is also desirable locally, especially when we are talking as much as 22GB for a single repository!

Alternatively, we can also say that the program is completely ignorant of where the data is stored and the action moves around the JSON files and passes them in as baseline etc

I am happy to have a go at implementing this if you agree that it should happen within cargo semver-checks. It currently contributes to about 20% of our CI time which I am keen to optimize :)

obi1kenobi commented 1 year ago

That'd be great! I think it's reasonable to handle caching here, and it's also fine to delete the baseline build data from the target dir if caching the output. I'd also be open to adding zstd compression for the cached baseline JSON file if you think that's worth it independently, though perhaps that should be a separate addition just to keep things simple.

How can I help? I'm happy to review draft PRs or whatever else you need.

thomaseizinger commented 1 year ago

I'd also be open to adding zstd compression for the cached baseline JSON file if you think that's worth it independently

I am leaning towards that being separate. Compared to other build output, the JSON files are small meaning there isn't much to be gained. The only environment that would benefit from that is local builds and I am not sure how frequent local use is compared to CI runs?

thomaseizinger commented 1 year ago

How can I help? I'm happy to review draft PRs or whatever else you need.

I'll open a draft PR once I have some questions and/or an initial design, thanks :)

thomaseizinger commented 1 year ago

Having seen the internals now, I have a question: Why are we building the docs for the current revision in a separate directory? Why not just reuse the default location? That would allow users to automatically benefit from caching. We already cache the debug build for our crates, having cargo semver-checks use the same location for the debug build of its docs would avoid unnecesary rebuilds, wouldn't it?

epage commented 1 year ago

It'd be something to try. Unsure how well it handles any overhead from the user doing different feature sets than us.

obi1kenobi commented 1 year ago

Here's a specific example scenario to illustrate the concern that @epage is pointing toward. In principle, cargo-semver-checks may need to change the RUSTFLAGS to resolve situations like this: https://github.com/obi1kenobi/cargo-semver-checks-action/issues/17

In that case, I believe that will both force a full rebuild and wipe the default location's cached data, which would probably make the user quite unhappy.

Not saying we couldn't work around this if needed. Just saying that applying some caution is justified :)

thomaseizinger commented 1 year ago

Right, that makes sense. I'll see if I can optimise that on my end by just adding another cache!

obi1kenobi commented 1 year ago

I spent today working on a new optimizations API in Trustfall, and while it still needs a bit of polish, I wanted to give you a sneak peek because it's relevant to the objective of making CI faster 😄

The numbers below are the result of implementing just one optimization using the new API. I've intentionally picked some gigantic crates where the perf impact is most obvious — smaller crates will benefit proportionally less, but still noticeably.

crate `aws-sdk-s3`
before:
PASS [  93.454s]       major        auto_trait_impl_removed

after:
PASS [   0.198s]       major        auto_trait_impl_removed

speedup: 472x

crate `aws-sdk-ec2`
before:
PASS [2421.900s]       major        auto_trait_impl_removed

after:
PASS [   1.254s]       major        auto_trait_impl_removed

speedup: 1931x

Needless to say, I'm excited 🤩

thomaseizinger commented 1 year ago

Damn, that is amazing! :)

Looking at our current CI, I just had another idea. At the moment, we are creating and parsing two rustdocs, only to then say 0 checks necessary because we are already doing a minor bump which allows breaking changes.

We could reverse that and avoid parsing the rustdoc altogether?

obi1kenobi commented 1 year ago

The version numbers are read from the rustdoc files, which are the ultimate source of truth since it's what Rust and cargo saw and generated. So I don't think that would work, and while I'm sure it could be "wedged in," it doesn't sound like it would be worth the trouble.

obi1kenobi commented 1 year ago

I added one more optimization just to play with the new API, and I'm pretty happy with both the ergonomics and the outcomes: on the biggest crate I could find, the most expensive lint took ~1.3s instead of 4887s before (~3759x speedup).

Here's a full perf report. I used the aws-sdk-ec2 crate (~200MB rustdoc JSON for each of baseline and current). The baseline and current rustdocs each take 30-35s to generate on my computer. The queries are all unchanged — the only changes are in trustfall_core (the new optimizations API draft) and in trustfall-rustdoc-adapter (adding indexes and applying them to queries).

Before (cargo-semver-checks v0.17): 18178s lint wall-clock time (i.e. over 5h)

``` Checking aws-sdk-ec2 v0.24.0 -> v0.24.0 (no change) Starting 40 checks, 0 unnecessary PASS [2421.900s] major auto_trait_impl_removed PASS [ 0.072s] major constructible_struct_adds_field PASS [ 681.349s] major constructible_struct_adds_private_field PASS [ 3.765s] major constructible_struct_changed_type PASS [1547.662s] major derive_trait_impl_removed PASS [ 53.474s] major enum_marked_non_exhaustive PASS [ 35.733s] major enum_missing PASS [ 0.054s] minor enum_must_use_added PASS [ 0.071s] major enum_repr_c_removed PASS [ 0.055s] major enum_repr_int_changed PASS [ 0.051s] major enum_repr_int_removed PASS [ 0.058s] major enum_struct_variant_field_added PASS [ 0.057s] major enum_struct_variant_field_missing PASS [ 0.050s] major enum_variant_added PASS [ 170.286s] major enum_variant_missing PASS [ 0.096s] major function_const_removed PASS [ 0.160s] major function_missing PASS [ 0.133s] minor function_must_use_added PASS [ 0.089s] major function_parameter_count_changed PASS [ 0.099s] major function_unsafe_added PASS [ 25.570s] major inherent_method_const_removed PASS [3255.975s] major inherent_method_missing PASS [ 0.478s] minor inherent_method_must_use_added PASS [3145.380s] major inherent_method_unsafe_added PASS [4887.539s] major method_parameter_count_changed PASS [ 514.508s] major sized_impl_removed PASS [ 157.129s] major struct_marked_non_exhaustive PASS [ 265.372s] major struct_missing PASS [ 0.067s] minor struct_must_use_added PASS [ 563.758s] major struct_pub_field_missing PASS [ 0.052s] major struct_repr_c_removed PASS [ 0.053s] major struct_repr_transparent_removed PASS [ 290.168s] major struct_with_pub_fields_changed_type PASS [ 0.046s] major trait_missing PASS [ 0.048s] minor trait_must_use_added PASS [ 0.048s] major trait_unsafe_added PASS [ 0.060s] major trait_unsafe_removed PASS [ 0.057s] major tuple_struct_to_plain_struct PASS [ 0.055s] major unit_struct_changed_kind PASS [ 156.533s] major variant_marked_non_exhaustive Completed [18178.109s] 40 checks; 40 passed, 0 unnecessary ```

Added "item name -> item id" index: 12.7s lint wall-clock time

``` Checking aws-sdk-ec2 v0.24.0 -> v0.24.0 (no change) Starting 40 checks, 0 unnecessary PASS [ 0.959s] major auto_trait_impl_removed PASS [ 0.082s] major constructible_struct_adds_field PASS [ 0.279s] major constructible_struct_adds_private_field PASS [ 0.084s] major constructible_struct_changed_type PASS [ 1.426s] major derive_trait_impl_removed PASS [ 0.101s] major enum_marked_non_exhaustive PASS [ 0.090s] major enum_missing PASS [ 0.067s] minor enum_must_use_added PASS [ 0.070s] major enum_repr_c_removed PASS [ 0.065s] major enum_repr_int_changed PASS [ 0.063s] major enum_repr_int_removed PASS [ 0.068s] major enum_struct_variant_field_added PASS [ 0.082s] major enum_struct_variant_field_missing PASS [ 0.068s] major enum_variant_added PASS [ 0.193s] major enum_variant_missing PASS [ 0.076s] major function_const_removed PASS [ 0.098s] major function_missing PASS [ 0.095s] minor function_must_use_added PASS [ 0.093s] major function_parameter_count_changed PASS [ 0.101s] major function_unsafe_added PASS [ 0.195s] major inherent_method_const_removed PASS [ 1.012s] major inherent_method_missing PASS [ 0.188s] minor inherent_method_must_use_added PASS [ 2.524s] major inherent_method_unsafe_added PASS [ 2.767s] major method_parameter_count_changed PASS [ 0.401s] major sized_impl_removed PASS [ 0.111s] major struct_marked_non_exhaustive PASS [ 0.161s] major struct_missing PASS [ 0.065s] minor struct_must_use_added PASS [ 0.192s] major struct_pub_field_missing PASS [ 0.056s] major struct_repr_c_removed PASS [ 0.058s] major struct_repr_transparent_removed PASS [ 0.151s] major struct_with_pub_fields_changed_type PASS [ 0.050s] major trait_missing PASS [ 0.051s] minor trait_must_use_added PASS [ 0.053s] major trait_unsafe_added PASS [ 0.054s] major trait_unsafe_removed PASS [ 0.057s] major tuple_struct_to_plain_struct PASS [ 0.052s] major unit_struct_changed_kind PASS [ 0.351s] major variant_marked_non_exhaustive Completed [ 12.708s] 40 checks; 40 passed, 0 unnecessary ```

Previous + "(item id, method name) -> (impl id, method id)" index: 9.1s lint wall-clock time

``` Checking aws-sdk-ec2 v0.24.0 -> v0.24.0 (no change) Starting 40 checks, 0 unnecessary PASS [ 1.123s] major auto_trait_impl_removed PASS [ 0.074s] major constructible_struct_adds_field PASS [ 0.209s] major constructible_struct_adds_private_field PASS [ 0.062s] major constructible_struct_changed_type PASS [ 1.047s] major derive_trait_impl_removed PASS [ 0.069s] major enum_marked_non_exhaustive PASS [ 0.065s] major enum_missing PASS [ 0.053s] minor enum_must_use_added PASS [ 0.051s] major enum_repr_c_removed PASS [ 0.053s] major enum_repr_int_changed PASS [ 0.053s] major enum_repr_int_removed PASS [ 0.054s] major enum_struct_variant_field_added PASS [ 0.054s] major enum_struct_variant_field_missing PASS [ 0.060s] major enum_variant_added PASS [ 0.192s] major enum_variant_missing PASS [ 0.073s] major function_const_removed PASS [ 0.109s] major function_missing PASS [ 0.111s] minor function_must_use_added PASS [ 0.106s] major function_parameter_count_changed PASS [ 0.100s] major function_unsafe_added PASS [ 0.173s] major inherent_method_const_removed PASS [ 0.763s] major inherent_method_missing PASS [ 0.165s] minor inherent_method_must_use_added PASS [ 1.080s] major inherent_method_unsafe_added PASS [ 1.323s] major method_parameter_count_changed PASS [ 0.392s] major sized_impl_removed PASS [ 0.099s] major struct_marked_non_exhaustive PASS [ 0.151s] major struct_missing PASS [ 0.066s] minor struct_must_use_added PASS [ 0.223s] major struct_pub_field_missing PASS [ 0.054s] major struct_repr_c_removed PASS [ 0.060s] major struct_repr_transparent_removed PASS [ 0.145s] major struct_with_pub_fields_changed_type PASS [ 0.047s] major trait_missing PASS [ 0.051s] minor trait_must_use_added PASS [ 0.053s] major trait_unsafe_added PASS [ 0.052s] major trait_unsafe_removed PASS [ 0.060s] major tuple_struct_to_plain_struct PASS [ 0.060s] major unit_struct_changed_kind PASS [ 0.364s] major variant_marked_non_exhaustive Completed [ 9.101s] 40 checks; 40 passed, 0 unnecessary ```

I need to clean up the code and add quite a few tests before this is ready for prime-time. But this already seems very encouraging, and I am sure that there's even more performance that we could squeeze out here if we chose to.

obi1kenobi commented 1 year ago

I did a slightly better job with the index construction, using HashMap instead and making them point to the item itself rather than to its Id (eliminating one more pointer hop), and this got down to a nice round number so I'm going leave it here.

8.0s lint wall-clock time, 2,272x faster overall

``` Checking aws-sdk-ec2 v0.24.0 -> v0.24.0 (no change) Starting 40 checks, 0 unnecessary PASS [ 0.812s] major auto_trait_impl_removed PASS [ 0.072s] major constructible_struct_adds_field PASS [ 0.207s] major constructible_struct_adds_private_field PASS [ 0.065s] major constructible_struct_changed_type PASS [ 1.018s] major derive_trait_impl_removed PASS [ 0.058s] major enum_marked_non_exhaustive PASS [ 0.062s] major enum_missing PASS [ 0.052s] minor enum_must_use_added PASS [ 0.053s] major enum_repr_c_removed PASS [ 0.053s] major enum_repr_int_changed PASS [ 0.053s] major enum_repr_int_removed PASS [ 0.061s] major enum_struct_variant_field_added PASS [ 0.069s] major enum_struct_variant_field_missing PASS [ 0.056s] major enum_variant_added PASS [ 0.159s] major enum_variant_missing PASS [ 0.073s] major function_const_removed PASS [ 0.104s] major function_missing PASS [ 0.098s] minor function_must_use_added PASS [ 0.100s] major function_parameter_count_changed PASS [ 0.125s] major function_unsafe_added PASS [ 0.171s] major inherent_method_const_removed PASS [ 0.627s] major inherent_method_missing PASS [ 0.159s] minor inherent_method_must_use_added PASS [ 0.829s] major inherent_method_unsafe_added PASS [ 1.086s] major method_parameter_count_changed PASS [ 0.370s] major sized_impl_removed PASS [ 0.084s] major struct_marked_non_exhaustive PASS [ 0.128s] major struct_missing PASS [ 0.072s] minor struct_must_use_added PASS [ 0.200s] major struct_pub_field_missing PASS [ 0.059s] major struct_repr_c_removed PASS [ 0.063s] major struct_repr_transparent_removed PASS [ 0.123s] major struct_with_pub_fields_changed_type PASS [ 0.051s] major trait_missing PASS [ 0.051s] minor trait_must_use_added PASS [ 0.050s] major trait_unsafe_added PASS [ 0.050s] major trait_unsafe_removed PASS [ 0.071s] major tuple_struct_to_plain_struct PASS [ 0.058s] major unit_struct_changed_kind PASS [ 0.350s] major variant_marked_non_exhaustive Completed [ 8.003s] 40 checks; 40 passed, 0 unnecessary ```

The slowest lint in the suite is now taking just a hair over one second to run, and is a nice round 4500x faster than before.

When we start running the lints in parallel with rayon, on most hardware it will take approximately the same time to parse (not generate! just parse!) the rustdoc JSON with serde as to run all the lints.

I'm pretty happy with that.

obi1kenobi commented 1 year ago

More details on how all of the above works here: https://predr.ag/blog/speeding-up-rust-semver-checking-by-over-2000x/

Now just a small matter of finishing the implementation (and testing!) of the new API so we can hit these numbers outside of a test setup.

obi1kenobi / cargo-semver-checks

Make CI runs faster #214

Describe your use case

Describe the solution you'd like

Alternatives, if applicable

Additional Context

208 should merge very soon, and might make it easier to not have to cache the `cargo install` step.

208 is nice to see!

obi1kenobi / cargo-semver-checks

Make CI runs faster #214

Describe your use case

Describe the solution you'd like

Alternatives, if applicable

Additional Context

208 should merge very soon, and might make it easier to not have to cache the cargo install step.

208 is nice to see!

208 should merge very soon, and might make it easier to not have to cache the `cargo install` step.