Support flychecking only the current crate

P1n3appl3 commented 2 years ago

The issue with checking all crates

Currently when rust-analyzer runs a flycheck it always runs on the entire workspace. In large workspaces (or in my case a large rust-project.json project) running a check build on many crates can significantly increase the time it takes to get results for the current crate. This is especially an issue in build systems that don't support cargo's functionality of streaming diagnostics over stdout as it goes, so if 1 crate in the workspace takes on the order of minutes to run a check build and the rest take milliseconds, you end up having to wait minutes every time a transitive dependency of that long-check-time crate changes to get diagnostics in your editor. Even if you do have streaming diagnostics, not knowing which crate has been modified means the build system can't schedule the check builds to prioritize the one you're actually editing, which causes the same problem of waiting for lots of other crates to check and harming your time-to-first-diagnostic.

Potential solution

As I understand it rust-analyzer has the required info (the path of the modified file) to only run a check build on the single crate that changed. In the case of a cargo command it could do this automatically by mapping the changed file to the changed crate and passing -p crate_name rather than --workspace to the flycheck subcommand invocation. For rust-project.json based build systems, if checkOnSave.overrideCommand were to expose some sort of variable expansions like $crate_name, $source_root, and/or $changed_file then it would be possible for GN/Bazel/Buck/etc. to use that info to only perform a check build on the modified crate.

Example: fuchsia's codebase

In fuchsia's codebase (where we use rust-project.json with GN) we already have support for this "map $changed_file to the changed target and only run a check build on that crate" functionality, but it can't be used from the editor because there's no way for rust-analyzer to pass the path of the modified file through to our check build wrapper script. Passing that info along makes the difference between "check on save" being viable or not for our developers due to the latency of checking potentially thousands of crates (including outliers that take minutes to check) vs just one.

Config/UI concerns

I realize that the vast majority of cargo workspaces aren't large enough that this is an issue, and it's possible that people have come to expect seeing errors in downstream crates from the same workspace when editing a dependency, so I think it'd make sense to still default to the "check all crates" behavior. For huge workspaces and monorepos like ours however, getting diagnostics on save just for the current crate, and having the ability to manually trigger workspace-wide checks in the editor (#12870) would be a preferable workflow.

The least intrusive way I can think of to add the functionality would be to introduce a boolean flag like checkOnSave.onlyCurrentCrate which makes cargo based flychecks automatically use -p crate_name instead of --workspace, and enables the expansion of those variables for checkOnSave.overrideCommand. Alternatively, if we're pretty confident that noone's overrideCommand contains the strings that we choose for variable expansion, we could just enable the expansion by default in that case.

I'd love some guidance on whether this is actually a good idea or if there are issues with trying to unambiguously map source files to crate roots/names, or other issues with how LSP diagnostics requests work. Also if there are complications with getting cargo commands to work with this functionality I'd be fine putting off that work and just implementing support for rust-project.json + custom flycheck commands, both because that's what we'd need for fuchsia and because rust-project.json based projects tend to be more monorepo-y so they'd care about this feature more.

I also haven't considered the non-flycheck diagnostics that rust-analyzer provides itself. Do those checks also currently run on the entire workspace (and if so is this affected by cachePriming.enable)? Would it make sense to tie the two such that checkOnSave.onlyCurrentCrate applies to both the flycheck and native rust-analyzer diagnostics?

Veykril commented 2 years ago

Having this sounds reasonable, I assume a config setting makes the most sense here yes. overrideCommand currently completely overrides things though, effectively ignoring all the other settings. We might want to look into being able to combine the various flags with the overrideCommand somehow, though that feels like a separate task.

That said, this shouldn't be too difficult to implement. We can easily map a file to the crates it belongs to, note the fact that a file can belong to multiple crates, not just one.

The native rust-analyzer diagnostics only run in the opened files I believe (or rather for those files that the client asks diagnostics for I think).

On another note, the main problem with only checking the one crate is that crates that depend on that crate won't get checked and might contain diagnostics, maybe it also makes sense to add another option (or combine them really) to check a crate and all its dependents.

jhgg commented 2 years ago

On another note, the main problem with only checking the one crate is that crates that depend on that crate won't get checked and might contain diagnostics, maybe it also makes sense to add another option (or combine them really) to check a crate and all its dependents.

I wonder if it's possible to split this into 2 cargo check invocations. The first one checking current crate for more responsive diagnostics, and once that completes doing a cargo check for dependent crates.

jhgg commented 2 years ago

Would be good to get some concrete guidance on this. I might want to try to potentially tackle this over the weekend. At work, we have a rather large workspace that's growing by the day, and this would significantly improve developer experience in terms of being able to save + see errors as fast as possible.

I think my plan might be to split the check into 2 phases:

1) Any crates that the file belongs to directly will be invoked w/ cargo check, diagnostics will be published immediately to the user after this phase completes. 2) If phase 1 is not successful, phase 2 is not initiated, as we cannot check dependencies when the dependent crates are broken. But, if phase 1 does succeed, we will re-run cargo check with the list of dependent crates to ensure they did not break. Diagnostics would then be updated to provide more errors if they exist.

There are however some hairy bits like, what if 2 files from 2 different crates change? For example, if I was to run a "rename function" refactor that might touch many crates, then hit "save all files" in the editor. I am unsure what to do about this... we could potentially track "dirty" files between cargo checks, and even involve a small delay incase many saves are coming in at once, in order to expand the crates checked in phases 1 & 2. Additionally, we could track "dirty" crates between when the check started, and when it finished, to see if we need to re-check because a save happening during phase 1 or 2 may have caused the result to be invalid.

Veykril commented 2 years ago

https://github.com/rust-lang/rust-analyzer/pull/12808 might be helpful, since that changed flychecking all workspaces to flychecking workspace whose files have changed.

I'd say if more than one crate (for a given workspace) has changed we should just fallback to checking that entire workspace. Doing more complex stuff there doesn't seem to be too beneficial to me.

Note that this feature will probably require a few more invasive refactors than #12808, since flycheck currently assumes per workspace checking to occur only.

jhgg commented 2 years ago

I'd say if more than one crate (for a given workspace) has changed we should just fallback to checking that entire workspace. Doing more complex stuff there doesn't seem to be too beneficial to me.

Yeah that is fair. Falling back to the more broad and slow for the rare case makes sense.

P1n3appl3 commented 2 years ago

I've begun to work on implementation here, and put up a WIP PR with the initial changes: #12994

I'm working on the rust-project.json path (custom flycheck overrideCommand) first because that's what I use, and in that case I think it'd make sense for the "single crate flycheck" command to be an entirely different config setting. We could leave it as one command and expand variables to some special value for the full worksapce checks, but that seems pretty unwieldy so I'd only go with that option if adding the extra config is deemed unacceptable. For cargo it's probably possible to use one user provided cargo command + extra args and fill in the necessary -p <name> or --workspace internally.

@Veykril you brought up that we could potentially check the downstream deps from the changed file as well, and i think that's a good idea for cargo, but i'm less sure of whether it's a good idea for rust-project.json. For that to work you'd want the overrideCommand to support some sort of variable-length argument expansion like ./my-check-script $list-of-crates, and it may make more sense to leave deriving that list up to the specific build system that's being used to run the checkbuild anyways.

@jhgg if you were planning on implementing this as well feel free to email me to coordinate/pair so we don't duplicate effort. You seem interested in the cargo path so I'd be happy to just get the rust-project.json path working and let you handle cargo.

jhgg commented 2 years ago

@P1n3appl3 i was going to but I didn't start much yet! Glad to see your progress on this! I'll leave it to you!

alper commented 1 year ago

Is this why my code is blocked 10 seconds every time I save something? For me this is a recent development. I'm pretty sure it wasn't always so.

bjorn3 commented 1 year ago

This happens asynchronously. Unless your system is overloaded, rust-analyzer running cargo check in the background will never block anything in the editor.

alper commented 1 year ago

Every other time when I try to save Code pops this dialog and blocks saving until the check is done: CleanShot 2022-11-20 at 22 08 05

cargo fmt takes less than a second for me.

jhgg commented 1 year ago

@rustbot claim

jhgg commented 1 year ago

I'm going to take a look at this again. I am going to experiment with how to get the "if a crate has no dependencies, only cargo check a single crate" approach.

However, I do think that a more elaborate approach may be warranted. Our workspace at work has grown quite significantly, and being able to run minimal cargo-checks would be a boon in developer productivity. Essentially, it can take a significant amount of time (10-15 seconds) for cargo check to decide to do nothing in our workspace, which means in some cases, flycheck can take quite a while to show diagnostics.

I think that a viable approach to improving responsiveness here would be to use rust-analyzer's knowledge of the crate graph to run cargo check -p [crates to re-check] which, when run manually our workspace finishes significantly faster (minimally <2 seconds).

Veykril commented 1 year ago

We do have a way to fetch reverse dependencies from the crate graph rather easily already. The main issue here is restructuring our flycheck infra I believe. Right now we spawn a single worker per workspace, and that worker assumes to just run for the given workspace only. So we probably would need to introduce a work queue instead here.

jhgg commented 1 year ago

Yeah I was thinking up the design of that, where basically the fly-check worker would receive lists of crates it needs to re-check, and it would be able to essentially converge on checking all those crates.

Veykril commented 1 year ago

That sound roughly like what I would think of (though it might be nice to still support workspace level checking via cargo as well) . Another issue to be aware of then, https://github.com/rust-lang/rust-analyzer/issues/8631 which might impact that design

davidbarsky commented 1 year ago

Hey @jhgg! I've hacked something together yesterday on a whim and only realized today that that you claimed this issue. I'm not sure how you'd like to tackle this, but I'd be happy to collaborate and/or keep my changes private—I don't want clobber your work unfairly. I've basically implemented cargo check -p [list-of-rev-deps-from-changed-crate] by adding another variant to FlycheckConfig that accepts a list of crates (this is where it differs from https://github.com/rust-lang/rust-analyzer/pull/12808). I think if I flesh out the approach, it's better than a single global Flycheck command and I'd be happy to see it through.

This approach works really well on when I tested it against Buck2 and rust-analyzer—the time-to-diagnostics is really fast—but the main issue I encountered was that workspace diagnostics for all crates would be replaced by the current crate's diagnostics as I've moved around the workspace. I think preserving those diagnostics is probably easy, but I'm a little worried about showing expired diagnostics in the editor (but I suppose it's probably going to be rare, on second thought).

Anyways, Jake—let me know if you'd like to collaborate on this.

jhgg commented 1 year ago

Hey again David! In my adventures in resolving this issue I found a limitation in cargo, namely that, approaching it with the strategy of "cargo check -p [list of packages]" would cause spurious re-compilation of dependencies, given the list of feature flags for dependencies can change depending on which crates are provided to cargo.

This is however, probably not a problem with buck2! If you'd like, please take over this task. I'm not currently working on a solution here.

pacak commented 1 year ago

This is however, probably not a problem with buck2!

This is not a problem with cargo too if the features are unified and there are several tools for unification...

jhgg commented 1 year ago

This is however, probably not a problem with buck2!

This is not a problem with cargo too if the features are unified and there are several tools for unification...

Care to elaborate?

pacak commented 1 year ago

Care to elaborate?

Yea, it's not a complete solution, just a bunch of ideas.

Recompilation happens when features are not unified across the workspace, it is possible to check if they are unified (a single request to cargo-metadata plus some graph traversing) and there are tools that modify workspace Cargo.toml files so that features are unified with minimal changes other than that - I'm using a tool I wrote cargo-hackerman, but there are other tools like that.

Now, building in unification into rust-analyzer is probably outside of its scope, building in a check if they are unified or not might be okay, not sure.

If a workspace is big enough that flychecking all the things becomes too slow - feature unification can bring some benefits as well.

A potential solution might be to make current crate flychecking an opt-in thing, building in a unification check and refusing to do a single crate flycheck if features are not unified pointing to issues and/or documentation to how to fix it. There might be some corner cases I'm missing.

https://crates.io/crates/cargo-hackerman

P1n3appl3 commented 6 months ago

Now that both #15476 and #16510 are merged I think this can be closed.

If you're using cargo, you can set rust-analyzer.check.workspace to false.

For non-cargo setups you can use $saved_file as part of your rust-analyzer.check.overrideCommand and use that info to only check the crate corresponding to the saved file. This might change slightly down the road when #15892 is addressed, but any solution should respect this use case.

lnicola commented 6 months ago

For non-cargo setups you can use $saved_file as part of your rust-analyzer.check.overrideCommand and use that info to only check the crate corresponding to the saved file.

I don't think we have that interpolation, though.

lnicola commented 6 months ago

Oh, sorry, I missed #15476.

rust-lang / rust-analyzer