astral-sh / ruff

An extremely fast Python linter and code formatter, written in Rust.
https://docs.astral.sh/ruff
MIT License
30.97k stars 1.03k forks source link

[red-knot] ensure that type inference on a high-durability file is also high durability #13169

Open carljm opened 2 weeks ago

carljm commented 2 weeks ago

If we can ensure this is the case, it should help our incremental inference perform better.

One issue here is that low-durability files (in your workspace) can shadow high-durability files (in the stdlib), and in principle this could impact the type inference in the stdlib. This means that any resolve_module is (I think) currently a low-durability query, because it will always access some low-durability import search paths.

If it helps performance enough, we could decide that we never want type inference within the stdlib to respect shadowed stdlib modules, so that we can keep type inference of the stdlib all high-durability, and not need to revalidate it in depth on incremental changes to user code.

This maybe should go hand-in-hand with always warning or erroring if a user module does shadow a stdlib module.

MichaReiser commented 2 weeks ago

So, the issue here is that the high durability of stdlib files doesn't yield its full potential because the type inference result of any module with imports depends on resolve_module which always has durability low (because we first start resolving local paths)?

One alternative is to give File::status a higher durability. That's the only field resolve_module depends on. Another alternative is to move the module resolver out of salsa and model Modules as inputs. But that has obvious downsides.

carljm commented 2 weeks ago

So, the issue here is that the high durability of stdlib files doesn't yield its full potential because the type inference result of any module with imports depends on resolve_module which always has durability low (because we first start resolving local paths)?

Yes, that's right.

One alternative is to give File::status a higher durability. That's the only field resolve_module depends on.

Wouldn't that mean that adding or removing a file in your workspace root would require re-checking all high-durability queries? I guess maybe that could be fine, assuming adding or removing files doesn't happen that often.

The solution I was proposing was that we would instead have imports from the stdlib not even consider search paths other than the stdlib. Which is technically wrong (since shadowing is possible), but practically right if we consider shadowing the stdlib itself to be an error (which I think pyright at least does.)

MichaReiser commented 1 week ago

Wouldn't that mean that adding or removing a file in your workspace root would require re-checking all high-durability queries? I guess maybe that could be fine, assuming adding or removing files doesn't happen that often.

Yes, it doesn't solve the problem entirely, but it might be a desired change anyway because it avoids a deep-verify for all third-party code whenever a first-party module is changed.

The solution I was proposing was that we would instead have imports from the stdlib not even consider search paths other than the stdlib. Which is technically wrong (since shadowing is possible), but practically right if we consider shadowing the stdlib itself to be an error (which I think pyright at least does.)

That makes sense. I think we may want to do both where what I suggested reduces deep-verification for third-party code and your suggestion avoids deep-verification for stdlib modules. The only thing we need to be mindful of is that forbidding stdlib modules altogether may be overly-strict for some projects.