rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
96.61k stars 12.48k forks source link

switch to `EarlyBinder` as default #105779

Closed lcnr closed 1 year ago

lcnr commented 1 year ago

implement https://github.com/rust-lang/types-team/issues/78

Change all queries which currently have a bound_X variant to return EarlyBinder<T> by default and remove the bound_X version. Add a method fn EarlyBinder::<T>::subst_identity(self) -> T if these queries are used in the identity context and don't need to actually substitute anything.

Not having EarlyBinder be the default can very easily result in incorrect uses of these queries, e.g. https://github.com/rust-lang/rust/blob/736c675d2ab65bcde6554e1b73340c2dbc27c85a/compiler/rustc_hir_analysis/src/astconv/mod.rs#L2720 which I found while reviewing https://github.com/rust-lang/rust/pull/101947. The normalize_ty normalizes type_of(def_id) in the wrong environment as we only call subst afterwards.

kylematsuda commented 1 year ago

@rustbot claim

kylematsuda commented 1 year ago

Hi @lcnr, I was able to check this out a bit and I think I understand what needs to be done here. Just to be sure: the queries that we want to change correspond to all of the bound_X methods on TyCtx?

I played around with this a bit on the fn_sig query and was able to get x.py check working again for the whole repo again after changing fn_sig's return type to EarlyBinder<..>, so I think I have an idea of how to get started on the other queries.

For some of the queries (namely fn_sig, type_of, impl_trait_ref, and [explicit_]predicates_of), it looks like the non-bound version is used in many more places than the bound version. I just wanted to double check that these queries should also be changed?

lcnr commented 1 year ago

Yes that sounds correct to me.

There are quite a bit of places where we currently implicitly instantiate_identity so this code adds some additional noise. I still think that it is worth it as it clarifies the intention here and sill requires people to quickly think whether they're in the right context.

When accessing parts of the return type of these queries which don't access generic parameters, e.g. the number of arguments or c-variadic for fn_sig, and we're also not in the identity context, please use skip_binder instead of instantiate_identity. It's fine to get that wrong at some places as we should catch it during review. E.g. the following should use skip_binder() as its talking about a fn_sig from outside of the current item. Alternatively, add a method fn abi(self) to EarlyBinder<ty::PolyFnSig<'tcx>>.

https://github.com/rust-lang/rust/blob/7f42e58effa3871dda6a41e250dea60cf88868ca/compiler/rustc_hir_typeck/src/expr.rs#L546

kylematsuda commented 1 year ago

Ok great, thanks for all the tips! I'll work on this and let you know if I get confused.

kylematsuda commented 1 year ago

Hi, I’ve been playing with this and have been able to implement the changes to the queries for some of the most used ones (fn_sig, type_of, impl_trait_ref, predicates_of, …). I've checked that ui tests are passing.

I want to clean some things up and then open a PR, but I have a few questions:

  1. What does “identity context” mean? I know it’s probably not necessary for me to understand this but I also don’t want to make this harder to review by doing the wrong thing everywhere :smiley:.
    1. Here’s my current understanding: The compiler has the concept of early and late bound parameters. EarlyBinder represents an item that might have generics, but before early parameters are substituted. For example, consider a function fn foo<T>() { ... }. If we are looking at a call to foo like let x = foo<i32>(), the type of T is known to be i32. In this case, we call EarlyBinder::subst to set T = i32. If instead we’re inside of the declaration of foo, T could be anything, so we substitute it with a placeholder type T by using EarlyBinder::subst_identity. When we make this “identity substitution”, we are in the identity context. In other words, we're in the identity context when we're trying to "look at" the item definition, not at any usage/call/instantiation. Is this roughly correct?
    2. EarlyBinder::skip_binder would be implemented the same way as EarlyBinder::subst_identity, but means something different: “I am explicitly ignoring the types of generic parameters” vs “I am substituting placeholder types for the generic parameters”. (Side note: can we call this skip_early_binder instead of skip_binder? Wondering if it is confusing with Binder::skip_binder.)
  2. Is it necessary to add EarlyBinder to the predicates_of query? Currently, predicates_of returns a GenericPredicates, and GenericPredicates::instantiate_X is almost always immediately called. From what I can tell, the instantiate_X methods are doing conceptually the same thing as EarlyBinder::subst (and even call it internally). After adding EarlyBinder, we start getting things that look like tcx.predicates_of(..).subst_identity().instantiate_identity(), or even tcx.predicates_of(..).subst_identity().instantiate(tcx, subst), which looks contradictory to me. Could predicates_of be left as-is? We could also consider removing bound_predicates_of, which is only used in 2 places in the code (although I don’t really understand what’s happening at either of those places).
  3. Should the types stored in rustc_metadata also be changed to use EarlyBinder? I assume that the metadata is supposed to stay synced with the corresponding TyCtxt queries, which would mean that these should be changed, but I wanted to check.
  4. Any tips on structuring the PR? Here is what I was thinking:
    1. Add EarlyBinder::subst_identity, EarlyBinder::skip_early_binder, etc
    2. Change calls of query X to bound_X, and add the necessary subst_identity or skip_early_binder calls
    3. Change the query X return type to have EarlyBinder; rename all usages of bound_X to X; update the types stored in the metadata
    4. Remove bound_X method from TyCtxt

Sorry in advance for so many questions :sweat_smile:. If this general plan sounds alright, I'll open a PR. Thanks!

lcnr commented 1 year ago

sorry for not replying earlier. I have been on vacation.

1.i. mostly correct. EarlyBinder is used for all generics of items except for late-bound lifetimes on functions, to ensure that we don't forget to provide generic arguments when using these items. In rustc, we use ty::Param (ty::ConstKind::Param/ty::ReEarlyBound) to represent both the universally bound variables, of for<T> fn foo<T> whenever we use foo, and the universal placeholder when we're inside of foo, e.g. when typechecking that function.

This may not help too much, but ty::Param is equivalent to either ty::Bound or ty::Placeholder depending on whether we're "outside" or "inside" of a given item. EarlyBinder is used when we're outside of an item, and subst_identity can be used while we're inside.

1.ii. yes, as ty::Param is used to represent generic parameters, both when used as bound variables and when used as placeholders, subst_identity is just a noop and there is only a conceptual difference between it and skip_binder. EarlyBinder and Binder should ideally be the same thing, so I prefer using skip_binder for both as it conceptually represents the same operation both times.

  1. GenericPredicates is interesting. In the past, EarlyBinder was always implicit, so GenericPredicates ended up having the same concept in instantiate. Having EarlyBinder<GenericPredicates> isn't really meaningful unless you directly access the predicates field. It may make sense to keep using GenericPredicates but change predicates to be &'tcx [EarlyBinder<(Predicate<'tcx>, Span)>],. Alternatively it's fine to ignore that query for now.

  2. :thinking: yes, we should probably move those to EarlyBinder<T> as well. This should not change the runtime behavior though as EarlyBinder itself doesn't contain any data. You may have to implement some additional traits there or whatever, would have to check :grin:

  3. iii. and iv. can be done in the same commit. Apart from that this seems like the best approach :+1:

Thanks for working on this :heart: it feels difficult to explain 1. so I hope that it made at least some sense. Feel free to reach out about that on zulip if you have any questions. Explaining it in sync is easier :grin:

kylematsuda commented 1 year ago

Thanks so much for the detailed response! This helps a lot, I really appreciate it. (And sorry to have asked so many questions right before/during the holidays...:sweat_smile:). I should have some time this week to work on this :+1:

lcnr commented 1 year ago

This has been implemented for const_param_default and impl_trait_ref by #106696. There are still other queries which need the same work.

lcnr commented 1 year ago

@kylematsuda are you still working on this? going to unassign you but definitely don't hesitate to reassign yourself if you intend to continue working on it.

Thanks a lot for your good work here ❤️

kylematsuda commented 1 year ago

Ah sorry @lcnr! I got a bit busy and dropped the ball on this... but I should have time to work on it more now. Thanks for the reminder :sweat_smile:

Looking in the nightly docs, here are the bound_X queries that remain:

Also I think at some point @BoxyUwU had suggested changing subst_and_normalize_erasing_regions to take EarlyBinder<T> instead of T, so I'll plan to do that too.

I have some questions about a few of these but I might just try and knock out the more straightforward-looking ones first.

@rustbot claim

kylematsuda commented 1 year ago

After merging #111410, all of the existing bound_X methods on tcx have been removed. Thank you to everyone who helped review PRs, and especially @lcnr for mentoring this issue!

There is a bit of follow-up work that can be done (I'm happy to work on some of these if it would be helpful):

  1. Clippy: in review, both @lcnr and @BoxyUwU noticed places in Clippy where we probably should use subst instead of subst_identity, and suggested opening an issue on Clippy
  2. Further bubbling up EarlyBinder: We started this a bit in #110297, but there might be more interfaces that should have EarlyBinder added. Also, there might be other queries that logically should return EarlyBinder, but don't have a bound_ version, so haven't been converted yet
  3. Other things?
lcnr commented 1 year ago

for clippy I am not sure how helpful opening an issue like "hey, some of the subst_identity calls may be wrong" would be. Might be good to add a bunch of FIXMEs to the code instead (or directly fix them when doing so). I don't have the time to do so myself atm.

Further bubbling up EarlyBinder is good but probably can't be tracked in this issue

I think it would also be valuable to try to make the value in EarlyBinder private and require users to explicitly use skip_binder whenever they access the inner value.