rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.96k stars 12.69k forks source link

Support polymorphic InstanceDef's by avoiding double substitution. #69925

Closed eddyb closed 4 years ago

eddyb commented 4 years ago

69036 papers over a problem in the use of MIR shim bodies, (i.e. applying Instance's substs), by preventing Instance::resolve from succeeding if any substitution could later occur.

A MIR shim's body already encapsulates the final types (i.e. monomorphic for codegen, modulo "polymorphization" work), so we shouldn't substitute it any further (in codegen, miri, or MIR inlining).

Once we address this, we could inline (or codegen, given "polymorphization") slightly polymorphic variants of these shims, but Instance::resolve would still need to enforce the minimum requirements for the shim MIR body being built in the first place, i.e.:

cc @nikomatsakis @davidtwco

nikomatsakis commented 4 years ago

So @davidtwco, @eddyb and I spent a while discussing this on Zulip in an effort to better understand what's going on. Folks might find this topic or this topic helpful.

Let me try to summarize my understanding of (a) how things work and (b) the problem:

Currently, an Instance is an InstanceDef plus some substitutions. The MIR for an instance is created once per instance-def, but we then apply the substitutions in two places: (1) to figure out the signature of the MIR function and (2) to the body, during monomorphization (which may, e.g., allow trait calls to be resolved to particular impls and so forth).

The problem is that for some kinds of InstanceDef, notably DropGlue, we don't really have a rich enough set of variants. So what we do is a kind of "cheat" where we encode a Ty that contains the details we need to operate on -- for example, the DropGlue variant includes the type being dropped.

Note: Some of the text below is kind of confused in its particulars, so striking out, but still relevant-ish.

The problem here is kind of hard to summarize. It's a couple of 'conflicting' assumptions that might individually be ok but interact poorly. For one thing, when we apply substitutions to an Instance, we do not alter the InstanceDef, but just the substs (since the InstanceDef is meant to be the "generic form" of the thing, with the substs specifying the values for its generic parameters). But also the substs are always meant to correspond to the generics of the instance_def.def_id() result. But for things like drop-glue, the def-id is for the drop_in_place function, but the InstanceDef also includes part of the substs (the type being dropped). So if we substitute into substs, that gets "out of sync" with the InstanceDef. e.g., if we started out with drop-glue for Vec<T> and then substituted T => Vec<U>, we'd have a substs with Vec<Vec<U>> but the instance-def would still say Vec<T>. This is then a problem because the two sets of types are expressed in terms of a different set of in-scope type parameters, which is particularly bad since the mir body will be in terms of one and the MIR signatures in terms of the other.

There is no "easy" fix here, but there are some refatorings to be done.

One very simple fix is to (a) not generate drop-glue variants until we have a "fully monomorphic" (or "ground") type that never requires substitution. This is #69036. This "staves off" the problem by not creating the problematic InstanceDef and leaving things unresolved.

Another might be refactoring how we represent types to be more like chalk, where we would have some type ApplicationTy that encapsulates "all rust types", more or less, and which factors out the "substitutions" for them. Then the drop glue could include ApplicationTy and not a fully general type (and hence wouldn't require substitution).

Have to stop now, but there would be other possibilities too. (e.g., extending DefId to subsume IsntanceDef)

eddyb commented 4 years ago

For one thing, when we apply substitutions to an Instance, we do not alter the InstanceDef, but just the substs

If you mean calling .subst on Instance that's a different bug that we should fix and that I wasn't aware of.

What we were talking about was tcx.instance_mir(instance.def).subst(tcx, instance.substs) (or rather, substituting parts of that mir::Body, in order to monomorphize).

nikomatsakis commented 4 years ago

If you mean calling .subst on Instance that's a different bug that we should fix and that I wasn't aware of.

No, I think I was wrong and didn't quite understand correctly (good to write things out, I suppose). In particular, Instance does implement TypeFoldable, but it does fold the types in the def, so the mismatch I was referring to doesn't quite happen that way.

In any case, the core problem is still (in my mind) one of namespaces:

The "public signature" of an Instance is expressed in terms of a def-id which carries attached generics G, and the substs are the values for those generic parameters (in terms, potentially, of some other set of in-scope generics G1).

In some cases, the body of the MIR is also expressed in terms of the generics G.

But in other cases (notably drop-glue) the body of the MIR is expressed in terms of the generics G1.

This is causing use ICEs and confusion because we do the wrong substitutions. In particular, the example of Vec<T> being substituted with T => Vec<T>, and leading to Vec<Vec<T>> is caused by treating the body as if it was in terms of G when it is in terms of G1 (and hence no substitution is required to bring it from G to G1).

For the moment, we are documenting this situation and working around it, but not changing the underlying mismatch. Fixing that underlying mismatch is desirable but hard -- i.e., a longer term proposition -- and that is precisely what this issue represents.