Closed lcnr closed 6 months ago
I don't expect that we're able to convert higher ranked types to some canonical representation
Ok, I have an idea on how to convert higher ranked types to some canonical representation. I just don't think that idea is very good. I also don't know how I would extend this to implication types in the future.
To clarify, is

> `One` and `Two` are equal types with different type ids

the issue here, and the rest is just exploiting this to demonstrate the unsoundness? (At least this seems to break the guarantees I understand `TypeId` to be providing... it just isn't exploitable without a way to substitute the "equal" types.)
If this holds, then the resolution is either to give `One` and `Two` the same type id or to make `One` and `Two` nonequal types.
Giving `One` and `Two` the same type id would be disastrous, as it would allow e.g. dynamically casting `for<'a> fn(&'a T) -> &'a T` to `for<'a> fn(&'a T) -> &'static T`.
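The mismatch is directly observable on stable Rust. The following sketch (using the `One`/`Two` definitions from the issue) shows that a `Two` value coerces to `One` via subtyping, yet the two aliases get distinct type ids:

```rust
use std::any::TypeId;

type One = for<'a> fn(&'a (), &'a ());
type Two = for<'a, 'b> fn(&'a (), &'b ());

fn main() {
    // `Two` is the more general type, so a `Two` value coerces to `One`...
    let f: Two = |_, _| ();
    let _g: One = f;

    // ...yet the two type aliases get distinct type ids.
    assert_ne!(TypeId::of::<One>(), TypeId::of::<Two>());
}
```

(At the time this issue was filed, the coercion was accepted in both directions, which is what made the types "equal"; only the uncontroversial `Two -> One` direction is shown here.)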
cc also @eddyb #95845, as using the v0 mangling for TypeId would erase the lifetimes from higher ranked types AIUI. The current impl of #95845 still uses the current hash, so wouldn't be problematic, but this is something to keep in mind w.r.t. the information contained in TypeId vs the v0 type mangling.
(The issue OP ccs itself; this is likely a typo?)
There is a lot of confusion here wrt all the pieces involved.
Both legacy and v0 mangling, and both the `TypeId` hash and a v0-mangling-based implementation (like #95845), use the same encoding for higher-ranked lifetimes (it's why the v0 mangling mentions lifetimes at all!), though legacy symbols/type hashes only as the input to the hash.
You couldn't not mangle them, because otherwise `TypeId::of::<One>` and `TypeId::of::<Two>` would clash in their symbol names (and that's actually how I found out that v0 mangling needed to reason about `for<'a>`, when implementing it, heh).
That all uses "binder anonymization", so the fully normalized types are:

- `One`: `for<'0> fn(&'0 (), &'0 ())`
- `Two`: `for<'0, '1> fn(&'0 (), &'1 ())`
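A toy model of what binder anonymization does may help here. Note this is not rustc's actual representation: the `Ty` enum and the `anonymize`/`rename` helpers below are made up for illustration. Bound lifetime *names* are replaced by their binder index, so the spelling `'a` vs `'z` can never affect the normal form, while the *binding structure* (`One` reusing one lifetime vs `Two` binding two) is preserved:

```rust
// Toy type representation: just enough to model `for<...> fn(...)`.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Ty {
    Unit,
    Ref(String, Box<Ty>),        // &'x ()
    FnPtr(Vec<String>, Vec<Ty>), // for<'x, ...> fn(args...)
}

// Rename each bound lifetime to `'0`, `'1`, ... in declaration order.
fn anonymize(ty: &Ty) -> Ty {
    match ty {
        Ty::FnPtr(binders, args) => {
            let renamed: Vec<String> =
                (0..binders.len()).map(|i| format!("'{i}")).collect();
            let args = args.iter().map(|a| rename(a, binders, &renamed)).collect();
            Ty::FnPtr(renamed, args)
        }
        other => other.clone(),
    }
}

// Substitute lifetime names in `from` with the matching name in `to`.
// (This sketch ignores shadowing by nested binders.)
fn rename(ty: &Ty, from: &[String], to: &[String]) -> Ty {
    match ty {
        Ty::Unit => Ty::Unit,
        Ty::Ref(lt, inner) => {
            let lt = from
                .iter()
                .position(|f| f == lt)
                .map(|i| to[i].clone())
                .unwrap_or_else(|| lt.clone());
            Ty::Ref(lt, Box::new(rename(inner, from, to)))
        }
        Ty::FnPtr(b, args) => Ty::FnPtr(
            b.clone(),
            args.iter().map(|a| rename(a, from, to)).collect(),
        ),
    }
}

fn main() {
    let r = |lt: &str| Ty::Ref(lt.to_string(), Box::new(Ty::Unit));
    // One: for<'a> fn(&'a (), &'a ())
    let one = Ty::FnPtr(vec!["'a".into()], vec![r("'a"), r("'a")]);
    // The same type spelled with a different binder name:
    let one_renamed = Ty::FnPtr(vec!["'z".into()], vec![r("'z"), r("'z")]);
    // Two: for<'a, 'b> fn(&'a (), &'b ())
    let two = Ty::FnPtr(vec!["'a".into(), "'b".into()], vec![r("'a"), r("'b")]);

    // Names don't matter after anonymization...
    assert_eq!(anonymize(&one), anonymize(&one_renamed));
    // ...but the binding structure still distinguishes `One` from `Two`.
    assert_ne!(anonymize(&one), anonymize(&two));
}
```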
The issue here is this broken fragment of type unification code: https://github.com/rust-lang/rust/blob/cd282d7f75da9080fda0f1740a729516e7fbec68/compiler/rustc_infer/src/infer/equate.rs#L154-L159
This treats `(T <: U) + (U <: T)` as meaning `T = U`. But that's not sound at all!
Too much of Rust's typesystem depends on `T = U` matching "identity" (judgmental syntactic equality, if you want) with "enough normalization", so a definition equivalent to "subtyping both ways" only works if it's compatible, and in this case it's not.
I don't think there's an "obvious choice" for which of `One` or `Two` should be "the normal form" for the other (`Two` seems more likely, but it still feels dubious), so the correct thing to do (at least in the interim) is to stop using subtyping entirely when dealing with strong equality, and only support anonymizing the binders (which is the only thing done for the maximal normalization used by symbol names and `TypeId`, tho sadly I don't think we always do it, yet).
Even if we normalized one of `One` or `Two` to the other, equality should still not use subtyping, and should instead normalize to a common type first, to avoid the "unproven assertion" that "both-ways subtyping" implies "equality" (as opposed to the trivial truth in the other direction, which is given by subtyping reflexivity AFAIK).
Retitled the issue to reflect the situation as I understand it (`TypeId` only exposes this the same way e.g. symbol names or any other type-equality-sensitive part of rustc might).
> `One` and `Two` are equal types with different type ids.
In which sense are these types "equal"? Two allows references with different lifetimes, One does not, so those look like different types to me.
Going off the issue title, are you saying they each are a subtype of the other? That is different from saying they are "equal" though. Might be good to update the issue description because this confused the heck out of me until I read https://github.com/rust-lang/rust/issues/97156#issuecomment-1130957415.
You are Cc'ing an issue in itself? probably a typo, not sure which issue you meant though.
> Two allows references with different lifetimes, One does not, so those look like different types to me.

Well... from a user perspective, any function that will be allowed as a function pointer of type `One` will also be allowed as a function pointer of type `Two`. Also, any caller of `One` can just as well call `Two` without any visible difference in behaviour (if the function pointers actually point to the same function).
So they are structurally different, but behaviour-wise they are the same. If we had sufficiently smart normalization we could normalize `One` to `Two`; it just gets hard to do that normalization properly in the presence of more complicated variance (mutable references and interior mutability).
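This interchangeability can be sketched in a few lines of stable Rust (the `call_one`/`call_two` helpers are made up for the demonstration): a single function value serves as both `One` and `Two`, and callers can't observe any difference:

```rust
type One = for<'a> fn(&'a (), &'a ());
type Two = for<'a, 'b> fn(&'a (), &'b ());

fn call_one(f: One) -> &'static str {
    f(&(), &());
    "called as One"
}

fn call_two(f: Two) -> &'static str {
    f(&(), &());
    "called as Two"
}

fn main() {
    let f: Two = |_, _| ();
    assert_eq!(call_two(f), "called as Two");
    // `Two` is a subtype of `One`, so the same value also works as `One`:
    assert_eq!(call_one(f), "called as One");
}
```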
@RalfJung FWIW I changed the issue title, it was something else before.
> In which sense are these types "equal"?

Specifically, unification in the `Equate` mode - we seem to have special-cased "each of the two types is a subtype of the other" to imply "equality", without justifying/enforcing it in a way compatible with the rest of the compiler - which mostly relies on `Ty<'tcx>: Eq` (and I assume that sort of "syntactic match" is equivalent to judgmental equality), especially after post-monomorphization normalization.
@oli-obk The normalization wouldn't need to be perfectly clever. You would just perform it before checking for equality (without involving subtyping), just like we do with anonymizing the lifetimes bound by the `for<...>` to avoid their order (or `DefId`s) affecting any semantics.
That is, we have to remove the unsound equality-via-subtyping logic and try to cover any breakage with a subset of the possible higher-ranked normalization (e.g. if a lifetime only appears in the outermost `fn` pointer type's input types, and its variance is the default for that position, so not invariant or otherwise flipped, then it can be split into one lifetime per position) - we could maybe even make this be part of `anonymize_late_bound_regions`, since that has to replace every one of those lifetime positions anyway.
EDIT: I regret using the word "judgmental" (or its misspelling, apparently), because I just remembered that type equality judgments can definitely involve subtyping, that distinction is more about "being built-in" vs reified proofs of type equality (or "paths" in HoTT). ("syntactic" should be correct though, I'll switch to that)
A few thoughts:
Some parts of rustc do indeed assume that `T1 <: T2` and `T2 <: T1` imply `T1 == T2` (I think of this as "semantic equality", in this sense, though I realize it's actually a distinct concept and I should probably find another term). However, for many Rust types, there is no canonical type `T3` that both `T1` and `T2` can be converted to, or at least, we don't know of such a type (especially given the directions I would like to go with our types, I do not think that canonical types will exist in general, see below).
For the compiler itself, lacking a canonical type `T3` is not a big issue, although it does make things inconvenient, since checking semantic equality is more complex. (Associated types throw a similar wrench into these things too, hence the desire for lazy normalization.)
The `type_id` function however "works best" if canonical types exist, though I would say that for soundness they are not strictly required:

- With canonical types, we get `type_id::<T1>() == type_id::<T2>()` iff `T1 == T2`.
- Lacking canonical types, we have only one direction: `type_id::<T1>() == type_id::<T2>()` implies `T1 == T2`, but not vice versa. In other words, you kind of have to assume that, for any given type, there is some set of valid return values for `type_id`, and that `type_id` chooses non-deterministically from that set (but each equality class of types has a distinct set).

Whether or not the "nondeterministic type-id" is a problem depends on how you choose to draw the lines:

- We cannot permit `type_id` in const evaluation, because it is not deterministic (or, if we were to do so, we'd have to account for its non-deterministic return value, at least). This is the unsoundness described in the OP.
- At runtime, we could choose to accept the non-determinism as part of the semantics of `TypeId`.

Based on the above analysis, it seems like we can ... accept that `TypeId` is non-deterministic. We could perhaps deprecate it and introduce a version that returns `Option<TypeId>`, where `Some` is only returned for types that have a canonical form. Alternatively, this new version could include a marker trait for "types with canonical forms" (i.e., with type-ids) -- in a way, the `'static` bound was meant to serve as that role, but as this bug shows, that's not sufficient.
I'm not really aware of other solutions, I'm curious if folks have any ideas there.
Having two notions of equality (one more fine-grained than the other) is certainly possible, but needs some care to avoid accidentally using the wrong one in the wrong spot. I am not enough of a subtyping expert to know if there are any common pitfalls with this approach. It would of course be nicer, conceptually, to make subtyping antisymmetric, i.e., "a <= b /\ b <= a -> a == b".
> I'm not really aware of other solutions, I'm curious if folks have any ideas there.
Some algebraic thoughts: whatever equality on types ends up being, the property we need is that all types in an equivalence class have the same type ID. The easiest way to achieve this is to pick a unique representative for each equivalence class, have a normalization function that returns said representative, and use its type ID. However, in principle it is conceivable that one could generate a suitable type ID directly, without having such a normalization function. I don't know if that is actually any easier in this case, but it might be worth thinking about.
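The "unique representative per equivalence class" idea can be sketched concretely. Everything here is hypothetical: types are modeled as strings, and the made-up `normalize` rule (collapsing all bound lifetime names to one canonical spelling) stands in for a real normalization function. The point is only the algebraic shape: ids are computed from the representative, so every member of a class necessarily gets the same id:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical normalization: pick a canonical representative by
// collapsing the bound lifetime names `'a` and `'b` to `'0`.
fn normalize(ty: &str) -> String {
    ty.replace("'a", "'0").replace("'b", "'0")
}

// Derive the id from the representative, never from the raw spelling,
// so id-equality follows from class membership by construction.
fn toy_type_id(ty: &str) -> u64 {
    let mut h = DefaultHasher::new();
    normalize(ty).hash(&mut h);
    h.finish()
}

fn main() {
    // Two spellings in the same equivalence class share an id...
    assert_eq!(toy_type_id("fn(&'a ())"), toy_type_id("fn(&'b ())"));
    // ...while a different class gets a different id.
    assert_ne!(toy_type_id("fn(&'a ())"), toy_type_id("fn(u8)"));
}
```

Generating a suitable id *without* a normalization function would mean defining a hash that is constant on each equivalence class directly, which is exactly the hard part.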
Random thought:
For the specific case of `for<'a> fn(&'a (), &'a ())` and `for<'a, 'b> fn(&'a (), &'b ())`, could both of these be normalized to `fn(&'empty (), &'empty ())` as described here?
Afaik we're currently moving towards removing `ReEmpty`, so that won't work :laughing:
Also, `for<'a> fn(&'a ())` needs to have a different `TypeId` from `fn(&'static ())` (as these are different types which are both `'static`), and as we erase regions during codegen, changing `for<'a> fn(&'a ())` to `fn(&'erased ())` would make it indistinguishable from `fn(&'static ())`.
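The first half of that claim is easy to confirm on stable Rust: the two types really do carry distinct type ids today.

```rust
use std::any::TypeId;

fn main() {
    // Both types are `'static`, yet they are different types,
    // and accordingly get different type ids.
    assert_ne!(
        TypeId::of::<for<'a> fn(&'a ())>(),
        TypeId::of::<fn(&'static ())>(),
    );
}
```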
> Afaik we're currently moving towards removing `ReEmpty`

What's the reasoning behind this change? Any discussion I can read up on?
For context, from what I understand, `'empty` is needed for an `impl<T> Trait for fn(T)` to include `impl<T> Trait for fn(&T)`. And that is needed to allow making `coherence_leak_check` a hard error without breaking people's use cases. And making `coherence_leak_check` a hard error might be needed to make pattern types play well with type inference.
> For context, from what I understand, `'empty` is needed for an `impl<T> Trait for fn(T)` to include `impl<T> Trait for fn(&T)`.

Well, `'empty` is the reason why we even lint `coherence_leak_check` right now afaik :sweat_smile:
I don't think the idea you proposed in the linked internals thread is possible:

> T implements Trait. In this case, if any of T's supertypes or subtypes also implement Trait, those impls must come from the same impl block as the impl of Trait for T. Additionally, if U is a supertype of T, and U does not implement Trait, then none of U's supertypes may implement Trait. If any of these rules is violated, it's a coherence error.

I don't think this works, because not only does the same impl have to apply for coherence, but this must also recursively hold for all associated types. If we again look at `for<'a> fn(&'a ())` and `fn(&'static ())`: these have to be different types with different `TypeId`s. But we currently have `for<'a> fn(&'a ()) <: fn(&'static ())`.
> What's the reasoning behind this change? Any discussion I can read up on?

Not sure, you can try asking about it in the #t-types stream on zulip :sweat_smile: I think that's mostly been mentioned as a sidenote during other discussions.
> I don't think this works, because not only does the same impl have to apply for coherence, but this must also recursively hold for all associated types.

I'm not sure I understand what you are saying here?
My understanding of the proposal:

- If `a <: b` holds, then if `a: Trait` and `b: Trait` hold, `a` and `b` must share the `Trait` impl. The goal probably being that selection for `a` and `b` always has the same result.
- If `<a as Trait>::Assoc` and `<b as Trait>::Assoc` both implement `OtherTrait`, then these two must also share the same impl of `OtherTrait`. Otherwise selection could differentiate between `a` and `b` again.

I don't think this restriction can be enforced, with the best example being `const fn TypeId::of` (which we should stabilize at some point) for `for<'a> fn(&'a ())` and `fn(&'static ())`.
Does that make sense to you? I don't have a clear image of why you need that invariant, so I might not make much sense here.
The "one `impl` block" rule isn't necessary for soundness. It's meant to make some footguns harder to shoot, not to uphold some absolutely necessary invariants. To elaborate:
Let's say you have a trait `Trait` and a type `T`, such that `T: Trait` does not hold, but there exists a type `U` such that `T <: U` and `U: Trait`. Then, the following code works just fine:
```rust
fn takes_a_u(_: U) {}

fn takes_a_t(t: T) {
    // `t: T` coerced to supertype `U`
    takes_a_u(t);
}
```
However, the following doesn't work today:
```rust
fn takes_impl_trait(_: impl Trait) {}

fn takes_a_t(t: T) {
    // Compile error!
    // `t: T` not coerced to supertype `U`
    takes_impl_trait(t);
}
```
To make pattern types ergonomic, the second example most likely needs to compile. The issue is that there might be more than one possible `U` that satisfies the trait bounds, so there needs to be a way to pick a "best" one among them. The proposed coherence rules are meant to ensure that:

1. picking the "best" `U` doesn't silently change behavior (`IntoIterator` for arrays needed a per-edition hack because of the interaction with auto-ref; I want to avoid that kind of thing);
2. it is difficult for trait impls to behave differently for subtypes.

The "one `impl` block" rule is meant to ensure #2. But "difficult" doesn't need to mean "impossible", as type inference breakage is technically "minor breakage." It's OK in terms of soundness for trait impls to have different behavior for subtypes with different `TypeId`'s, but it should be hard to do accidentally.
Does that make sense?
Can someone explain how this could be a blocking issue for const `type_name`? Naively one could think that the result of `type_name` is purely for debugging, not intended to be compared against, and has no stable value, so it shouldn't matter if it returns different values for the same type in different places...
> Can someone explain how this could be a blocking issue for const `type_name`?

Using const `type_name` you can move different type names for the same type into the trait system. While this is mostly just surprising at runtime, it is actually unsound at compile time. The snippet in the issue description causes a segfault without using unsafe code.
As of #103291 I cannot manage to create an unsoundness repro for this, since it's not possible to compare `TypeId` at compile time anymore. Nonetheless this is still an issue and should be fixed.
I think the real issue here is not that equal types have different `TypeId`s, nor that we can compare `TypeId`s at compile time. The issue stems from the interaction between `generic_const_exprs` and `const_type_id`.
Probably the most obvious path to reconciling this issue in this view is disallowing reflexive methods `TypeId::of`, `std::any::type_name` (and maybe `size_of`/`align_of`) when defining an associated type, so in the issue's example we would get a compiler error:

```rust
impl<T: 'static> WithAssoc<()> for T where [(); <T as AssocCt>::ASSOC]: {
    // ERROR: cannot use reflexive method `TypeId::of` in a const context
    // of an associated type.
    type Assoc = [u8; <T as AssocCt>::ASSOC];
}
```
I disagree; it's extremely weird to have a function that returns different values when all the inputs are the same. If two types `T` and `U` are the same type, then `foo::<T>` and `foo::<U>` should not do different things. `TypeId::of` violates this, which is an odd thing to have in the language.
Could we decide that `TypeId::of::<T>()` requires `T: ~const Any`, where any `T: 'static` is `Any`, but `const Any` requires `T` to also have no higher-ranked lifetimes?
I have no idea if that's realistic to implement. I dislike that solution regardless; it's weird for `TypeId::of` to have such different requirements when used at compile time, when there is still weird behaviour when used at runtime. I don't think compile-timeness is special here at all; soundness bugs are nice to demonstrate using the type system because you can't argue any unsafe code is wrong - it's an unarguable way of demonstrating unsoundness.
The fact that the "weird behaviour" is unsound in the type system imo is a good indicator that this is just undesirable behaviour. I am confident you could create an unsoundness example involving unsafe code and library invariants that seem completely reasonable but break down from this bug.
I cannot see any issue here other than that equal types have unequal typeids; that's just wrong imo ^^
Example:

```rust
type One = for<'a> fn(&'a (), &'a ());
type Two = for<'a, 'b> fn(&'a (), &'b ());

mod my_api {
    use std::any::Any;
    use std::marker::PhantomData;

    pub struct Foo<T: 'static> {
        a: &'static dyn Any,
        _p: PhantomData<*mut T>, // invariant, the type of the `dyn Any`
    }

    impl<T: 'static> Foo<T> {
        pub fn deref(&self) -> &'static T {
            match self.a.downcast_ref::<T>() {
                None => unsafe { std::hint::unreachable_unchecked() },
                Some(a) => a,
            }
        }

        pub fn new(a: T) -> Foo<T> {
            Foo::<T> {
                a: Box::leak(Box::new(a)),
                _p: PhantomData,
            }
        }
    }
}
use my_api::*;

fn main() {
    let foo = Foo::<One>::new((|_, _| ()) as One);
    foo.deref();
    let foo: Foo<Two> = foo;
    foo.deref();
}
```
Imo this is completely reasonable unsafe code, but currently `fn main` has UB. `Foo` is invariant over `T`, so the type can't change, and the `dyn Any` has type `T`. You should be able to assume that the `dyn Any` does in fact have type `T`. There is no compile-time comparing of `TypeId` anywhere in this code.
Type-system-only reproductions of unsoundness are nice because you can't look at some unsafe code and argue it's wrong because actually the bug in the language isn't a bug. Currently there's no way to write a type-system-only repro afaik, and the previous way used `generic_const_exprs`, but that doesn't mean that this bug is now fixed, or that `generic_const_exprs` is at fault.
Tbh given the above code example I'm just gonna remove the `F-generic_const_exprs` and `requires-nightly` labels, since the `TypeId` weirdness is visible on stable in ways that can lead to unsoundness or crashes in code that I can't imagine us not expecting to work.
By the way, this makes `::qcell`'s `TCell{,Owner}` API unsound, as pointed out by @kvverti over Discord.
```rust
use ::qcell::{TCell, TCellOwner};

type Q1 = for<'any> fn(fn(&'any ()));
type Q2 = fn(fn(&'static ()));

fn main() {
    let a = TCellOwner::<Q1>::new();
    let b = TCellOwner::<Q2>::new(); // does not panic since `TypeId::of::<Q2>()` is distinct
    let x = TCell::new(42);
    ::core::mem::swap(
        x.rw(&mut { a }),
        x.rw(&mut { b }), // Q2 -> Q1 even though `TCellOwner<_>` is invariant
    );
}
```
Adding to what @BoxyUwU mentioned, it is interesting to observe that we currently have `TypeId: Eq` rather than `TypeId: PartialEq`, which makes @nikomatsakis'

> we have only one direction: `type_id::<T1>() == type_id::<T2>()` implies `T1 == T2`, but not vice versa

even more of a footgun for the unaware.
EDIT by @BoxyUwU: the playground has UB from hitting the `unreachable_unchecked`, because `TypeId::of::<One>()` is not the same as `TypeId::of::<Two>()` despite them being considered the same types by the type checker. Originally this was thought to be a nightly-only issue with `feature(generic_const_exprs)`, but actually the weird behaviour of `TypeId` can be seen on stable and result in crashes or UB in unsafe code.

Original description follows below:

`TypeId` being different for types which are considered equal allows us to change the value of a projection by switching between the equal types in its substs, and to observe that change by looking at their `TypeId`. This is possible as switching between equal types is allowed even in invariant positions.
This means that stabilizing `const TypeId::of` and allowing constants to flow into the type system, e.g. some minimal version of `feature(generic_const_exprs)`, is currently unsound.
I have no idea how to fix this. I don't expect that we're able to convert higher ranked types to some canonical representation. Ah well, cc @rust-lang/project-const-generics @nikomatsakis