Open Manishearth opened 1 year ago
From talking to @BoxyUwU; there's an alternate way to look at this:
&'outer C
implies C: 'outer
C: 'outer
implies Contra<'inner>: 'outer
when C
is Contra<'inner>
(fn(&'inner _)
)Contra<'inner>: 'outer
implies 'inner: 'outer
This all makes sense and is consistent. The tricky thing is the first line: implied bounds come from well-formedness, and there's not an actual WF reason for contravariant lifetimes to behave that way: &'outer fn(&'inner _)
does not need any particular bounds to be valid.
In fact, the syntax C: 'outer
when lifetimes contained in C are contravariant is itself kinda tricky, because it both means "all lifetimes contained in C outlive 'outer
" and it means "all borrows contained in C outlive 'outer
", but these mean different things when you're talking about contravariant lifetimes. In general in Rust it means the former, but the ambiguity can be confusing.
I think the real issue here is:
&'outer &'inner _
to imply 'inner: 'outer
. There is not one for the equivalent contravariant type. TWe should probably better document implied bounds, and consider relaxing how they work around contravariant lifetimes.
(as far as the Yoke issue, I do have a fix I can apply; though I'd rather not have to, I know it's going to take a while for there to be motion on this in any direction)
FWIW I'm happy to sit down and document WF/implied bounds if someone can tell me where it should go and help me with specifics, like, which things are WFness and which bounds are implied.
- There is obviously a WF reason for
&'outer &'inner _
to imply'inner: 'outer
. There is not one for the equivalent contravariant type.
I don't think this is an issue with variance. The real issue here is that all lifetimes that appear on trait objects and fn pointers are treated as "outlives components" of the underlying type. This can occur with invariant and covariant lifetimes as well. Here is an example with invariant lifetime 'a
:
trait Trait<'a> {}
fn test<'a>(_: &'static dyn Trait<'a>) { // implies 'a: 'static !!
None::<&'static &'a ()>; // check 'a: 'static
}
This might not be desired and can lead to surprising behavior, like dyn Fn(&'a str) -> &'a str + 'static
is not really static unless 'a: 'static
, but it's not really a soundness bug, I believe.
Implied bounds do affect universal quantification
Unsafe code may rely on universal quantification in ways that implied bounds will mess with it
Universal quantifiers for trait bounds is bounded by the condition that trait input types must be well-formed. For example, the trait bound for<'a> Fn(&'a T)
actually means for any lifetime 'a
such that &'a T
is well formed, and this may not include 'static
.
I don't think this is an issue with variance. The real issue here is that all lifetimes that appear on trait objects and fn pointers are treated as "outlives components" of the underlying type.
That is in fact what is being referred to here with respect to "variance", as the actual way of expressing this conversation involves wheeling out predicates in a language that is not Rust (logic) because Rust only interprets lifetime bounds in contextual manners... i.e. variance... and thus having an overly consistent behavior in terms of creating outlives requirements here is inconsistent with how the lifetimes in question are read and written.
That is a problem whether or not it is a strict "soundness issue" in the compiler, because the underlying thesis of Rust is that the combination of unsafe code with lifetimes can be used to create programs that constrain behavior to what is sound. And that means it needs to, as much as possible, not fall down into a pile of miserable edge-cases when it comes to actually trying to do that.
I have fixed the tags. The categories are made for humanity, humanity was not made for the categories. Let's not bog down this issue's comment thread with further haggling over whether it is "unsound" or merely "surprising", please.
I don’t see good ways to change the current behavior here, but feel free to point out where I might not be seeing the way out.
The proposition of this issue – or perhaps feature request – is that for<'a> FnOnce(&'a Foo<'b>)
should mean, in some sort of pseudo-syntax adding where
clauses to the for
:
for<'a, where 'b: 'a> FnOnce(&'a Foo<'b>)
if Foo<'b>
is covariant in 'b
(unchanged case, as is status quo),for<'a /* unbounded, for all 'a */> FnOnce(&'a Foo<'b>)
if Foo<'b>
is contravariant in 'b
(instead of the status quo, where this case is the same as the covariant case), andIf the invariant case where for<'a, where 'b: 'a> FnOnce(&'a Foo<'b>)
as well, then changing Foo<'b>
from being invariant to being contravariant in 'b
would become a breaking change, because the body a closure implementing this trait bound could stop compiling if it relies on the 'b: 'a
bound. Turning a type from invariant into contravariant is not supposed to be breaking AFAIK, so that would be problematic.
So I guess the invariant case would need to become for<'a /* unbounded, for all 'a */> FnOnce(&'a Foo<'b>)
, too?
But then implementing such a closure could become next to impossible in the invariant case: If the closure body did not get to infer 'b: 'a
, then &'a Foo<'b>
might not be a well-formed type at all; so how are we able to have it in the signature, or have variables of the same type, or of related types in the closure body?
As far as I can tell, the proposed argument here might be that we want something like fn(&'a ()): 'static
to be universally true, which is something that I’ve seen in previous discussions, but as far as I remember it was more about a “fn
-pointers are special” kind of idea, and less of a “contravariant lifetimes are special” idea.
Even if we made sure that every contravariant (in 'b
) type Foo<'b>
would not require 'b: 'a
for &'a Foo<'b>
to be well-formed, the same will be impossible for invariant types Foo<'b>
, because those can actually really contain references of that specific lifetime.
For illustration: For example, if struct Struct<'b>(fn(&'b (), &'b i32)
(which is invariant in 'b
) did not require 'b: 'a
for a &'a Struct<'b>
reference type to be allowed, then either
impl Struct<'_> { fn bar(&self) -> &i32 { self.1 } }
method couldn’t be called on such a reference without requiring 'b: 'a
after all, in which case a for<'a> FnOnce(&'a Struct<'b>)
closure implementation that calls bar
(with the full lifetime of the &'a Struct<'b>
) would break / no longer compileimpl Struct<'_> { fn bar(&self) -> &i32 { self.1 } }
method could be called on &'a Struct<'b>
references without 'b: 'a
bound, but would most likely be unsound (unless other language changes would rule out all straightforward ways of exploiting this; though I don’t quite see how that could be done without yet-again causing breaking changes somewhere (for existing and reasonable code). WG-prioritization assigning priority (Zulip discussion).
@rustbot label -I-prioritize +P-medium
I don’t see good ways to change the current behavior here, but feel free to point out where I might not be seeing the way out.
Yeah, I agree. I do think this is a niche behavior that needs to be documented somewhere. I might take a crack at it.
I spent some time trying to create full-blown unsoundness example based on this bug. I. e. to create program, which will print garbage or something similar. I was unable to do so. It seems that described behavior is not a bug. It is simply deliberate design.
Currently existence of type &'a fn(&'b _)
implies 'b: 'a
(proof: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=1e20272b307a10a4f4a3ae70c84e58dc ).
Moreover, currently fn(&'b _): 'a
implies 'b: 'a
(proof: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=ff7d32ad000c129d098b4b1411b94552 ).
So, T: 'a
currently, in fact, does not mean "T
outlives 'a
". It means "all lifetime variables free in T
outlive 'a
". Personally I dislike this. I would not do this in my own hypothetical language. But it seems this decision already deeply baked into Rust. In particular, currently T: 'static
means that all lifetime variables free in T
outlive 'static
, i. e. T
does not have free lifetime variables. This means that if T: 'static
, then T
guaranteed not to have any embedded lifetimes, and thus it is guaranteed that type info for T
can be encoded without need to store lifetime information. As well as I understand, Any
relies on this.
So, this "bug" is not really bug, unless you want to redesign Any
.
But this means that we should very clearly describe meaning of T: 'a
in reference. So this is documentation bug
I also believe that we should document this behavior somewhere and there isn't anything we can change behavior-wise.
I started to write some docs for the reference about implied bounds https://github.com/rust-lang/reference/pull/1261. That documentation is fairly vague at times because implied bounds are an inconsistent mess and unsound, so documenting the status quo there is somewhat :/
What I am unsure about is where an in-depth documentation would be appropriate, the reference, nomicon, or dev-guide?
@rustbot label +A-docs
Potentially related to https://github.com/rust-lang/rust/issues/25860 . The issues seem related but the examples are pretty different so I figured I'd file this as separate.
I'm not yet sure if I can craft an actual self-contained soundness bug out of this, but it does cause a soundness bug in
Yoke
which relies onfor<'a>
working correctly: https://github.com/unicode-org/icu4x/issues/2926. I have multiple potential fixes for the yoke issue, but it's worth at least noting.TLDR: Rust currently correctly implies a
'inner: 'outer
bound when you have&'outer &'inner _
. It is also incorrectly implying this for&'outer fn(&'inner _)
The reduced example is as follows:
(playpen)
This should not compile.
F
requires the returned string to be valid for ALL lifetimes'all
, whereas we're returning a string of lifetime's
This code is behaving the same as this code:
where the concrete type of
F
isfor<'all> FnOnce(&'all &'co u8) -> &'all str
, and well-formedness implies that'all
must be shorter than'co
, so you end up getting a'co: 'all
bound, and's: 'co
ends up giving us's: 'all
, so even though there is afor<'all>
, the'all
is actually artificially restricted to being "for all shorter than'co
" and returning data of lifetime's
is valid there.(the
's: 'co
is actually not necessary forco_shorter()
to compile, but it's easier to explain that way)Basically, for
&'outer Foo<'inner>
, the implied bound probably should be:Foo
:'inner: 'outer
Foo
: NothingFoo
:'inner: 'outer
, or nothingBasically the implied bound is safe when things are guaranteed to not be contravariant. It's actually fine to have the implied bound for e.g. the invariant case of
&'outer Cell<'inner _>
, because a cell is logically a tuple of a covariant getter and contravariant setter, and the setter implies no bounds but the getter can imply bounds.This does give me some pause since typically we tend to believe that code that works on invariant types will also work on contravariant types, and here's a situation where it won't; so perhaps it's safer to also opt invariant lifetimes out of this. Idk.
Contravariant types could imply the bound'outer: 'inner
because function parameters can't live for less than the function does, but I'm not sure this implied bound is ever useful. Does lead to a nice symmetry though.The Yoke bug (https://github.com/unicode-org/icu4x/issues/2926) only shows up for contravariance due to how Yoke is structured. It's not clear to me that there's a way to write an unsafe abstraction that relies on
for<'a>
being truly universal where an implied bound coming from an invariant type may cause unsoundness. But I could be wrong.A further exploration can be found in this code (playpen):
Show code
```rust // A function that forces the closure to work for ALL lifetimes // ... or does it? };-) fn attachThe line
attach(contravariant_cart, |_| &s);
should not compile, nor should the functioncontra_shorter()
.contra_longer()
happens to correctly fail to compile because the same variance trick that savescovariant_cart_shorter
does not apply here.cc @pnkfelix @lcnr
Thanks to @workingjubilee for rubberducking this with me.