oli-obk opened 6 years ago
Ok, just had a quick conversation with @eddyb on discord.
We also need to prevent

- `&AtomicUsize::new(42)` in constants (because using such a constant would possibly only duplicate the reference, not the atomic value)
- `(T::ASSOC_CONST, 42).1` inside polymorphic code if `ASSOC_CONST` is a value with a `Drop` impl

Basically anything other than checking the body of a constant won't be useful for polymorphic code. So we'd still not be able to remove the qualification as it exists today.
To make the qualification smarter and generally be able to understand conditional code and loops and such we need to implement the qualification in terms of dataflow
We rule out `UnsafeCell` values because the same promoted will be used multiple times when this code is executed several times, and of course that's observable once mutation enters the picture.

We rule out `Drop` values because... it's not okay to not run the destructor, because the user didn't actually ask for a static?
> To make the qualification smarter and generally be able to understand conditional code and loops and such we need to implement the qualification in terms of dataflow
This only applies to promotion, right? I'd not do dataflow. I'd just say we never promote once control flow is involved.
From what I understand, we still only look at the type when checking if a use of a `const fn` can be promoted, and the resulting value for `const`.

So `const FOO: usize = if false { 42 } else { 45 };` won't be usable in `let x: &'static usize = &FOO;`?
Oh, are you saying that we have trouble with promotion even when we are just using a `const`? Dang...
For `const`, however, a value-based analysis seems reasonable, does it not? The value should be the only thing of a `const` that is actually observable.
How to determine drop/interior mutability of a value is another question. We could traverse the value guided by the type (exactly like the sanity check does) and look for it? This has to stop at unions, but we do not promote code that reads from unions so I think we are good.
If the constant is an associated constant we're out of luck wrt promotion checking the value. But I think we currently just pessimistically assume drop and not_freeze, so it could be fine.
True, the best we can do at that point is fall back to its type (insofar as that is known).
But I see no reason, so far, to ever look at the body of a constant (as opposed to its computed value).
More recently, I came to the conclusion that the code is spaghetti because it tried really hard to avoid recomputing things, so it ends up with one "qualif" that's modified in a linear pass.
If rewritten into several modular analyses, and one const checker (the only part that would actually emit errors), we can probably have a cleaner setup overall. And we could even mix polymorphic MIR analysis with post-eval value analysis.
E.g. `(UnsafeCell::new(42), 42).1` is treated as if you end up with an `UnsafeCell` value.
That shouldn't be true, I don't think. Only unions have problems like these, structs should be fine.
> That shouldn't be true, I don't think. Only unions have problems like these, structs should be fine.
The MIR const qualifier does type-based reasoning on projections of pairs, but the old HIR one did not.
@oli-obk Unions are currently feature-gated in `const fn`s but are stable in other const contexts. Was this intended?
```rust
union SignedToUnsigned {
    signed: i32,
    unsigned: u32,
}

// allowed on stable
const UNSIGNED: u32 = unsafe { SignedToUnsigned { signed: 1i32 }.unsigned };

// requires `#![feature(const_fn_union)]`
const fn signed_to_unsigned(signed: i32) -> u32 {
    unsafe { SignedToUnsigned { signed }.unsigned }
}
```
@abonander yes. It is too late to gate them in `const`. And they also do less harm there, because we can do sanity checks on a `const` to ensure it matches its given type, which is not possible for `const fn`.
@RalfJung can you elaborate on that last part? Specifically, anything that would preclude something like #54678 where both types in the union are the same size?
For `const` we have a "sanity check" that looks at the final computed value and its type, and makes sure the value is reasonable at the given type. So even if people do horrible things with unions, we can usually tell them that that might be a bad idea. However, for `const fn`, we'd have to do that check for every possible argument, which is clearly not possible.
So I tried out a few things and looked at the corresponding qualification code. We are maximally pessimistic about associated constants if their type could possibly allow `Drop` or `UnsafeCell`. This means that we assume the type allows both of them unless proven otherwise.
E.g.

```rust
trait Foo<T> {
    const X: T;
}
trait Bar<T, U: Foo<T>> {
    const F: u32 = (U::X, 42).1;
    //             ^^^^^^^^^^ destructors in constants are not allowed
}
```
does not compile, since we assume that the tuple has nontrivial drop glue, because `U::X` might have such drop glue.
A value-based analysis could not evaluate `F`, because that would still be generic. Thus we'd not be able to report an error where we are currently reporting one. This would mean that the moment someone monomorphizes this trait, the constant would get evaluated and we'd possibly get an error then (depending on whether the type "needs drop" or not). For example:

```rust
fn foo<T, U, V: Bar<T, U>>(t: T, u: U, v: V) { /* ... */ }
```

would error if the concrete type given for `T` needs drop.
I hope it makes sense now why we can only do a value-based analysis in case there are no generics involved.
While monomorphization time errors are not great, we already have a bunch of them from const eval failures:
```rust
trait Foo<T> {
    const X: T;
}
trait Bar<U: Foo<usize>> {
    const F: u32 = [42][U::X];
}

struct Struct;

impl Foo<usize> for Struct {
    const X: usize = 99;
}
impl Bar<Struct> for Struct {}

fn main() {
    // comment out to not get any errors
    let x = Struct::F;
}
```
This isn't an argument for just opening the door to allowing all kinds of post monomorphization errors. Instead I want us to think about this topic to find out a comprehensible rule for where we do allow them and where not.
One thing that I also find problematic is that we have split the polymorphic error checking into two parts: const qualification and const propagation. These are actually duplicating each other's work to some degree.
This also overlaps somewhat with symbolic miri, because we could formulate all of these analyses as a single symbolic evaluation.
I think the approach outlined in https://github.com/rust-rfcs/const-eval/issues/17 would greatly help me understand const qualification: It is a static analysis approximating some dynamic property, and being more explicit about that property (specifying it precisely and maybe even implementing it) would help tremendously.
We can indeed rewrite everything as a dynamic check, but we'd lose all pre-monomorphization errors.
Const qualification is essentially a very limited form of symbolic execution (limited as in "doesn't know about control flow or loops") that solely operates on symbols for "drop", "constness" and "internal mutability" instead of operating on values.
> We can indeed rewrite everything as a dynamic check, but we'd lose all pre-monomorphization errors.
Nono, that's not what I meant.
I am saying we should have both a dynamic check and static const qualification. The dynamic check should be sound, and it should be documented at https://github.com/rust-rfcs/const-eval/ which checks are performed and why. The static check should then make sure that the dynamic check never triggers.
Basically, we already have in your (and @eddyb's) head a dynamic version of this check. It's the dynamic property that the static checks you are writing test for (as an approximation of course because halting problem). I am saying let's get that check out of your heads and into the miri engine.
I've been having some zulip conversations with @eddyb regarding dataflow-based const qualification.
A problem quickly arises when trying to fit const qualification into the existing dataflow framework: namely, how to handle a definition of one local that has a data dependency on another (e.g. `_1 = _2`). We'd like to simply copy the qualifications for `_2` into `_1`, but this is not expressible as a "gen/kill" operation. Trying to extend the existing framework with an assignment operator also presents a few problems, like how to handle aggregates (e.g. `_1 = [_2, _3]`) and ordered data dependencies (e.g. `_1 = _2; _2 = _3; _3 = _1`) while still guaranteeing that dataflow analysis converges. I believe that the `|=` operator required to handle these cases is mutually exclusive with the ability to coalesce transfer functions for all statements in a basic block. This is one of the properties that makes the "gen/kill" framework efficient, so implementing a separate dataflow engine would be necessary. #35608 contains a suitably generic one.
However, I believe that const qualification can be done in the existing framework using a reaching definitions analysis (@eddyb is skeptical of this; I'm hoping they can expound a bit in this thread). Furthermore, I believe that this formulation is equivalent to the one that requires `|=` above. Basically, we change the universe of the dataflow analysis from the set of locals (and their qualifications) to the set of definitions of locals. This information can be processed into a use-def chain (I've implemented this in #62547).
To explain how qualification actually works with a use-def chain, I'll use the `HasInteriorMut` qualifier as an example. Every definition of a local can be classified as unconditionally qualified (e.g. `_1 = Some(Cell::new(5))`), unconditionally not qualified (e.g. `_1 = None`), or conditionally qualified dependent on some set of definitions (e.g. `_1 = _2` is dependent on all definitions of `_2` that reach that point in the program). Qualification can then be expressed as a DFS over this use-def chain that iterates over dependent definitions and terminates when it finds an unconditionally qualified one.
A cyclic data dependency (e.g. `loop { x = x + 1 }`) will present as a dependency on a definition that already exists on the DFS stack. We can safely ignore such dependencies because they are tautological; they express that "`x = x + 1` is qualified" implies "`x = x + 1` is qualified". This approach extends to cycles of arbitrary length.
I believe that the approach described at the start of the post (I'll call it "copy-qualifs") is semantically equivalent to the one based on the use-def chain. Whereas the use-def chain explicitly stores the set of all definitions that could reach each use of a local, "copy-qualifs" stores only the union of the qualification bits of all those reaching definitions. Basically, the use-def chain approach is two separate steps: compute the data dependencies for each definition of a local then propagate qualifications between them, while the "copy-qualifs" approach coalesces this into a single operation, avoiding the need to actually store the data dependencies. The downside of not saving the intermediate result is that "copy-qualifs" must run a separate dataflow for each qualification type, while the use-def chain only needs to run one.
I want to keep driving this forward, but I would like some more input from wg-const-eval people first.
@oli-obk @RalfJung @eddyb
> A cyclic data dependency (e.g. `loop { x = x + 1 }`) will present as a dependency on a definition that already exists on the DFS stack. We can safely ignore such dependencies because they are tautological; they express that `x = x + 1` is qualified implies `x = x + 1` is qualified. This approach extends to cycles of arbitrary length.
I would prefer a fixpoint/saturation mechanism that dynamically checks the monotonicity here, instead of assuming it and risking some kind of subtle bug.
I'm working on a little library for doing memoized DFS with cycle-handling, coincidentally, maybe we should use that instead of implicit assumptions?
We (well, @eddyb, but I implemented it) found a way that doesn't require a dataflow analysis: if a local variable in a const/static item's initializer has no `StorageDead`, then that means that it is a temporary in the final expression of the initializer and thus follows the "escaping scope" rule and gets interned. Thus we are allowed to reference it (otherwise borrowck would fail anyway). This way, even if the value has interior mutability, as long as the local that gets borrowed has a `StorageDead`, we know it won't end up in the final value of the constant, and thus borrowing that interior-mutable local is totally fine.
For the implementation see https://github.com/rust-lang/rust/pull/80418
> If a local variable in a const/static item's initializer has no `StorageDead`, then that means that it is a temporary in the final expression of the initializer and thus follows the "escaping scope" rule and gets interned.
So far, the argument as I understood it was the other way around: if a local variable has a `StorageDead`, it is certainly not part of the final value, since it will be deallocated first. I find that argument much easier to believe than the one you phrased.
cc @eddyb @RalfJung
Let's figure out what exactly we need, because I'm very confused about it.
So, currently const qualification is doing a few things at once:

- It flags `Drop` or `UnsafeCell` types (`None` is always ok, even if `Some(value)` would not be due to the type, even if `Option<T>` would technically propagate said type).
- It checks the bodies of `const fn` and `const` for `Drop` and `UnsafeCell` value creations, even if it is ignored for the constant itself and only checked when the constant is used in a value checked for promotion.

Why I want to stop looking at bodies, and instead just check the final value of constants:
- `(UnsafeCell::new(42), 42).1` is treated as if you end up with an `UnsafeCell` value.
- A `const fn`'s body is not allowed to change, even for changes which would not change the final value for any input, because such a change might subtly lead to the analysis suddenly thinking there's an `UnsafeCell` in the final value.

Why we cannot just look at the final value right now:
Solution brainstorm:
- Don't promote a call to a `const fn` if its return type may contain `UnsafeCell` or `Drop`. So `Option::<String>` is not promoted, even if the actual value is `None`. (Not a breaking change, since there are no stable `const fn` for which this could break any code.)