Closed nikomatsakis closed 6 years ago
We should have the right sort of data in flow_inits
, and I don't see why can't we have liveness fragment-by-fragment: structs with destructors can't be partially initialized anyway - see nikomatsakis/nll-rfc#42.
Let me leave some notes on how things work now. I've been digging about in the code a bit to get a better understanding of what's going to be required here. The NLL code approaches things just a hair differently from the existing borrowck data-flow operator, but I think the two should match up well enough.
First, a bit of background. In order to track which paths are moved etc, we currently construct up a tree of paths. Each such path is called a MovePath
, and they are indexed by the MoveData
struct. Typically though we don't pass around a MovePath
directly, [but rather a MovePathIndex
.
The move paths are arranged in a tree, so e.g. if you have a move path a
, as well as a.b
and a.c
, you can walk from a
to a.b
and a.c
-- basically, each move data has a link to its child (first_child
) and then that child has links to its siblings (next_sibling
). The helper has_any_child_of
, for example, shows how to iterate through all the descendants of a node using these fields. (We'll need a similar helper, I think, but implemented on FlowInProgress<MaybeInitializedLvals<'b, 'gcx, 'tcx>>
instead of FlowInProgress<MaybeUninitializedLvals<'b, 'gcx, 'tcx>>
-- we should be able to share this code.)
It is instructive to look at where has_any_child_of
is used today, since it is doing a similar thing. The check_if_path_is_moved
function is invoked when there is an access to a path like a.b.c
-- it checks whether that path is accessible or not. So, for example, it would be illegal to do some_function(a.b.c)
if a.b.c.d
has been moved, since a.b.c
is not a "complete" value (it's also illegal to do some_function(a.b.c)
if a
or b
has been moved).
In order to detect the case where a.b.c
or some subpath of has been moved, the code works in two steps:
First, it looks for a move-path for a.b.c
, on line 738. There may be no such path, if the function only moves a
or a.b
(but never directly accesses the field a.b.c
). If there is such a path, then on line 739 it looks for all maybe-uninitialized children of that path using the has_any_child_of
function we saw earlier.
First, we are going to have to do a bit of surgery to the current control-flow, I think. Right now, the borrowck begins by invoking gather_moves
as the first step, and then invokes the compute_regions
call later on. This is actually a mistake, I believe, and we should be invoking compute_regions
earlier -- we don't want to be doing the move analysis on the original MIR and then editing it, since the types and things will be changing, which could mess up some of the hashing etc.
Probably, also, it would be useful to make borrow_check
into a directory, move the existing code into borrow_check/mod.rs
, and move nll.rs
from transform::nll
into borrow_check::nll
. That might be a good first PR.
Next, I think what we really want is to split up the compute_regions
code into two steps: the "renumbering", which is what mutates the MIR, and then everything else. That way, the overall flow of borrowck can be:
compute_regions
, which will need access to the maybe-initialized bits
to handle the drop case is something similar. Basically, let's say that liveness shows that some variable x
is "drop-live" at some point P. Then we would want to find the move-path-index for x
-- note that we always have a move-path-index for variables, so we can just do `move_data.move_path_index(lcompute_regions
OK, ran out of time, will write more later!
I'm taking this one :smile:
@Nashenas88 wound up doing this, but it's done. There's follow-up work to be done, I'll file a separate issue for that.
As part of https://github.com/rust-lang/rust/pull/45538, we are incorporating knowledge of which variables are live (i.e., may be used later). We also distinguish "drop-live", which means that they may be dropped later.
However, in MIR, we sometimes generate drops that we can see statically will be a no-op. For example, consider this program:
The problem is that
p
is considered "drop-live" at the point wherev += 1
, because there is a DROP in the MIR, even though we know thatp
must have been moved (and hence will not be dropped) at that point.One way to account for this is to do a maybe initialized analysis. This is a forwards analysis that tells you, at any given point, whether a given value could possibly be initialized that this point. The idea then would be to change the code which invokes the
add_drop_live_constraint
function to take the "maybe initialized" state into account. In particular, if the local variable that is being dropped cannot be initialized (i.e., maybe initialized is false), then we can ignore the fact that it is drop live and just not invokeadd_drop_live_constraint
with its type.We are already computing a "maybe initialized" in the borrow checker (the variable
flow_inits
). However, this analysis works a bit differently than the NLL code. For example, it is computed at the level of fragments, which are paths likea.b.c
. This is done because it is possible thata
was initialized, but some subpart of it (e.g.,a.b.c
) was moved, while the rest remains intact.Honestly, all the data structures are a bit out of cache for me, but the high-level idea then is that, when we know that a variable
X
is drop-live, we want to iterate over all of the subpaths ofa
that may be initialized (which the existing data structures should be able to tell us) and then invoke the existingadd_drop_live_constraint
function on the types of those fragments.cc @arielb1 @pnkfelix -- any notes you care to leave on how to find the maybe initialized fragments given a
mir::Local
would be appreciated. Else I will dig into the data structures at some point.