Open mshinwell opened 4 weeks ago
I'm not sure it causes any issues currently, but I want to clarify what the region is for:
Not pushing %int_of_pointer past an end-region is actually a proxy for the real property we are trying to ensure, which is that the GC can never see the result of this primitive
The aim is to ensure that the GC can never see the result of this primitive after the end of the region. It is safe for the GC to see the result before then, and indeed I would expect it to do so in a lot of cases where we pass the value into some other function that takes a local argument. It is also safe for the GC not to look at the result at all -- so having it be Addr
should be safe -- but it is only unsafe for it to look after the region.
Note that this means you can get code like:
type t = { res : int }
let id x = x
[@@inline never]
let foo x =
let y = int_as_pointer x in
let z = id y in
z.res
let bar arg =
let a = malloc arg in
let res = foo a in
free a;
res
where it is important that the GC does not look at z
after the call to free
. This probably means that the Addr
is not sufficient to prevent problematic optimizations (which I also think don't currently exist). Perhaps something that instead marks the Begin_region
and End_region
constructs as "ghost" would be safer?
Ah yes, indeed these cases with free
won't be caught by the Addr
thing (not that this invalidates the remainder of this patch I don't think).
I have wondered about keeping the regions right until the end of Cfg, at which point they could be deleted. However this requires changing the Mach/Cfg operations to hold onto the region registers somehow (and also to add an int_as_pointer Cmm primitive etc). It seems like that could maybe affect code generation, though I would have to discuss with Greta and Xavier and think about this further.
@chambart suggested to use other variables with kind Region (I think it's also what Leo suggested above). Every function would have two region parameters: one for stack allocations, and one for scoping only. When translating from lambda, we would insert Begin_region
and End_region
twice for each lambda region, with a flag on one of them marking it as virtual. Allocations would point to the current non-virtual region, Int_as_pointer
to the current virtual region, function applications to both.
We would then be able to track both regions independently, and the virtual region instructions would be compiled away somewhere in the backend.
See #2682
This PR aims to correct bad code generation when
%int_as_pointer
is used at local mode, in a context where the corresponding region is not required for any actual stack allocation. For example in the following (which can be compiled with the boot compiler using-nostdlib -nopervasives
):we would like it to be the case that the code for
set
is the same as fornormal_set
except for one subtraction (corresponding to%int_as_pointer
). This is the case after this PR. At the moment it is not the case, with the generated code loading and storing the local stack pointer.The
%int_as_pointer
primitive should be (and is) treated in the main part of flambda2 (everything beforeTo_cmm
) a bit like a zero-sized stack allocation. The aim is to ensure that the primitive is not pushed past the end-region, just like any other local allocation. (At present I think this is really enforced by the fact thatEnd_region
is a hard barrier inTo_cmm
, but with a more advanced effect analysis in that pass in the future, the region variable named by theInt_as_pointer
primitive will ensure this.)Not pushing
%int_of_pointer
past an end-region is actually a proxy for the real property we are trying to ensure, which is that the GC can never see the result of this primitive (as it is untagged and could cause misbehaviour, especially with a no-naked-pointers compiler or runtime5).The idea of this patch, then, is to delete "unused" regions during
To_cmm
. The definition of "unused" here is "not needed to do any stack allocation", so in particular, a region only held onto viaInt_as_pointer
is to be deleted. Now it seems like there might a risk of code motion transformations (now or in the future) in the backend needing the region in order to ensure soundness. Indeed, there is arguably a pre-existing problem here, becauseInt_as_pointer
is expanded into Cmm code that does not name any region variable duringCmm_helpers
. This patch aims to adopt a straightforward approach which I hope is going to work when tested on a large scale: make that Cmm code have return machtypeAddr
. This in theory would allow the backend to move the primitive past anEnd_region
, although I don't think there is currently any code in the compiler which would do this, but it does ensure (at compile time) the above property relating to the GC. This is the critical part.The patch itself is fairly easy. There are probably three PRs here to be split out:
Cmm_helpers.int_as_pointer
return something of machtypeAddr
.To_cmm
traverse the flambda2 terms in dominator order, so something straightforward can be done in point 3 below. I think this is probably a good change anyway. Some more thinking needs doing about the handling of loops (especially in conjunction with point 3. below), I just haven't had time yet.To_cmm_result
which regions have been seen to be used, and don't emit Cmm code forEnd_region
primitives if the corresponding region is unused. It's a bit unfortunate that this seems to involve passing theSimple.t
for the argument to the unary primitive simplification function, but this does seem to be much better than trying to do this based on the Cmm variables (asIdent.t
). I tried that latter way and it got complicated very quickly.I think the
%int_of_pointer
test at local mode in the testsuite may not really be testing the right thing, but I will discuss with @riaqn .