Open jakobbotsch opened 1 year ago
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak See info in area-owners.md if you want to be subscribed.
Author: | jakobbotsch |
---|---|
Assignees: | - |
Labels: | `area-CodeGen-coreclr`, `untriaged` |
Milestone: | - |
I have a branch where I started doing the work: https://github.com/dotnet/runtime/compare/main...jakobbotsch:runtime:type-addresses-of-locals-unmanaged?expand=1
Likely this would be easier when GT_ADDR
is removed.
cc @dotnet/jit-contrib
However, after speaking to @jkotas, we believe that implicit byrefs cannot actually point to heap today.
Is there a way to validate this believe? E.g., create an instrumented JIT that checks the addresses of all implicit byrefs and run our test suite?
Is there a way to validate this believe? E.g., create an instrumented JIT that checks the addresses of all implicit byrefs and run our test suite?
Yeah, seems like that wouldn't be too hard to do. It would probably be a good idea to validate it before committing too much to this.
If it isn't the case we are probably skating on thin ice, however, since Roslyn allows taking the address of parameters as unmanaged pointers. Not sure how this was reconciled in the past (maybe things were always pinned but write barriers were still needed for managed types, and TYP_BYREF
is necessary for this?)
I have looked at the history. We had number of issues around this in the original x64 port that were fixed later. I see the last set of fixes in 2009. If this text was written before 2009, it would explain where it came from.
Somewhat related is that can get some odd-typed trees today when inlining is involved:
STMT00051 ( 0x000[E-] ... ??? ) <- INL02 @ 0x000[E-] <- INLRT @ 0x0A1[E-]
[000177] -A--------- ▌ ASG long
[000176] D------N--- ├──▌ LCL_VAR long V10 tmp1
[000175] ----------- └──▌ SUB byref
[000173] ----------- ├──▌ FIELD_ADDR byref SequentialLayoutMinPacking`1[ushort]:_byte
[000174] ----------- │ └──▌ LCL_VAR_ADDR byref V04 loc1
[000172] ----------- └──▌ LCL_VAR_ADDR long V04 loc1
This is produced from https://github.com/dotnet/runtime/blob/e579ccb60acb044453f2cf1907482b165658488e/src/tests/Interop/StructPacking/StructPacking.cs#L1665-L1668
as part of inlining. The fact that the result of the SUB
is TYP_BYREF
seems moderately dangerous. As part of retyping local addresses as TYP_I_IMPL
we might need to investigate retyping inline arguments that are directly substituted in some cases.
I probably won't have time to do the full cleanup in 8.0, but we can at least get rid of write barriers for these cases.
I think this is ok because of the specifics about implicit byrefs (and current optimizations), but retyping/converting GC pointers is a particularly tricky area.
There are no conversions involved here -- the implicit byrefs passed are all already unmanaged pointers. If they weren't then we had very easily exposable GC holes because C# allows taking the address of an argument as an unmanaged pointer, and the JIT doesn't do anything defensively in this situation.
- In the linked abi, these is mention of the disabled optimization. Is disabling it an intended-to-be-permanent design choice, or a temporary mitigation of a problem? It's theoretically nice to be able to skip the copy into one of these temporary locations but probably limited in practice due to restrictions on ordering, introducing writes, etc. (also not sure why the optimization could only happen on return values and not incoming ones?)
Well the document is a bit confusing on the topic: the "disabled optimization" discussed seems to be more related to return buffers for reflection invoke. Even if that particular optimization was enabled the JIT would not be able to make use of it unless we also decided to disable our own optimization that passes GC pointers directly as the return buffer (see e.g. https://godbolt.org/z/P7nT17saE).
I don't think there is any problem here -- it's a trade-off. If we want to only have unmanaged ret buffer pointers then we can skip some GC reporting and write barriers, but we would not be able to do the optimization we do above (and reflection invoke would need to change). Whether or not this is the right trade-off I don't know.
also not sure why the optimization could only happen on return values and not incoming ones?
We would need to know whether the callee mutates the parameter to be able to avoid a defensive copy. And of course we would need to find a way to deal with the function taking the address of such a parameter. So it seems considerably harder than for the ret-buffer case.
I definitely agree that the idea here sounds fine - just wanted to get some notes in here because it's easy to go a little bit further and get into trouble.
C# allows taking the address of an argument as an unmanaged pointer
That's certainly problematic for those advanced kinds of optimizations. I guess the copy could be on the callee side, but...
Many of these are likely interprocedural analyses (so either AOT or more sophisticated JIT analysis, probably combined with tiering) so then you can just check these various things.
The JIT today represents local addresses as
TYP_BYREF
, in particular for implicit byrefs. However, after speaking to @jkotas, we believe that implicit byrefs cannot actually point to heap today. Previously this may have been the case for remoting/reflection. We should update the JIT to uniformly treat all local addresses asTYP_I_IMPL
and to type implicit byrefs asTYP_I_IMPL
, which should remove some GC reporting and write barriers we emit today.We should also be able to get rid of
impBashVarAddrsToI
and various places that we retype local addresses asTYP_I_IMPL
.Return buffers should stay
TYP_BYREF
typed as the JIT itself will pass heap pointers as return buffers sometimes.The following section in clr-abi should also be updated: https://github.com/dotnet/runtime/blob/3a3692029a2a038581fdc8c1c6496e0506a7e0c7/docs/design/coreclr/botr/clr-abi.md#amd64-only-by-value-value-types