Open gacholio opened 2 years ago
The basic relocation will be performed by generalizing the code which handles stack growing to separate the allocation and relocation of the stack.
Stack-allocated objects are walked by the stack walker using JIT metadata which details where each SA object begins in the stack frame. The walker then uses the GC object iterator to walk the individual slots of the SA object. None of the stack slots of SA objects are marked as object slots for the walker.
For non-compressed refs, the existing relocation code will work unmodified.
For compressed refs, object slots in SA objects are only 32 bits wide, so they cannot simply be relocated. With compressed refs shift, every bit in the slot may be used, so a slot tagging solution is not appropriate.
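To illustrate why no tag bit is free, here is a minimal sketch assuming 8-byte object alignment and a shift of 3. The helper names `compress_ref` and `decompress_ref` are hypothetical and are not OpenJ9 functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not the OpenJ9 API): compress an aligned address. */
static uint32_t compress_ref(uint64_t address, unsigned shift)
{
    /* Assumption: object alignment keeps the low `shift` bits zero. */
    assert((address & ((UINT64_C(1) << shift) - 1)) == 0);
    /* The shifted value must fit in the 32-bit slot. */
    assert((address >> shift) <= UINT32_MAX);
    return (uint32_t)(address >> shift);
}

/* Hypothetical helper: recover the full address from a 32-bit slot. */
static uint64_t decompress_ref(uint32_t token, unsigned shift)
{
    return (uint64_t)token << shift;
}
```

With a shift of 3, the highest aligned address below 32GB compresses to 0xFFFFFFFF, so all 32 bits of the slot carry address information and none can be reserved as a tag.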
During the stack relocation, I suggest we convert the compressed object pointer slots to uncompressed stack offsets (which will always fit in 32 bits). walkContinuationStackFrames will need to pass a new flag into the stack walker instructing it that the SA slots are offsets.
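The conversion could look roughly like the sketch below. It assumes offsets are measured in bytes from the low end of the stack memory. The function names and the choice of base are illustrative assumptions, not the code in the branch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: turn a compressed slot that points into the stack mounted
 * at old_base into a byte offset from old_base (done at unmount). */
static uint32_t sa_slot_to_offset(uint32_t slot, unsigned shift,
                                  uint64_t old_base, uint64_t stack_bytes)
{
    uint64_t address = (uint64_t)slot << shift;
    assert(stack_bytes <= UINT32_MAX);   /* offsets must fit in 32 bits */
    assert(address >= old_base && address - old_base < stack_bytes);
    return (uint32_t)(address - old_base);
}

/* Hypothetical: rebuild the compressed slot once the stack has been copied
 * back to low memory at new_base (done at remount). */
static uint32_t sa_offset_to_slot(uint32_t offset, unsigned shift,
                                  uint64_t new_base)
{
    uint64_t address = new_base + offset;
    assert((address >> shift) <= UINT32_MAX);
    return (uint32_t)(address >> shift);
}
```

Because the offset is relative to the stack itself, it stays valid no matter where the unmounted copy lives, including above the 32-bit addressable range.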
There is an issue with the stack offsets - how to distinguish between slots which contain offsets (used to point to an SA object) and slots which contain heap object pointers. With no tag bits available, we may need to keep a separate bitmap to indicate which slots contain offsets.
The bitmap will be placed after the end of the copied stack (in the same allocation). As an optimization, the copied stack and bitmap should exclude the unused portion of the stack.
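A rough sketch of that layout follows, assuming one bitmap bit per 32-bit slot of the used stack region and plain `malloc` standing in for whatever allocator is really used. All names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_BYTES 4u /* assumption: one bit per 32-bit compressed slot */

/* Bytes of bitmap needed to cover used_bytes of stack. */
static size_t bitmap_bytes(size_t used_bytes)
{
    size_t slots = used_bytes / SLOT_BYTES;
    return (slots + 7) / 8;
}

/* Copy only the used portion of the stack and append a zeroed bitmap. */
static uint8_t *copy_used_stack(const uint8_t *used_start, size_t used_bytes)
{
    size_t map_bytes = bitmap_bytes(used_bytes);
    uint8_t *copy = malloc(used_bytes + map_bytes);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, used_start, used_bytes);
    memset(copy + used_bytes, 0, map_bytes); /* only the bitmap is zeroed */
    return copy;
}

/* Record that the slot at slot_index now holds a stack offset. */
static void mark_offset_slot(uint8_t *copy, size_t used_bytes, size_t slot_index)
{
    assert(slot_index < used_bytes / SLOT_BYTES);
    copy[used_bytes + slot_index / 8] |= (uint8_t)(1u << (slot_index % 8));
}

/* Nonzero if the slot holds a stack offset, zero if it holds a heap reference. */
static int is_offset_slot(const uint8_t *copy, size_t used_bytes, size_t slot_index)
{
    assert(slot_index < used_bytes / SLOT_BYTES);
    return (copy[used_bytes + slot_index / 8] >> (slot_index % 8)) & 1;
}
```

Keeping the bitmap in the same allocation means a single free releases both, and the walker can find the bitmap from the copy's start and used size alone.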
Continuing on with the code, I notice the stack grower relocates arraylet leaves found in stack-allocated objects. I'm not sure yet that the offset strategy will work with these.
@0xdaryl Are there ever in fact stack-allocated arrays in arraylet GC policies?
> Are there ever in fact stack-allocated arrays in arraylet GC policies?
There might be. It isn't clear to me from a simple inspection of the code. x86, for example, sets a codegen flag [1] to permit stack allocation of arraylets and this is checked in the EA candidate-finding loop so there may be cases where contiguous arrays are stack allocated. Large, discontiguous arraylets are never stack allocated, however.
@hzongaro : can you fill in any more details here?
So-called contiguous arrays do not contain a spine (where the arraylet pointers are). You say above that large discontiguous arrays aren't stack allocated - what is considered large? The easiest thing to do here would be to simply disallow stack allocation of arrays that require a spine, regardless of the total size.
My concern is that j9mm_iterate_object_slots (used to walk the slots of stack-allocated objects) will not be able to walk the spine of an array that has been relocated to the high memory area.
Yes, Escape Analysis will stack allocate contiguous arrays under the GC policies that allow for arraylets. It relies on J9::Compilation::canAllocateInline to make that determination. That method imposes a limit on the number of elements of 0xFFFFF, but it also checks whether the total size of the array would result in its being discontiguous.
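The two checks described above might be pictured as follows. This is only an illustration of the logic, not the actual body of J9::Compilation::canAllocateInline: the function name, the parameters, and the treatment of 0xFFFFF as an inclusive maximum are all assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_INLINE_ELEMENTS UINT64_C(0xFFFFF) /* element limit quoted above */

/* Hypothetical check: returns 1 if the array is small enough to stay
 * contiguous (no spine), 0 otherwise. max_contiguous_bytes is a stand-in
 * for whatever size threshold the GC policy imposes. */
static int may_stack_allocate_array(uint64_t num_elements, uint64_t element_bytes,
                                    uint64_t header_bytes,
                                    uint64_t max_contiguous_bytes)
{
    /* Java array elements are between 1 and 8 bytes wide. */
    assert(element_bytes >= 1 && element_bytes <= 8);
    if (num_elements > MAX_INLINE_ELEMENTS) {
        return 0; /* too many elements */
    }
    if (header_bytes > max_contiguous_bytes) {
        return 0; /* header alone exceeds the threshold */
    }
    /* The product cannot overflow: at most 0xFFFFF * 8. */
    return num_elements * element_bytes <= max_contiguous_bytes - header_bytes;
}
```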
Thanks - this means I can remove the walking of the spine completely from the stack grow code since it can never occur and not worry about it for this design.
@fengxue-IS Here's the first cut of the code: https://github.com/gacholio/openj9/tree/loom
A few things to note:
In JDK19 you've added a field to the stack header. You could change this to a bit field and store a flag in there indicating that the stack is for an unmounted continuation. This would simplify some of the code paths.
Stacks allocated for unmounted continuations must currently be freed using the port library directly. If the flag above is added, this would no longer be true.
I'm zeroing the stack for an unmounted continuation when it's allocated - really only the trailing bitmap needs to be zeroed.
I added a new flag for the stack walker, which would also not be needed if the stack were tagged with the header flag suggested above.
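The header flag suggested in these notes could be sketched like this. The struct layout, flag name, and helper functions are all hypothetical and only show how one bit in the header could drive both the free path and the walker's interpretation of SA slots.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stack header: field names are illustrative only. */
typedef struct StackHeader {
    uint64_t size;
    uint32_t flags; /* bit field in place of a single-purpose field */
} StackHeader;

#define STACK_FLAG_UNMOUNTED_CONTINUATION UINT32_C(0x1)

typedef enum {
    FREE_WITH_VM_STACK_ALLOCATOR, /* normal thread stacks */
    FREE_WITH_PORT_LIBRARY        /* copies made for unmounted continuations */
} FreePath;

static void mark_unmounted(StackHeader *header)
{
    header->flags |= STACK_FLAG_UNMOUNTED_CONTINUATION;
}

static int is_unmounted(const StackHeader *header)
{
    return (header->flags & STACK_FLAG_UNMOUNTED_CONTINUATION) != 0;
}

/* A common free routine could pick the right path from the header alone. */
static FreePath choose_free_path(const StackHeader *header)
{
    return is_unmounted(header) ? FREE_WITH_PORT_LIBRARY
                                : FREE_WITH_VM_STACK_ALLOCATOR;
}

/* The walker could consult the header instead of taking a separate flag. */
static int sa_slots_are_offsets(const StackHeader *header)
{
    return is_unmounted(header);
}
```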
If running compressed references, when a continuation stack is unmounted, it should be copied into new memory allocated outside the <4GB area. When remounted, the stack should be copied back to the low memory area.
The complication is stack-allocated objects which point to other stack-allocated objects. Because the reference slots in the objects are only 32 bits wide, they can't simply be relocated.