eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.28k stars 721 forks source link

[Discussion] Redesign StackWalker implementation on Continuation #16209

Open fengxue-IS opened 1 year ago

fengxue-IS commented 1 year ago

From discussions in #15181 and #15779 Currently VM's native stackwalker (swalk.c) uses stack allocated J9VMThread and J9VMEntryLocalStorage to hold data from J9VMContinuation when initializing a stackwalk. Stackwalker internally uses the vmthread and ELS to setup J9StackWalkState which holds all the necessary fields during walk.

In DDR, it is not possible to "stack allocate" structures so #15779 added new overloaded walkStackFrames entry function to initialize the walkstate values from a continuation and an interface to provide the data of ELS from either the thread's ELS or continuation directly.

The issue with existing design is that there is a misrepresentation of Continuation in DDR / natie as a vmthread/ELS structure and inconsistency between how DDR / native initializes walkState.


@keithc-ca Proposed a new approach which is to expand J9StackWalkState to directly hold all necessary fields during initialization instead of needing to access fields through the vmthread/ELS passed in, and avoid the need to stack allocate placeholder structures.

This would allow DDR to have a consistent design to the native impl and remove needs of allocating addition structures before stackwalk.

With a quick scan through the stackwalk code, the following fields need to be added to J9StackWalkState (may have missed some)

struct J9StackWalkState {
    ...
    j9object_t threadObject;
    UDATA privateFlags;
    J9JavaStack* stackObject;
    J9StackWalkState* stackWalkState;
    UDATA* jitGlobalStorageBase;
    UDATA* jitFPRegisterStorageBase;
    J9VMEntryLocalStorage* oldEntryLocalStorage;
}

@gacholio @0xdaryl @tajila for opinions

gacholio commented 1 year ago

I like this idea, though I'm not sure it will actually work given our dependence on the thread anchor. I would like to exploit this in ASGCT as well to avoid the need to modify the underlying J9VMThread.

If we eliminate the fake thread/ELS, the only possible value for thread in the walk is NULL - I'm hoping that APIs that use thread-local caches can be updated to handle a NULL thread and just avoid the cache entirely (this improves another awful ASGCT hack). Note we already alllow the current thread to be NULL and rely on the walk thread instead (which can now be NULL).

threadObject can not be copied here in general (may work in the specific cases). A better solution would be to query the thread for whatever attributes are needed and store them in the walk state instead of the object (which will be missed by any GC that occurs during the walk).

I'll give some more thought to handling ELS - we can't just store the old ELS once (there is a chain of ELS of unknowable size).

gacholio commented 1 year ago

What I would like to see is:

walkStackFrames() {
  initializeValuesFromJ9VMThread();
  commomWalker();
}

walkContinuationStackFrames() {
  intializeValuesFromContinuation();
  commonWalker();
}

This lets the existing APIs continue to work unmodified, and lets me use the J9VMThread initializer in ASGCT (I need to copy the J9VMThread values then modify them before walking the stack).

gacholio commented 1 year ago

We'll need i2jState and j2iFrame - basically anything in the continuation that came from the thread needs to be added (with the exception of the stack overflow marks).

gacholio commented 1 year ago

Actually, we already have the above fields.

I was concerned about unwinding ELS, but that's not an issue if we simply cache all of the relevant ELS values when we unwind (meaning we can get rid of the walkedEntryLocalStorage field).

So, I believe this will work assuming we can break the dependencies on having a non-NULL J9VMThread pointer.

jitGetExceptionTableFromPC will need to be updated to handle a NULL thread and bypass the cache.

thallium commented 1 year ago

meaning we can get rid of the walkedEntryLocalStorage field

here walkedEntryLocalStorage is being null-checked, how should this line be adopted if we want to get rid of the walkedEntryLocalStorage field? EDIT: also referenced here

gacholio commented 1 year ago

For the first case, I think if we NULL the jitGlobalStorageBase (and I suppose all other fields being copied from ELS) once we transition to the NULL ELS then one of those fields can be NULL checked instead. This is essntially dead code anyway (and appears to be incorrect as well).

The second is an extension that may no longer be used by anyone (though I don't know this for sure). I don't have a good solution for this one, so perhaps we just don't delete the field then neither of these places need to be changed.

tajila commented 8 months ago

@fengxue-IS are there still plans to complete this?

fengxue-IS commented 8 months ago

@fengxue-IS are there still plans to complete this?

Yes as will help to ensure the code design between VM and DDR is identical, but I think this should be a low priority item.