eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

CRIU: Debugging restored images #14237

Open tajila opened 2 years ago

tajila commented 2 years ago

Users will need to be able to debug images restored with CRIU. There are two ways of handling this:

Dynamic enablement of debug capabilities: With this approach we will dynamically enable debugging capabilities for a restored image upon user request. This approach has the following challenges:

Static enablement of debug capabilities: With this approach the user requests debug capabilities prior to image build time. This means that we can start the JVM with debug capabilities built in. At restore time a debugger can be attached to the JVM process. This approach has the following drawbacks:

tajila commented 2 years ago

@gacholio @DanHeidinga any other thoughts on this?

DanHeidinga commented 2 years ago
  • we will need to fix up a bunch of method send targets (all non-large-stack ones) to use J9_BCLOOP_SEND_TARGET_ZEROING

We'll also need to switch from the regular stack mapper (which does forward flow analysis for live ranges) to the debug stack mapper (which computes liveness more conservatively so that it matches the debugger's expectations).

These two changes together may mean that "garbage" (non-zero) stack slots suddenly appear live. This may be fixable (without having looked at the code) if we do a stack walk and zero all O-slots identified by the debug stack mapper that aren't also identified by the regular stack mapper.

We can probably handle this as a global exclusive action when restoring the image in debug mode.
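
The slot-zeroing idea above can be sketched as a set difference per frame. This is only an illustrative model in Python, not OpenJ9 code: the `Frame` class, its fields, and `zero_newly_live_slots` are hypothetical names, and the two liveness sets stand in for whatever the regular and debug stack mappers would actually report during a stack walk under exclusive access.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical model of one stack frame (not an OpenJ9 structure)."""
    slots: list         # raw slot contents; a non-zero value may be stale garbage
    regular_live: set   # slot indices the regular stack mapper reports as live
    debug_live: set     # slot indices the debug stack mapper reports as live


def zero_newly_live_slots(frames):
    """Zero every slot that only the debug mapper treats as live.

    Such slots were never guaranteed to hold a valid value under the
    regular mapper, so they are cleared before the debug mapper is used.
    Returns the number of non-zero slots that were cleared.
    """
    cleared = 0
    for frame in frames:
        # Slots that become live only because of the mapper switch.
        for index in sorted(frame.debug_live - frame.regular_live):
            if frame.slots[index] != 0:
                frame.slots[index] = 0
                cleared += 1
    return cleared


# Usage: slot 1 holds garbage that only the debug mapper considers live.
frame = Frame(
    slots=[0x1000, 0xDEAD, 0, 0x2000],
    regular_live={0, 3},
    debug_live={0, 1, 2, 3},
)
cleared = zero_newly_live_slots([frame])
```

In this model the slots that both mappers agree on are left untouched, which matches the intent of only clearing values that "suddenly appear live".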

DanHeidinga commented 2 years ago

@tajila This looks like a good starting point on the different approaches. To really flesh out all the details though, we'll need to make a table that shows the different actions taken in the debug scenario vs the non-debug scenarios so we can figure out what the differences are.

If we can also add the JVMTI capabilities supported by each of the different rows in the table, that would be an ideal data source.

Having it clearly specified that way may help us determine the set of capabilities to support in debug mode on restore (if we feel we don't need to support all of them), and may expose opportunities for other debug improvements (i.e., late enablement of capabilities).
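
One possible shape for such a table, sketched in Python purely for illustration. The row contents are taken from the comments in this thread; the capability names are real JVMTI capability fields, but which capability belongs to which row is a placeholder guess, not an analysis of the VM. `Row`, `TABLE`, `rows_that_differ`, and `actions_for` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One action, how it is handled in each mode, and related capabilities."""
    action: str
    non_debug: str
    debug: str
    capabilities: frozenset = frozenset()  # placeholder mapping, see lead-in


TABLE = [
    Row("method send targets",
        "regular send targets",
        "J9_BCLOOP_SEND_TARGET_ZEROING"),
    Row("stack mapper",
        "regular (forward flow analysis)",
        "debug (conservative liveness)",
        frozenset({"can_access_local_variables"})),      # assumed association
    Row("JIT code",
        "normal code generation",
        "FSD code generation",
        frozenset({"can_generate_breakpoint_events"})),  # assumed association
]


def rows_that_differ(table):
    """Return the rows whose handling changes between the two modes."""
    return [row for row in table if row.non_debug != row.debug]


def actions_for(table, requested):
    """Return the actions tied to at least one requested capability."""
    wanted = set(requested)
    return [row.action for row in table if row.capabilities & wanted]
```

With the capability column filled in from a real analysis, a query like `actions_for(TABLE, {"can_access_local_variables"})` would show which restore-time actions a reduced capability set still requires.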

gacholio commented 2 years ago

I doubt this will work without significant effort. Also, any JIT code will need to be discarded, because debug mode (FSD) code generation is completely different from normal code generation.