Design and Implementation of Foreign Linker API: Upcall #15068
The upcall support of JEP389/412/419/424, which is built on top of our downcall implementation (please see https://github.com/eclipse-openj9/openj9/pull/12413 for details), lets native code inter-operate with the java code by invoking the target method handle (based on the OpenJDK MH implementation in OpenJ9) via the thunk/dispatcher calls in native.
The following typical example shows how an upcall works:
1) Java code for preparing the upcall stub and invoking the upcall MH
2) Native code in downcall
int add2IntsByUpcallMH(int intArg1, int intArg2, int (*upcallMH)(int, int))
{
    int intSum = (*upcallMH)(intArg1, intArg2); /* the thunk/dispatcher call into the interpreter */
    return intSum;
}
3) Java code for the target MH in upcall
public static final MethodHandle MH_add2Ints = MethodHandles.lookup().findStatic(Example.class,
        "add2Ints", MethodType.methodType(int.class, int.class, int.class));
public static int add2Ints(int intArg1, int intArg2) {
    int intSum = intArg1 + intArg2;
    return intSum;
}
where the steps of the upcall invocation are as follows:
[1] prepare the downcall method handle to invoke the native function in downcall.
[2] prepare the upcall stub by resolving the target MH and preprocessing the corresponding function descriptor so as to generate the thunk.
[3] invoke the downcall method handle by passing the arguments of the upcall method plus the native function symbol, which is a wrapper of the thunk address intended for upcall.
int result = (int)mh.invoke(111112, 111123, upcallFuncAddr);
In the example above, invoking the native function add2IntsByUpcallMH via ffi_call in downcall triggers a thunk/dispatcher call into the interpreter, with the target MH and the marshalled arguments placed on the java stack, so as to invoke the upcall method.
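The control flow of this example can be imitated in plain Java without the foreign linker API. The sketch below is only an analogy under stated assumptions: `nativeAdd2IntsByUpcall` is a hypothetical Java stand-in for the native function, and an `IntBinaryOperator` plays the role of the thunk address. No real thunk, `ffi_call` or interpreter transition is involved.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.IntBinaryOperator;

final class UpcallFlowSketch {
    // The target method of the upcall, as in the example above.
    public static int add2Ints(int intArg1, int intArg2) {
        return intArg1 + intArg2;
    }

    // Hypothetical stand-in for the native function add2IntsByUpcallMH:
    // it knows nothing about Java methods and only calls back through the "function pointer".
    static int nativeAdd2IntsByUpcall(int intArg1, int intArg2, IntBinaryOperator upcallFunc) {
        return upcallFunc.applyAsInt(intArg1, intArg2);
    }

    public static int run(int intArg1, int intArg2) {
        try {
            // Resolve the target method handle.
            MethodHandle target = MethodHandles.lookup().findStatic(
                    UpcallFlowSketch.class, "add2Ints",
                    MethodType.methodType(int.class, int.class, int.class));
            // Stand-in for the upcall stub: forwards the callback to the target MH.
            IntBinaryOperator stub = (x, y) -> {
                try {
                    return (int) target.invokeExact(x, y);
                } catch (Throwable t) {
                    throw new IllegalStateException(t);
                }
            };
            // Stand-in for invoking the downcall method handle with the stub as last argument.
            return nativeAdd2IntsByUpcall(intArg1, intArg2, stub);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(111112, 111123)); // prints 222235
    }
}
```

In the real implementation the callback crosses the native boundary twice (downcall into C, upcall back into the interpreter); here both hops are ordinary Java calls.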
High Level Design:
To achieve this, the design is categorized into four parts:
[1] creating the upcall stub
The metadata at the java level mainly consists of the target MH, the method type for upcall and a 2-element cache array containing MemberName (which specifies the upcall method) and appendix (specifically the method type). The target MH is resolved in native ahead of time, and the resolved data are cached back to the cache array in java prior to the upcall. When calling into the interpreter, the cached data are extracted from the metadata to obtain the upcall method and the method type required for invoking the target MH.
class UpcallMHMetaData {
    private MethodHandle calleeMH;
    private MethodType calleeType;
    private Object[] invokeCache; // memberName and appendix
    ...
}
The metadata is stored as a global reference in J9UpcallMetaData so that the target method handle and the cache array can be accessed during the target MH resolution and the thunk/dispatcher calls in native.
typedef struct J9UpcallMetaData {
    J9JavaVM *vm;
    jobject mhMetaData; /* A global JNI reference to the upcall handler plus the metaData for MH resolution */
    void *upCallCommonDispatcher; /* Which icallVMprJavaUpCall helper to be used in thunk */
    void *thunkAddress; /* The address of the thunk generated by JIT */
    UDATA thunkSize; /* The size of the generated thunk */
    J9UpcallNativeSignature *nativeFuncSignature; /* The native function signature extracted from FunctionDescriptor */
    UDATA functionPtr[3]; /* The address of the generated thunk on AIX or z/OS */
} J9UpcallMetaData;
With the passed-in target MH and the created metadata in java, an upcall stub (technically a wrapper of the thunk address) is created by generating a thunk uniquely associated with the corresponding upcall metadata (specifically UpcallMHMetaData). The metadata is stored in a hashtable in java, which avoids creating duplicate thunks for the same target method handle within the same resource scope (especially under multi-threading): existing thunks in the hashtable are reused as long as the surrounding resource scope is alive.
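The reuse of thunks can be sketched with a concurrent map. This is a minimal illustration, not the OpenJ9 code: the class name, the choice of the target `MethodHandle` as the key, and the fake addresses standing in for generated thunks are all assumptions made for the example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

final class ThunkCacheSketch {
    // Hypothetical stand-in for thunk generation: hands out fake, distinct addresses.
    private final AtomicLong nextFakeAddress = new AtomicLong(0x1000);
    private final AtomicLong generated = new AtomicLong();
    // One cache per resource scope (assumption), keyed on the target method handle.
    private final ConcurrentHashMap<MethodHandle, Long> thunks = new ConcurrentHashMap<>();

    // Returns the existing thunk for the handle, or generates one exactly once.
    // computeIfAbsent is atomic, so racing threads cannot create duplicates.
    public long getOrGenerateThunk(MethodHandle target) {
        return thunks.computeIfAbsent(target, mh -> {
            generated.incrementAndGet();
            return nextFakeAddress.getAndAdd(0x100);
        });
    }

    public long generatedCount() {
        return generated.get();
    }

    // Called when the surrounding resource scope is closed: the cached thunks are dropped.
    public void closeScope() {
        thunks.clear();
    }

    public static void main(String[] args) {
        ThunkCacheSketch cache = new ThunkCacheSketch();
        MethodHandle target = MethodHandles.identity(int.class);
        long first = cache.getOrGenerateThunk(target);
        long second = cache.getOrGenerateThunk(target);
        System.out.println(first == second);        // prints true
        System.out.println(cache.generatedCount()); // prints 1
    }
}
```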
[2] encoding the native signature specific to platforms (required prior to the thunk generation)
The passed-in argument types (obtained from the function descriptor) are handled differently across platforms, so the combination of types (especially within a struct) must be differentiated in native to help the thunk cope with the passed-in arguments correctly. To decouple the native code from the representation of the function descriptor (which keeps evolving in JEPs) and to simplify the handling process, the function descriptor is first preprocessed in java by converting it to a native signature string made up of simplified type symbols. Once this native signature string is extracted in native, each struct type is further converted to a 16-byte composition type array (mapping to the first 16 bytes of the struct), which is encoded as a struct-specific type recognized by the thunk. The thunk then uses the struct type to determine how to marshall the corresponding argument of the upcall method.
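The idea of the 16-byte composition array can be sketched as follows. Everything concrete here is an assumption made for illustration: the type symbols (`B`, `S`, `I`, `J`, `F`, `D`, `P`), the byte tags (`F`, `D`, `M` for anything else, `E` for padding or unused bytes), the natural-alignment layout, and the class names returned by `classify` are invented and do not reflect the actual OpenJ9 encoding.

```java
import java.util.Arrays;

final class StructSignatureSketch {
    static final int COMPOSITION_BYTES = 16;

    // Size in bytes of each hypothetical simplified type symbol.
    static int sizeOf(char symbol) {
        switch (symbol) {
            case 'B': return 1;
            case 'S': return 2;
            case 'I': case 'F': return 4;
            case 'J': case 'D': case 'P': return 8;
            default: throw new IllegalArgumentException("unknown type symbol: " + symbol);
        }
    }

    // Tags each of the first 16 bytes of a struct with the kind of field that covers it.
    public static String composition(String fieldSymbols) {
        char[] tags = new char[COMPOSITION_BYTES];
        Arrays.fill(tags, 'E');
        int offset = 0;
        for (char symbol : fieldSymbols.toCharArray()) {
            int size = sizeOf(symbol);
            offset = (offset + size - 1) / size * size; // natural alignment (assumption)
            char tag = (symbol == 'F' || symbol == 'D') ? symbol : 'M';
            for (int i = offset; i < offset + size && i < COMPOSITION_BYTES; i++) {
                tags[i] = tag;
            }
            offset += size;
        }
        return new String(tags);
    }

    // Collapses the composition array into one struct-specific type for the thunk.
    public static String classify(String composition) {
        boolean hasFloat = composition.indexOf('F') >= 0;
        boolean hasDouble = composition.indexOf('D') >= 0;
        boolean hasOther = composition.indexOf('M') >= 0;
        if (hasOther || (hasFloat && hasDouble)) return "MIXED";
        if (hasFloat) return "ALL_FLOAT";
        if (hasDouble) return "ALL_DOUBLE";
        return "EMPTY";
    }

    public static void main(String[] args) {
        String twoFloats = composition("FF");
        System.out.println(twoFloats + " -> " + classify(twoFloats));
        String intAndDouble = composition("ID");
        System.out.println(intAndDouble + " -> " + classify(intAndDouble));
    }
}
```

The point of the collapsed type is that the thunk can pick a marshalling strategy (for example floating-point registers versus general-purpose registers) without re-walking the struct layout on every call.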
[3] allocating memory to generate the thunk
Triggered from the native function in downcall, a thunk calls the specified dispatcher (determined by the return type during the thunk generation) with the marshalled arguments so as to invoke the target MH by calling into the interpreter. To do so, the thunk is generated in advance, when the upcall stub is created, from a heap allocated from a fixed page size of virtual memory. The generated thunk address plus the corresponding upcall metadata (specifically J9UpcallMetaData) is stored in a hashtable in native, which guarantees that the memory resources for the thunk and the upcall metadata are automatically released by OpenJDK when the resource scope is terminated.
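As a rough model of this bookkeeping, the sketch below sub-allocates thunk slots from one fixed-size page and records the metadata per thunk address. It is a simulation under assumptions: addresses are plain numbers rather than executable memory, the 8-byte alignment and the bump-pointer strategy are invented for the example, and releasing everything at once stands in for the scope-driven cleanup.

```java
import java.util.HashMap;
import java.util.Map;

final class ThunkHeapSketch {
    private final long base;     // pretend start address of the page
    private final int pageSize;  // fixed page size backing the heap
    private int used;            // bytes handed out so far
    // thunk address -> metadata; stand-in for the native hashtable of J9UpcallMetaData
    private final Map<Long, String> metadataByThunk = new HashMap<>();

    public ThunkHeapSketch(long base, int pageSize) {
        this.base = base;
        this.pageSize = pageSize;
    }

    // Reserves room for one thunk and registers its metadata; returns -1 if the page is full.
    public long allocateThunk(int thunkSize, String metadata) {
        if (thunkSize <= 0 || thunkSize > pageSize) {
            return -1;
        }
        int aligned = (thunkSize + 7) & ~7; // round up to 8 bytes (assumption)
        if (aligned > pageSize - used) {
            return -1;
        }
        long address = base + used;
        used += aligned;
        metadataByThunk.put(address, metadata);
        return address;
    }

    public String metadataFor(long thunkAddress) {
        return metadataByThunk.get(thunkAddress);
    }

    public int liveThunks() {
        return metadataByThunk.size();
    }

    // Models the end of the resource scope: thunks and metadata are released together.
    public void releaseAll() {
        metadataByThunk.clear();
        used = 0;
    }

    public static void main(String[] args) {
        ThunkHeapSketch heap = new ThunkHeapSketch(0x10000L, 4096);
        long thunk = heap.allocateThunk(20, "metadata for add2Ints");
        System.out.println(Long.toHexString(thunk) + " -> " + heap.metadataFor(thunk));
        heap.releaseAll();
        System.out.println(heap.liveThunks()); // prints 0
    }
}
```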
[4] dispatcher call to the target handle in the interpreter via the thunk
Serving as a bridge between the thunk and the interpreter, a dispatcher (determined by the return type of the upcall method) is mainly responsible for handling the passed-in arguments and the return value from the target MH as follows:
1) arrange all arguments (marshalled by the thunk) on the java stack according to their types before calling into the interpreter, in particular encapsulating a raw pointer/struct in the object required by the upcall method in java
2) convert the return value from the target MH/interpreter according to its type before returning to the native function in downcall where the thunk is triggered, in particular extracting the raw value (the only form accepted by the native function) from the wrapped pointer/struct object.
After calling into the interpreter, the transition code in the interpreter sets up the upcall method (specified by MemberName) and its appendix, placed on the java stack, so that they are ready for execution in the interpreter.
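The two responsibilities of a dispatcher can be sketched in Java. This is a simplified model under assumptions: arguments arrive as raw 64-bit slots, only `int`, `long`, `float` and `double` are handled (no pointer/struct wrapping), and a single `dispatch` method branches on the return type, whereas the design above selects a separate dispatcher per return type when the thunk is generated. The class and helper names are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class DispatcherSketch {
    public static int add2Ints(int a, int b) { return a + b; }
    public static double half(double v) { return v / 2.0; }

    // Step 1: arrange the raw slots as Java values according to the parameter types.
    static Object[] arrangeArguments(MethodType type, long[] rawSlots) {
        Object[] args = new Object[type.parameterCount()];
        for (int i = 0; i < args.length; i++) {
            args[i] = fromRaw(type.parameterType(i), rawSlots[i]);
        }
        return args;
    }

    static Object fromRaw(Class<?> type, long raw) {
        if (type == int.class) return (int) raw;
        if (type == long.class) return raw;
        if (type == float.class) return Float.intBitsToFloat((int) raw);
        if (type == double.class) return Double.longBitsToDouble(raw);
        throw new IllegalArgumentException("unsupported type: " + type);
    }

    // Step 2: convert the Java return value back to the raw form expected by native code.
    static long toRaw(Class<?> type, Object value) {
        if (type == void.class) return 0L;
        if (type == int.class) return (Integer) value;
        if (type == long.class) return (Long) value;
        if (type == float.class) return Float.floatToRawIntBits((Float) value) & 0xFFFFFFFFL;
        if (type == double.class) return Double.doubleToRawLongBits((Double) value);
        throw new IllegalArgumentException("unsupported type: " + type);
    }

    public static long dispatch(MethodHandle target, long[] rawSlots) {
        MethodType type = target.type();
        try {
            Object result = target.invokeWithArguments(arrangeArguments(type, rawSlots));
            return toRaw(type.returnType(), result);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Helper for the example: looks up one of the static methods of this class.
    public static MethodHandle find(String name, MethodType type) {
        try {
            return MethodHandles.lookup().findStatic(DispatcherSketch.class, name, type);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MethodHandle add = find("add2Ints", MethodType.methodType(int.class, int.class, int.class));
        System.out.println(dispatch(add, new long[] {111112, 111123})); // prints 222235
    }
}
```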
Code changes in implementation:
With the design explained above, the source/files intended for upcall mainly include (excluding most of the internal methods/functions and headers):
[1] the creation of the upcall stub along with the metadata in java & native
[2] the generation of the native signatures for thunk
[3] the memory allocation and generation of thunk
[4] the dispatcher & the transition code from the native to interpreter
Supported platforms (with the thunk generation code offered by JIT):
Note: The description above includes the latest update in java.base mentioned in https://openjdk.java.net/jeps/424 (Preview) in OpenJDK/Java19.
References:
[1] https://openjdk.java.net/jeps/389
[2] https://openjdk.java.net/jeps/412
[3] https://openjdk.java.net/jeps/419
[4] https://openjdk.java.net/jeps/424
[5] https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md