Open Spencer-Comin opened 1 year ago
The current instruction stream generated for interface dispatch looks like
n6n ( 0) ResolveAndNULLCHK on n40n [#32] [ 0x3ff2c404800] bci=[-1,3,19] rc=0 vc=169 vn=- li=2 udi=- nc=1
n5n ( 1) icalli InterfaceTest$X.foo()I[#398 unresolved interface Method] (Interface class) [flags 0x400 0x0 ] (in GPR_0035) [ 0x3ff2c4047b0] bci=[-1,3,19] rc=1 vc=169 vn=- li=2 udi=- nc=2
n4n ( 0) aloadi <vft-symbol>[#323 Shadow] [flags 0x18607 0x0 ] (in GPR_0033) [ 0x3ff2c404760] bci=[-1,3,19] rc=0 vc=169 vn=- li=2 udi=- nc=1
n40n ( 0) ==>aRegLoad
n40n ( 0) ==>aRegLoad
------------------------------
LLZRGF GPR_0033, Shadow[<vft-symbol>] 0(&GPR_0032) ; Throws Implicit Null Pointer Exception
LARL GPR_0039, 0x000003FF2C543A40 <--- load interface call cache data snippet address
LARL GPR_0040,0x0000000000000000
assocreg
Label L0036:
CL GPR_0033,#400 80(GPR_0039) <--- compare with class pointer from snippet
LG GPR_0073,#401 88(GPR_0039) <--- load method pointer from snippet
BCR BHR(mask=0x8), GPR_0073
CL GPR_0033,#402 96(GPR_0039) <--- compare with class pointer from snippet
LG GPR_0073,#403 104(GPR_0039) <--- load method pointer from snippet
BCR BHR(mask=0x8), GPR_0073
CL GPR_0033,#404 112(GPR_0039) <--- compare with class pointer from snippet
LG GPR_0073,#405 120(GPR_0039) <--- load method pointer from snippet
BCR BHR(mask=0x8), GPR_0073
CL GPR_0033,#406 128(GPR_0039) <--- compare with class pointer from snippet
LG GPR_0073,#407 136(GPR_0039) <--- load method pointer from snippet
BCR BHR(mask=0x8), GPR_0073
CL GPR_0033,#408 144(GPR_0039) <--- compare with class pointer from snippet
LG GPR_0073,#409 152(GPR_0039) <--- load method pointer from snippet
BCR BHR(mask=0x8), GPR_0073
LARL GPR_0039, 0x000003FF2C5439C0
BCR BER(mask=0xf), GPR_0039
.dd Snippet Label L0035
NOP
assocreg
Label L0037:
The repeated checks against cached class pointers that look like
CL <class pointer reg>, <cached classpointer offset>(GPR_0039)
LG GPR_X, <cached method pointer offset>(GPR_0039)
BCR <equal mask>, GPR_X
could use immediate instructions such as
CGFI <class pointer reg>, <cached class pointer constant>
BRCL <equal mask>, <cached method pointer constant>
This would remove the need for loading from the interface call cache data snippet. This would also remove the need for the class pointer/method pointer slots in the snippet.
It appears this functionality is already in the codebase, however it is guarded by a useCLFIandBRCL
flag [1] that is always set to false [2][3].
[1] https://github.com/eclipse-openj9/openj9/blob/e8d07cf4374c9eb4cc369bfc45740f5bacdf54ac/runtime/compiler/z/codegen/S390PrivateLinkage.cpp#L2157 [2] https://github.com/eclipse-openj9/openj9/blob/e8d07cf4374c9eb4cc369bfc45740f5bacdf54ac/runtime/compiler/z/codegen/S390PrivateLinkage.cpp#L2077 [3] https://github.com/eclipse-openj9/openj9/blob/e8d07cf4374c9eb4cc369bfc45740f5bacdf54ac/runtime/compiler/z/codegen/S390PrivateLinkage.cpp#L2125
After doing some git archaeology, it looks like useCLFIandBRCL
has been always false since before open sourcing. @r30shah @joransiu would you know any of the history here as to why this would always be disabled?
Hmm, I'll have to review the history myself to recall. If I had to guess, it's probably some recomp / code patching path wasn't working properly -- so maybe some PicBuilder/Recomp logic is incorrect.
@Spencer-Comin One thing to keep in mind when you use BRCL, you get 4G relative jump target which may be smaller to reach to the interface target.
@Spencer-Comin One thing to keep in mind when you use BRCL, you get 4G relative jump target which may be smaller to reach to the interface target.
Don't we reserve a contiguous virtual memory region for JIT code cache? In that case, JIT to JIT branches are guaranteed to be within +/- 2GB. I guess the scenario we need to care about is if the callee target is not compiled yet?
Don't we reserve a contiguous virtual memory region for JIT code cache? In that case, JIT to JIT branches are guaranteed to be within +/- 2GB. I guess the scenario we need to care about is if the callee target is not compiled yet?
Yes that is true (Looking at the code in [1], it does look that we do reserve contiguous memory reserve for JIT Code cache ). Looking at the interfaceMultislot code, I do see that we would only update the slots in the JIT code when method is compiled [2], Though @Spencer-Comin would appreciate if you can verify this.
We may need to check how the code-patching to update slot is done (If it is implemented and working). With the addresses stored in the snippet, we keep track of lastUpdatedSlot and use that to atomically update the slot in the snippet.
[1]. https://github.com/eclipse-openj9/openj9/blob/master/runtime/compiler/runtime/J9CodeCacheManager.cpp [2]. https://github.com/eclipse-openj9/openj9/blob/067e50cc5f184ce33d372fdee176ff0803c7e1a9/runtime/compiler/z/runtime/PicBuilder.m4#L1690-L1695
I've verified the slots are only updated when the method is compiled.
Here's my test case:
class InterfaceTest {
static interface X {
public int foo();
}
static class A implements X {
public int foo() {
return 1;
}
}
static class B implements X {
public int foo() {
return -1;
}
}
static class C implements X {
public int foo() {
return 2;
}
}
static class D implements X {
public int foo() {
return -2;
}
}
static class E implements X {
public int foo() {
return 3;
}
}
static class F implements X {
public int foo() {
return -3;
}
}
static int unitTest(X x) {
return x.foo();
}
public static void main(String[] args) {
X obj;
A a = new A();
B b = new B();
C c = new C();
D d = new D();
E e = new E();
F f = new F();
int dummy = 0;
for (int i = 0; i < 10; i++) {
if (i % 2 == 0) {
obj = a;
} else {
obj = b;
}
dummy += unitTest(obj);
}
System.out.println(dummy);
}
}
Heres the cache snippet layout from the trace log:
000003FFCAA95210 00000116 Snippet Label L0035: ; Interface call cache data snippet
0x3ffcaa95210 00000116 00 00 03 ff ca a9 51 88 DC 0x00000000caa95188 # Call site RA
0x3ffcaa95218 0000011e 00 00 00 00 00 1e d0 c0 DC 0x00000000001ed0c0 # Address of constant pool
0x3ffcaa95220 00000126 00 00 00 00 00 00 00 03 DC 0x0000000000000003 # CP index
0x3ffcaa95228 0000012e 00 00 00 00 00 00 00 00 DC 0x0000000000000000 # Interface class
0x3ffcaa95230 00000136 00 00 00 00 00 00 00 00 DC 0x0000000000000000 # Method index
0x3ffcaa95238 0000013e 00 00 03 ff ca c7 e9 08 DC 0x00000000cac7e908 # J2I thunk address
0x3ffcaa95240 00000146 00 00 00 00 00 00 00 00 DC 0x0000000000000000 # flags
0x3ffcaa95248 0000014e 00 00 03 ff ca a9 52 50 DC 0x00000000caa95250 # last cached slot
0x3ffcaa95250 00000156 00 00 03 ff ca a9 52 60 DC 0x00000000caa95260 # first slot
0x3ffcaa95258 0000015e 00 00 03 ff ca a9 52 90 DC 0x00000000caa95290 # last slot
0x3ffcaa95260 00000166 00 00 00 00 00 00 00 00 DC 0x0000000000000000 # class pointer 0
0x3ffcaa95268 0000016e 00 00 00 00 00 00 00 00 DC 0x0000000000000000 # method pointer 0
0x3ffcaa95270 00000176 00 00 00 00 00 00 00 00 DC 0x0000000000000000 # class pointer 1
0x3ffcaa95278 0000017e 00 00 00 00 00 00 00 00 DC 0x0000000000000000 # method pointer 1
0x3ffcaa95280 00000186 00 00 00 00 00 00 00 00 DC 0x0000000000000000 # class pointer 2
0x3ffcaa95288 0000018e 00 00 00 00 00 00 00 00 DC 0x0000000000000000 # method pointer 2
0x3ffcaa95290 00000196 00 00 00 00 00 00 00 00 DC 0x0000000000000000 # class pointer 3
0x3ffcaa95298 0000019e 00 00 00 00 00 00 00 00 DC 0x0000000000000000 # method pointer 3
Here's an extract from gdb showing that the cache is only updated after the method is compiled:
$ gdb --args java -Xjit:"disableAsyncCompilation,dontInline={*unitTest*|*foo*},{*unitTest*}(count=0,breakOnEntry,traceFull,log=trace.log),{*foo*}(count=1,breakAfterCompile)" InterfaceTest
...
Thread 2 "main" received signal SIGTRAP, Trace/breakpoint trap.
[Switching to Thread 0x3fffdff7910 (LWP 15924)]
0x000003ffcaa95102 in ?? ()
(gdb) x/18g 000003FFCAA95210
Invalid number "000003FFCAA95210".
(gdb) x/18xg 0x000003FFCAA95210
0x3ffcaa95210: 0x000003ffcaa95188 0x00000000001ed0c0
0x3ffcaa95220: 0x0000000000000003 0x0000000000000000
0x3ffcaa95230: 0x0000000000000000 0x000003ffcac7e908
0x3ffcaa95240: 0x0000000000000000 0x000003ffcaa95250
0x3ffcaa95250: 0x000003ffcaa95260 0x000003ffcaa95290
0x3ffcaa95260: 0x0000000000000000 0x0000000000000000
0x3ffcaa95270: 0x0000000000000000 0x0000000000000000
0x3ffcaa95280: 0x0000000000000000 0x0000000000000000
0x3ffcaa95290: 0x0000000000000000 0x0000000000000000
(gdb) n
Cannot find bounds of current function
(gdb) c
Continuing.
Thread 2 "main" received signal SIGTRAP, Trace/breakpoint trap.
0x000003ffcaa95102 in ?? ()
(gdb) x/18xg 0x000003FFCAA95210
0x3ffcaa95210: 0x000003ffcaa95188 0x00000000001ed0c0
0x3ffcaa95220: 0x0000000000000003 0x00000000001ed300
0x3ffcaa95230: 0x0000000000000018 0x000003ffcac7e908
0x3ffcaa95240: 0x0100000000000000 0x000003ffcaa95250
0x3ffcaa95250: 0x000003ffcaa95260 0x000003ffcaa95290
0x3ffcaa95260: 0x0000000000000000 0x0000000000000000
0x3ffcaa95270: 0x0000000000000000 0x0000000000000000
0x3ffcaa95280: 0x0000000000000000 0x0000000000000000
0x3ffcaa95290: 0x0000000000000000 0x0000000000000000
(gdb) c
Continuing.
Thread 2 "main" received signal SIGTRAP, Trace/breakpoint trap.
0x000003ffcaa95102 in ?? ()
(gdb) x/18xg 0x000003FFCAA95210
0x3ffcaa95210: 0x000003ffcaa95188 0x00000000001ed0c0
0x3ffcaa95220: 0x0000000000000003 0x00000000001ed300
0x3ffcaa95230: 0x0000000000000018 0x000003ffcac7e908
0x3ffcaa95240: 0x0100000000000000 0x000003ffcaa95250
0x3ffcaa95250: 0x000003ffcaa95260 0x000003ffcaa95290
0x3ffcaa95260: 0x0000000000000000 0x0000000000000000
0x3ffcaa95270: 0x0000000000000000 0x0000000000000000
0x3ffcaa95280: 0x0000000000000000 0x0000000000000000
0x3ffcaa95290: 0x0000000000000000 0x0000000000000000
(gdb) c
Continuing.
=== Finished compiling InterfaceTest$A.foo()I at 0x3ffcaa955fa ===
Thread 4 "JIT Compilation" received signal SIGTRAP, Trace/breakpoint trap.
[Switching to Thread 0x3ffcaa7f910 (LWP 15926)]
raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:51
51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) c
Continuing.
Thread 2 "main" received signal SIGTRAP, Trace/breakpoint trap.
[Switching to Thread 0x3fffdff7910 (LWP 15924)]
0x000003ffcaa95102 in ?? ()
(gdb) x/18xg 0x000003FFCAA95210
0x3ffcaa95210: 0x000003ffcaa95188 0x00000000001ed0c0
0x3ffcaa95220: 0x0000000000000003 0x00000000001ed300
0x3ffcaa95230: 0x0000000000000018 0x000003ffcac7e908
0x3ffcaa95240: 0x0100000000000000 0x000003ffcaa95250
0x3ffcaa95250: 0x000003ffcaa95260 0x000003ffcaa95290
0x3ffcaa95260: 0x0000000000000000 0x0000000000000000
0x3ffcaa95270: 0x0000000000000000 0x0000000000000000
0x3ffcaa95280: 0x0000000000000000 0x0000000000000000
0x3ffcaa95290: 0x0000000000000000 0x0000000000000000
(gdb) c
Continuing.
=== Finished compiling InterfaceTest$B.foo()I at 0x3ffcaa959fa ===
Thread 4 "JIT Compilation" received signal SIGTRAP, Trace/breakpoint trap.
[Switching to Thread 0x3ffcaa7f910 (LWP 15926)]
raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:51
51 in ../sysdeps/unix/sysv/linux/raise.c
(gdb) c
Continuing.
Thread 2 "main" received signal SIGTRAP, Trace/breakpoint trap.
[Switching to Thread 0x3fffdff7910 (LWP 15924)]
0x000003ffcaa95102 in ?? ()
(gdb) x/18xg 0x000003FFCAA95210
0x3ffcaa95210: 0x000003ffcaa95188 0x00000000001ed0c0
0x3ffcaa95220: 0x0000000000000003 0x00000000001ed300
0x3ffcaa95230: 0x0000000000000018 0x000003ffcac7e908
0x3ffcaa95240: 0x0100000000000000 0x000003ffcaa95250
0x3ffcaa95250: 0x000003ffcaa95260 0x000003ffcaa95290
0x3ffcaa95260: 0x0000000000000000 0x0000000000000000
0x3ffcaa95270: 0x0000000000000000 0x0000000000000000
0x3ffcaa95280: 0x0000000000000000 0x0000000000000000
0x3ffcaa95290: 0x0000000000000000 0x0000000000000000
(gdb) c
Continuing.
Thread 2 "main" received signal SIGTRAP, Trace/breakpoint trap.
0x000003ffcaa95102 in ?? ()
(gdb) x/18xg 0x000003FFCAA95210
0x3ffcaa95210: 0x000003ffcaa95188 0x00000000001ed0c0
0x3ffcaa95220: 0x0000000000000003 0x00000000001ed300
0x3ffcaa95230: 0x0000000000000018 0x000003ffcac7e908
0x3ffcaa95240: 0x0100000000000000 0x000003ffcaa95260
0x3ffcaa95250: 0x000003ffcaa95260 0x000003ffcaa95290
0x3ffcaa95260: 0x001ed60000000000 0x000003ffcaa95600
0x3ffcaa95270: 0x0000000000000000 0x0000000000000000
0x3ffcaa95280: 0x0000000000000000 0x0000000000000000
0x3ffcaa95290: 0x0000000000000000 0x0000000000000000
(gdb) c
Continuing.
Thread 2 "main" received signal SIGTRAP, Trace/breakpoint trap.
0x000003ffcaa95102 in ?? ()
(gdb) x/18xg 0x000003FFCAA95210
0x3ffcaa95210: 0x000003ffcaa95188 0x00000000001ed0c0
0x3ffcaa95220: 0x0000000000000003 0x00000000001ed300
0x3ffcaa95230: 0x0000000000000018 0x000003ffcac7e908
0x3ffcaa95240: 0x0100000000000000 0x000003ffcaa95270
0x3ffcaa95250: 0x000003ffcaa95260 0x000003ffcaa95290
0x3ffcaa95260: 0x001ed60000000000 0x000003ffcaa95600
0x3ffcaa95270: 0x001ed80000000000 0x000003ffcaa95a00
0x3ffcaa95280: 0x0000000000000000 0x0000000000000000
0x3ffcaa95290: 0x0000000000000000 0x0000000000000000
(gdb) c
Continuing.
@Spencer-Comin Thanks a lot for verifying this. I believe next thing for this one is to see if existing path works or not.
Also, we need to make sure that the CGFI
gets patched on Class Unloading/ redefinition. So exercise you did last time, need to be repeated.
Currently interface dispatch compares the class pointer with class pointers cached in the interface call cached data snippet to determine if there is a cached method pointer to jump to. The cached class and method pointers could be cached in the instruction stream instead of the snippet by using immediate instructions.