openjdk-riscv / jdk11u

Read-only mirror of https://hg.openjdk.java.net/jdk-updates/jdk11u/
GNU General Public License v2.0

Debugging notes on the JVM options -XX:+TraceBytecodes and -verbose #161

Closed DingliZhang closed 2 years ago

DingliZhang commented 3 years ago

-verbose

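-verbose is a standard VM option; it prints information about the running virtual machine to the output device. Other forms of the option exist as well.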

-XX:+TraceBytecodes

The -XX prefix marks hidden, non-standard VM options. -XX:+TraceBytecodes prints a complete trace of bytecode execution. The command used on bishengJDK:

<qemu64> /home/zhangdingli/bishengjdk-11/build/linux-riscv64-normal-core-slowdebug/jdk/bin/java -XX:+TraceBytecodes -version > TraceBytecodes.log 2>&1

Sample output (the full log is over 40 GB, so only the first few lines are shown): [image]

On rv32, however, the following error is reported:

dingli@isrc-iscas:~/isrc-jdk11u/jdk11u$ qemu32 build/linux-riscv32-normal-core-slowdebug/jdk/bin/java -XX:+TraceBytecodes -version
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGILL (0x4) at pc=0x3ccf2de8, pid=15049, tid=15051
#
# JRE version:  (11.0.9) (slowdebug build )
# Java VM: OpenJDK Core VM (slowdebug 11.0.9-internal+0-adhoc.dingli.jdk11u, interpreted mode, serial gc, linux-riscv32)
# Problematic frame:
# j  java.lang.Object.<clinit>()V+0 java.base
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P %E" (or dumping to /home/dingli/isrc-jdk11u/jdk11u/core.15049)
#
# An error report file with more information is saved as:
# /home/dingli/isrc-jdk11u/jdk11u/hs_err_pid15049.log
Loaded disassembler from /home/dingli/isrc-jdk11u/jdk11u/build/linux-riscv32-normal-core-slowdebug/jdk/lib/server/hsdis-riscv32.so
BFD: unrecognized disassembler option: 
#
#
Current thread is 15051
Dumping core ...
Aborted (core dumped)

Here qemu32 and qemu64 are aliases defined in ~/.bashrc. Taking qemu32 as an example:

alias qemu32="/home/dingli/qemu_install/qemu-5.2.0/bin/qemu-riscv32 -L /home/dingli/toolchain/riscv32/sysroot"
DingliZhang commented 3 years ago

Continuing the debugging on the rv32g-dev branch, I ran c (continue) up to the crash at 0x3ccf2de8 shown above. The output:

Thread 1 hit Breakpoint 2, JavaCalls::call_helper (result=0x3f0e1848, method=..., args=0x3f0e1868, __the_thread__=0x3ed19000)
    at /home/dingli/isrc-jdk11u/jdk11u/src/hotspot/share/runtime/javaCalls.cpp:442
442           StubRoutines::call_stub()(
(gdb) c
Continuing.

Thread 1 received signal SIGILL, Illegal instruction.
0x3ccf2de8 in ?? ()
(gdb) x/20i $pc-40
   0x3ccf2dc0:  lw      a0,0(s4)
   0x3ccf2dc4:  addi    s4,s4,4
   0x3ccf2dc8:  addi    s4,s4,-4
   0x3ccf2dcc:  sw      t0,0(s4)
   0x3ccf2dd0:  addi    s4,s4,-4
   0x3ccf2dd4:  sw      a0,0(s4)
   0x3ccf2dd8:  lui     a0,0x3fe21
   0x3ccf2ddc:  addi    a0,a0,392
   0x3ccf2de0:  lui     t0,0x0
   0x3ccf2de4:  addi    t0,t0,1
=> 0x3ccf2de8:  0x655302f
   0x3ccf2dec:  lw      a0,0(s4)
   0x3ccf2df0:  addi    s4,s4,4
   0x3ccf2df4:  lw      t0,0(s4)
   0x3ccf2df8:  addi    s4,s4,4
   0x3ccf2dfc:  jal     ra,0x3ccdf290
   0x3ccf2e00:  bnez    a0,0x3ccf2e10
   0x3ccf2e04:  lui     t0,0x3cce8
   0x3ccf2e08:  addi    t0,t0,-1576
   0x3ccf2e0c:  jr      t0
(gdb) x/t 0x3ccf2de8
0x3ccf2de8:     00000110010101010011000000101111
(gdb) 

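The binary form of 0x655302f is 00000110010101010011000000101111, which decodes to amoadd.d. This instruction is RV64-only. src/hotspot/cpu/riscv32/assembler_riscv32.hpp turns out to contain several RV64 atomic instructions that need to be removed.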
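As a cross-check of the decoding above, here is a minimal Python sketch that splits the word into the standard RISC-V AMO fields. The helper name `decode_amo` and the lookup tables are made up for this illustration; only the field layout comes from the RISC-V "A" extension encoding.

```python
# Decode an AMO-class RISC-V instruction word (opcode 0b0101111).
# Illustrative helper only; not part of HotSpot or hsdis.

ABI = ["zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2", "s0", "s1",
       "a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7",
       "s2", "s3", "s4", "s5", "s6", "s7", "s8", "s9", "s10", "s11",
       "t3", "t4", "t5", "t6"]

AMO_FUNCT5 = {0b00001: "amoswap", 0b00000: "amoadd", 0b00100: "amoxor",
              0b01100: "amoand", 0b01000: "amoor", 0b10000: "amomin",
              0b10100: "amomax", 0b11000: "amominu", 0b11100: "amomaxu"}

WIDTH = {0b010: "w", 0b011: "d"}  # funct3: 32-bit vs 64-bit operand


def decode_amo(word):
    if word & 0x7F != 0b0101111:
        raise ValueError("not an AMO-class instruction")
    rd = (word >> 7) & 0x1F
    funct3 = (word >> 12) & 0x7
    rs1 = (word >> 15) & 0x1F
    rs2 = (word >> 20) & 0x1F
    rl = (word >> 25) & 1
    aq = (word >> 26) & 1
    funct5 = (word >> 27) & 0x1F
    name = f"{AMO_FUNCT5[funct5]}.{WIDTH[funct3]}"
    ordering = ("aq" if aq else "") + ("rl" if rl else "")
    if ordering:
        name += "." + ordering
    return f"{name} {ABI[rd]},{ABI[rs2]},({ABI[rs1]})"


print(decode_amo(0x0655302F))
```

The funct3 field is 0b011 here, which selects the 64-bit (.d) form that an RV32 target cannot execute.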

DingliZhang commented 3 years ago

In addition to removing the following instructions from src/hotspot/cpu/riscv32/assembler_riscv32.hpp:

--- a/src/hotspot/cpu/riscv32/assembler_riscv32.hpp
+++ b/src/hotspot/cpu/riscv32/assembler_riscv32.hpp
@@ -818,15 +818,6 @@ enum Aqrl {relaxed = 0b00, rl = 0b01, aq = 0b10, aqrl = 0b11};
   INSN(amomax_w,  0b0101111, 0b010, 0b10100);
   INSN(amominu_w, 0b0101111, 0b010, 0b11000);
   INSN(amomaxu_w, 0b0101111, 0b010, 0b11100);
-  INSN(amoswap_d, 0b0101111, 0b011, 0b00001);
-  INSN(amoadd_d,  0b0101111, 0b011, 0b00000);
-  INSN(amoxor_d,  0b0101111, 0b011, 0b00100);
-  INSN(amoand_d,  0b0101111, 0b011, 0b01100);
-  INSN(amoor_d,   0b0101111, 0b011, 0b01000);
-  INSN(amomin_d,  0b0101111, 0b011, 0b10000);
-  INSN(amomax_d , 0b0101111, 0b011, 0b10100);
-  INSN(amominu_d, 0b0101111, 0b011, 0b11000);
-  INSN(amomaxu_d, 0b0101111, 0b011, 0b11100);
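To see why only the funct3 column separates the removed `_d` entries from the retained `_w` ones, the sketch below assembles an AMO word from the three values listed in each INSN line. `encode_amo` is a hypothetical helper written for this note, and the register numbers (rd=zero, rs1=a0, rs2=t0) are taken from the crash site, not from the macro itself.

```python
# Build an AMO instruction word from the INSN table columns
# (opcode, funct3, funct5) plus registers and the 2-bit Aqrl value.
# Hypothetical helper for illustration; HotSpot's INSN macro is C++.

AQRL = {"relaxed": 0b00, "rl": 0b01, "aq": 0b10, "aqrl": 0b11}


def encode_amo(opcode, funct3, funct5, rd, rs1, rs2, order="relaxed"):
    return ((funct5 << 27) | (AQRL[order] << 25) | (rs2 << 20) |
            (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode)


# amoadd_d as in the removed line: INSN(amoadd_d, 0b0101111, 0b011, 0b00000)
amoadd_d = encode_amo(0b0101111, 0b011, 0b00000, rd=0, rs1=10, rs2=5, order="aqrl")
# the 32-bit counterpart differs only in funct3 (0b010)
amoadd_w = encode_amo(0b0101111, 0b010, 0b00000, rd=0, rs1=10, rs2=5, order="aqrl")

print(hex(amoadd_d), hex(amoadd_w))
```

With these inputs the `_d` encoding reproduces the word seen at the faulting pc, and the two words differ only in bit 12.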

Some changes are also needed in src/hotspot/cpu/riscv32/macroAssembler_riscv32.cpp: https://github.com/openjdk-riscv/jdk11u/blob/71a7822fbddaaa51eba93e2af5426bc261543142/src/hotspot/cpu/riscv32/macroAssembler_riscv32.cpp#L2374-L2377

DingliZhang commented 3 years ago

Submitted https://github.com/openjdk-riscv/jdk11u/pull/163 . With it, the bytecode trace is now printed up to the point where assert(stdc == Bytecodes::_ldc || stdc == Bytecodes::_ldc_w || stdc == Bytecodes::_ldc2_w) failed: load constant fires. Output: [image] @shining1984 @ZhangXiang1994-max please verify and review.

axiangyushanhaijing commented 3 years ago

Verified.

DingliZhang commented 3 years ago

Debugging with the following options:

$ qemu32 build/linux-riscv32-normal-core-slowdebug/jdk/bin/java -XX:+TraceBytecodes -verbose -version

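Both Bisheng and rv32 start printing bytecodes after java.lang.ArithmeticException has been loaded. On rv32, the earlier failed: load constant error is triggered right after the second bytecode, lconst_0, is printed, whereas on bishengJDK and on a native x86 build the second bytecode is return. Next I plan to study how the bytecodes are generated here. Full rv32 log: https://paste.ubuntu.com/p/skW6ZxBCBy/ Partial BishengJDK log: https://paste.ubuntu.com/p/xdq5Vmnnc6/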
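Because the full trace is tens of gigabytes, comparing two logs by eye is impractical. A minimal sketch of an automated comparison follows. It assumes trace lines of the form `[tid] counter bci mnemonic ...`, as in the rv32 excerpt quoted later in this thread; the reference line ending in `return` is a made-up stand-in for the Bisheng log, and the function names are hypothetical.

```python
import re

# "[4360]        2     1  lconst_0" -> thread id, counter, bci, mnemonic
TRACE_LINE = re.compile(r"^\[(\d+)\]\s+(\d+)\s+(\d+)\s+(\S+)")


def parse_trace(lines):
    """Yield (counter, bci, mnemonic); method headers and other lines are skipped."""
    for line in lines:
        m = TRACE_LINE.match(line)
        if m:
            yield int(m.group(2)), int(m.group(3)), m.group(4)


def first_divergence(a_lines, b_lines):
    """Return the first pair of entries whose bci or mnemonic differ, else None."""
    for a, b in zip(parse_trace(a_lines), parse_trace(b_lines)):
        if a[1:] != b[1:]:
            return a, b
    return None


rv32_log = [
    "[4360] static void java.lang.Object.<clinit>()",
    "[4360]        1     0  invokestatic 16 <java/lang/Object.registerNatives()V> ",
    "[4360]        2     1  lconst_0",
]
# Assumed reference output: the second bytecode is `return` at bci 3.
reference_log = [
    "[5000] static void java.lang.Object.<clinit>()",
    "[5000]        1     0  invokestatic 16 <java/lang/Object.registerNatives()V> ",
    "[5000]        2     3  return",
]

print(first_divergence(rv32_log, reference_log))
```

Both arguments can be open file objects, so the logs are streamed line by line and never loaded whole.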

DingliZhang commented 3 years ago

Added logging in bytecode.hpp and javaCalls.cpp:

diff --git a/src/hotspot/share/interpreter/bytecode.hpp b/src/hotspot/share/interpreter/bytecode.hpp
index befdd889d2..b1d3a0f5ef 100644
--- a/src/hotspot/share/interpreter/bytecode.hpp
+++ b/src/hotspot/share/interpreter/bytecode.hpp
@@ -322,6 +322,7 @@ class Bytecode_loadconstant: public Bytecode {
   Bytecode_loadconstant(const methodHandle& method, int bci): Bytecode(method(), method->bcp_from(bci)), _method(method()) { verify(); }

   void verify() const {
+    warning("start verify() in Bytecode_loadconstant");
     assert(_method != NULL, "must supply method");
     Bytecodes::Code stdc = Bytecodes::java_code(code());
     assert(stdc == Bytecodes::_ldc ||
diff --git a/src/hotspot/share/runtime/javaCalls.cpp b/src/hotspot/share/runtime/javaCalls.cpp
index 32040586b3..fad93443cd 100644
--- a/src/hotspot/share/runtime/javaCalls.cpp
+++ b/src/hotspot/share/runtime/javaCalls.cpp
@@ -503,6 +503,7 @@ inline oop resolve_indirect_oop(intptr_t value, uint state) {
 }

 intptr_t* JavaCallArguments::parameters() {
+  warning("strat JavaCallArguments::parameters");
   // First convert all handles to oops
   for(int i = 0; i < _size; i++) {
     uint state = _value_state[i];

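Result: [image] Compared with the BishengJDK result: [image] Besides the second bytecode differing (lconst_0 versus return), the execution path after the second bytecode also diverges. GDB debugging on x86: [image] The two bytecodes are produced before execution reaches the CHECK in StubRoutines::call_stub(). Detailed debugging log: https://paste.ubuntu.com/p/ZSHdzNVvFB/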

DingliZhang commented 3 years ago

Update 2021-09-01

The rv32 disassembly in this comment was generated incorrectly. Correction: https://github.com/openjdk-riscv/jdk11u/issues/161#issuecomment-910123598


Following https://github.com/openjdk-riscv/jdk11u/issues/71 , build hsdis-riscv32.so, or simply wget the prebuilt hsdis.so (needs unzipping): hsdis-riscv32.so.zip hsdis-riscv64.so.zip . Place it in the same directory as libjvm.so, for example jdk/lib/server. Adding -XX:+PrintInterpreter when running java -version then produces a disassembly that shows the machine code for each bytecode. rv32 disassembly: https://paste.ubuntu.com/p/zkfCHXZRPc/ bishengJDK disassembly: https://paste.ubuntu.com/p/ZBNQb3c5T8/
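Each template in the -XX:+PrintInterpreter output starts with a header line giving the bytecode name, its opcode, the address range and the size, as in the two listings in this comment. The sketch below parses such a header and checks that the size matches the range. The regular expression is my own reading of those two headers, not an official format description, and `parse_header` is a hypothetical name.

```python
import re

# e.g. "invokestatic  184 invokestatic  [0x3ccfc840, 0x3ccfcb00]  704 bytes"
HEADER = re.compile(
    r"^(\S+)\s+(\d+)\s+\S+\s+\[(0x[0-9a-fA-F]+),\s*(0x[0-9a-fA-F]+)\]\s+(\d+)\s+bytes")


def parse_header(line):
    """Return name, opcode, address range and size of one template, or None."""
    m = HEADER.match(line)
    if m is None:
        return None
    name, code, start, end, size = m.groups()
    return {"name": name, "code": int(code),
            "start": int(start, 16), "end": int(end, 16), "size": int(size)}


bisheng = parse_header(
    "invokestatic  184 invokestatic  [0x000000400ac49f40, 0x000000400ac4a240]  768 bytes")
rv32 = parse_header(
    "invokestatic  184 invokestatic  [0x3ccfc840, 0x3ccfcb00]  704 bytes")

for info in (bisheng, rv32):
    # the printed size should equal end - start
    print(info["name"], info["code"], info["end"] - info["start"] == info["size"])
```

Scanning a full dump for these headers gives a quick per-bytecode size table for the two builds, which helps decide which templates to compare first.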

Comparing invokestatic:

BishengJDK:

----------------------------------------------------------------------
invokestatic  184 invokestatic  [0x000000400ac49f40, 0x000000400ac4a240]  768 bytes

BFD: unrecognized disassembler option: 
  0x000000400ac49f40: addi  s4,s4,-8
  0x000000400ac49f44: sd    a0,0(s4)
  0x000000400ac49f48: j 0x000000400ac49f80
  0x000000400ac49f4c: addi  s4,s4,-8
  0x000000400ac49f50: fsw   fa0,0(s4)
  0x000000400ac49f54: j 0x000000400ac49f80
  0x000000400ac49f58: addi  s4,s4,-16
  0x000000400ac49f5c: fsd   fa0,0(s4)
  0x000000400ac49f60: j 0x000000400ac49f80
  0x000000400ac49f64: addi  s4,s4,-16
  0x000000400ac49f68: sd    zero,8(s4)
  0x000000400ac49f6c: sd    a0,0(s4)
  0x000000400ac49f70: j 0x000000400ac49f80
  0x000000400ac49f74: addi  s4,s4,-8
  0x000000400ac49f78: addw  a0,a0,zero
  0x000000400ac49f7c: sd    a0,0(s4)
  0x000000400ac49f80: sd    s6,-72(s0)
  0x000000400ac49f84: lhu   a4,1(s6)
  0x000000400ac49f88: slli  t1,a4,0x5
  0x000000400ac49f8c: add   t1,s10,t1
  0x000000400ac49f90: addi  s1,t1,40
  0x000000400ac49f94: mv    s1,s1
  0x000000400ac49f98: fence
  0x000000400ac49f9c: lwu   s1,0(s1)
  0x000000400ac49fa0: fence ir,iorw
  0x000000400ac49fa4: slli  s1,s1,0x28
  0x000000400ac49fa8: srli  s1,s1,0x38
  0x000000400ac49fac: addiw t0,zero,184
  0x000000400ac49fb0: beq   s1,t0,0x000000400ac4a108
  0x000000400ac49fb4: addiw s1,zero,184
  0x000000400ac49fb8: mv    a1,s1
  0x000000400ac49fbc: sd    s6,-72(s0)
  0x000000400ac49fc0: ld    t0,-16(s0)
  0x000000400ac49fc4: beqz  t0,0x000000400ac4a088
  0x000000400ac49fc8: addi  sp,sp,-240
  0x000000400ac49fcc: sd    ra,0(sp)
  0x000000400ac49fd0: sd    gp,8(sp)
  0x000000400ac49fd4: sd    tp,16(sp)
  0x000000400ac49fd8: sd    t0,24(sp)
  0x000000400ac49fdc: sd    t1,32(sp)
  0x000000400ac49fe0: sd    t2,40(sp)
  0x000000400ac49fe4: sd    s0,48(sp)
  0x000000400ac49fe8: sd    s1,56(sp)
  0x000000400ac49fec: sd    a0,64(sp)
  0x000000400ac49ff0: sd    a1,72(sp)
  0x000000400ac49ff4: sd    a2,80(sp)
  0x000000400ac49ff8: sd    a3,88(sp)
  0x000000400ac49ffc: sd    a4,96(sp)
  0x000000400ac4a000: sd    a5,104(sp)
  0x000000400ac4a004: sd    a6,112(sp)
  0x000000400ac4a008: sd    a7,120(sp)
  0x000000400ac4a00c: sd    s2,128(sp)
  0x000000400ac4a010: sd    s3,136(sp)
  0x000000400ac4a014: sd    s4,144(sp)
  0x000000400ac4a018: sd    s5,152(sp)
  0x000000400ac4a01c: sd    s6,160(sp)
  0x000000400ac4a020: sd    s7,168(sp)
  0x000000400ac4a024: sd    s8,176(sp)
  0x000000400ac4a028: sd    s9,184(sp)
  0x000000400ac4a02c: sd    s10,192(sp)
  0x000000400ac4a030: sd    s11,200(sp)
  0x000000400ac4a034: sd    t3,208(sp)
  0x000000400ac4a038: sd    t4,216(sp)
  0x000000400ac4a03c: sd    t5,224(sp)
  0x000000400ac4a040: sd    t6,232(sp)
  0x000000400ac4a044: lui   a0,0x2001
  0x000000400ac4a048: addiw a0,a0,649
  0x000000400ac4a04c: slli  a0,a0,0xd
  0x000000400ac4a050: addi  a0,a0,784 # 0x0000000002001310
  0x000000400ac4a054: lui   a1,0x2005
  0x000000400ac4a058: addiw a1,a1,1573
  0x000000400ac4a05c: slli  a1,a1,0xd
  0x000000400ac4a060: addi  a1,a1,-56 # 0x0000000002004fc8
  0x000000400ac4a064: mv    a2,sp
 ;; 0x4002039fea
  0x000000400ac4a068: lui   a3,0x400
  0x000000400ac4a06c: addi  a3,a3,515 # 0x0000000000400203
  0x000000400ac4a070: slli  a3,a3,0xb
  0x000000400ac4a074: addi  a3,a3,1279
  0x000000400ac4a078: slli  a3,a3,0x5
  0x000000400ac4a07c: addi  a3,a3,10
  0x000000400ac4a080: jalr  a3
  0x000000400ac4a084: ebreak
  0x000000400ac4a088: mv    a0,s7
  0x000000400ac4a08c: auipc t0,0x0
  0x000000400ac4a090: addi  t0,t0,56 # 0x000000400ac4a0c4
  0x000000400ac4a094: sd    t0,896(s7)
  0x000000400ac4a098: sd    s4,888(s7)
  0x000000400ac4a09c: sd    s0,904(s7)
  0x000000400ac4a0a0: addi  sp,sp,-16
  0x000000400ac4a0a4: sd    t1,0(sp)
  0x000000400ac4a0a8: sd    t6,8(sp)
 ;; 0x4001e0d218
  0x000000400ac4a0ac: lui   t0,0x400
  0x000000400ac4a0b0: addi  t0,t0,480 # 0x00000000004001e0
  0x000000400ac4a0b4: slli  t0,t0,0xb
  0x000000400ac4a0b8: addi  t0,t0,1680
  0x000000400ac4a0bc: slli  t0,t0,0x5
  0x000000400ac4a0c0: jalr  24(t0)
  0x000000400ac4a0c4: ld    t1,0(sp)
  0x000000400ac4a0c8: ld    t6,8(sp)
  0x000000400ac4a0cc: addi  sp,sp,16
  0x000000400ac4a0d0: fence.i
  0x000000400ac4a0d4: fence ir,ir
  0x000000400ac4a0d8: sd    zero,888(s7)
  0x000000400ac4a0dc: sd    zero,904(s7)
  0x000000400ac4a0e0: sd    zero,896(s7)
  0x000000400ac4a0e4: ld    t0,8(s7)
  0x000000400ac4a0e8: beqz  t0,0x000000400ac4a0f4
  0x000000400ac4a0ec: auipc t0,0xfffde
  0x000000400ac4a0f0: jr    724(t0) # 0x000000400ac283c0
  0x000000400ac4a0f4: ld    s6,-72(s0)
  0x000000400ac4a0f8: ld    s8,-64(s0)
  0x000000400ac4a0fc: lhu   a4,1(s6)
  0x000000400ac4a100: slli  t1,a4,0x5
  0x000000400ac4a104: add   t1,s10,t1
  0x000000400ac4a108: ld    t6,48(t1)
  0x000000400ac4a10c: lwu   a3,64(t1)
  0x000000400ac4a110: slli  t1,a3,0x20
  0x000000400ac4a114: srli  t1,t1,0x3c
 ;; 0x4002827698
  0x000000400ac4a118: lui   t0,0x400
  0x000000400ac4a11c: addi  t0,t0,642 # 0x0000000000400282
  0x000000400ac4a120: slli  t0,t0,0xb
  0x000000400ac4a124: addi  t0,t0,948
  0x000000400ac4a128: slli  t0,t0,0x5
  0x000000400ac4a12c: addi  t0,t0,24
  0x000000400ac4a130: slli  t1,t1,0x3
  0x000000400ac4a134: add   t0,t0,t1
  0x000000400ac4a138: ld    ra,0(t0)
  0x000000400ac4a13c: mv    t5,sp
  0x000000400ac4a140: sd    s4,-16(s0)
  0x000000400ac4a144: ld    t0,88(t6)
  0x000000400ac4a148: jr    t0
  0x000000400ac4a14c: addi  sp,sp,-240
  0x000000400ac4a150: sd    ra,0(sp)
  0x000000400ac4a154: sd    gp,8(sp)
  0x000000400ac4a158: sd    tp,16(sp)
  0x000000400ac4a15c: sd    t0,24(sp)
  0x000000400ac4a160: sd    t1,32(sp)
  0x000000400ac4a164: sd    t2,40(sp)
  0x000000400ac4a168: sd    s0,48(sp)
  0x000000400ac4a16c: sd    s1,56(sp)
  0x000000400ac4a170: sd    a0,64(sp)
  0x000000400ac4a174: sd    a1,72(sp)
  0x000000400ac4a178: sd    a2,80(sp)
  0x000000400ac4a17c: sd    a3,88(sp)
  0x000000400ac4a180: sd    a4,96(sp)
  0x000000400ac4a184: sd    a5,104(sp)
  0x000000400ac4a188: sd    a6,112(sp)
  0x000000400ac4a18c: sd    a7,120(sp)
  0x000000400ac4a190: sd    s2,128(sp)
  0x000000400ac4a194: sd    s3,136(sp)
  0x000000400ac4a198: sd    s4,144(sp)
  0x000000400ac4a19c: sd    s5,152(sp)
  0x000000400ac4a1a0: sd    s6,160(sp)
  0x000000400ac4a1a4: sd    s7,168(sp)
  0x000000400ac4a1a8: sd    s8,176(sp)
  0x000000400ac4a1ac: sd    s9,184(sp)
  0x000000400ac4a1b0: sd    s10,192(sp)
  0x000000400ac4a1b4: sd    s11,200(sp)
  0x000000400ac4a1b8: sd    t3,208(sp)
  0x000000400ac4a1bc: sd    t4,216(sp)
  0x000000400ac4a1c0: sd    t5,224(sp)
  0x000000400ac4a1c4: sd    t6,232(sp)
  0x000000400ac4a1c8: lui   a0,0x2001
  0x000000400ac4a1cc: addiw a0,a0,649
  0x000000400ac4a1d0: slli  a0,a0,0xd
  0x000000400ac4a1d4: addi  a0,a0,-784 # 0x0000000002000cf0
  0x000000400ac4a1d8: lui   a1,0x2005
  0x000000400ac4a1dc: addiw a1,a1,1573
  0x000000400ac4a1e0: slli  a1,a1,0xd
  0x000000400ac4a1e4: addi  a1,a1,332 # 0x000000000200514c
  0x000000400ac4a1e8: mv    a2,sp
 ;; 0x4002039fea
  0x000000400ac4a1ec: lui   a3,0x400
  0x000000400ac4a1f0: addi  a3,a3,515 # 0x0000000000400203
  0x000000400ac4a1f4: slli  a3,a3,0xb
  0x000000400ac4a1f8: addi  a3,a3,1279
  0x000000400ac4a1fc: slli  a3,a3,0x5
  0x000000400ac4a200: addi  a3,a3,10
  0x000000400ac4a204: jalr  a3
  0x000000400ac4a208: ebreak
  0x000000400ac4a20c: nop
  0x000000400ac4a210: sw    a1,28(s1)
  0x000000400ac4a212: sw    a1,28(s1)
  0x000000400ac4a214: sw    a1,28(s1)
  0x000000400ac4a216: sw    a1,28(s1)
  0x000000400ac4a218: sw    a1,28(s1)
  0x000000400ac4a21a: sw    a1,28(s1)
  0x000000400ac4a21c: sw    a1,28(s1)
  0x000000400ac4a21e: sw    a1,28(s1)
  0x000000400ac4a220: sw    a1,28(s1)
  0x000000400ac4a222: sw    a1,28(s1)
  0x000000400ac4a224: sw    a1,28(s1)
  0x000000400ac4a226: sw    a1,28(s1)
  0x000000400ac4a228: sw    a1,28(s1)
  0x000000400ac4a22a: sw    a1,28(s1)
  0x000000400ac4a22c: sw    a1,28(s1)
  0x000000400ac4a22e: sw    a1,28(s1)
  0x000000400ac4a230: sw    a1,28(s1)
  0x000000400ac4a232: sw    a1,28(s1)
  0x000000400ac4a234: sw    a1,28(s1)
  0x000000400ac4a236: sw    a1,28(s1)
  0x000000400ac4a238: sw    a1,28(s1)
  0x000000400ac4a23a: sw    a1,28(s1)
  0x000000400ac4a23c: sw    a1,28(s1)
  0x000000400ac4a23e: sw    a1,28(s1)

rv32:

----------------------------------------------------------------------
invokestatic  184 invokestatic  [0x3ccfc840, 0x3ccfcb00]  704 bytes

BFD: unrecognized disassembler option: 
  0x3ccfc840: addi  s4,s4,-4
  0x3ccfc844: sw    a0,0(s4)
  0x3ccfc848: j 0x3ccfc880
  0x3ccfc84c: addi  s4,s4,-4
  0x3ccfc850: fsw   fa0,0(s4)
  0x3ccfc854: j 0x3ccfc880
  0x3ccfc858: addi  s4,s4,-8
  0x3ccfc85c: fsd   fa0,0(s4)
  0x3ccfc860: j 0x3ccfc880
  0x3ccfc864: addi  s4,s4,-8
  0x3ccfc868: sw    zero,4(s4)
  0x3ccfc86c: sw    a0,0(s4)
  0x3ccfc870: j 0x3ccfc880
  0x3ccfc874: addi  s4,s4,-4
  0x3ccfc878: add   a0,a0,zero
  0x3ccfc87c: sw    a0,0(s4)
  0x3ccfc880: addi  s4,s4,-4
  0x3ccfc884: sw    t0,0(s4)
  0x3ccfc888: addi  s4,s4,-4
  0x3ccfc88c: sw    a0,0(s4)
  0x3ccfc890: lui   a0,0x3fe21
  0x3ccfc894: addi  a0,a0,392 # 0x3fe21188
  0x3ccfc898: lui   t0,0x0
  0x3ccfc89c: addi  t0,t0,1 # 1
  0x3ccfc8a0: amoadd.w.aqrl zero,t0,(a0)
  0x3ccfc8a4: lw    a0,0(s4)
  0x3ccfc8a8: addi  s4,s4,4
  0x3ccfc8ac: lw    t0,0(s4)
  0x3ccfc8b0: addi  s4,s4,4
  0x3ccfc8b4: jal   ra,0x3ccdf998
  0x3ccfc8b8: sw    s6,-36(s0)
  0x3ccfc8bc: lhu   a4,1(s6)
  0x3ccfc8c0: slli  t1,a4,0x5
  0x3ccfc8c4: add   t1,s10,t1
  0x3ccfc8c8: addi  s1,t1,16
  0x3ccfc8cc: mv    s1,s1
  0x3ccfc8d0: fence
  0x3ccfc8d4: lw    s1,0(s1)
  0x3ccfc8d8: fence ir,iorw
  0x3ccfc8dc: slli  s1,s1,0x8
  0x3ccfc8e0: srli  s1,s1,0x18
  0x3ccfc8e4: lui   t0,0x0
  0x3ccfc8e8: addi  t0,t0,184 # 0x000000b8
  0x3ccfc8ec: beq   s1,t0,0x3ccfca18
  0x3ccfc8f0: lui   s1,0x0
  0x3ccfc8f4: addi  s1,s1,184 # 0x000000b8
  0x3ccfc8f8: mv    a1,s1
  0x3ccfc8fc: sw    s6,-36(s0)
  0x3ccfc900: lw    t0,-8(s0)
  0x3ccfc904: beqz  t0,0x3ccfc9a8
  0x3ccfc908: addi  sp,sp,-120
  0x3ccfc90c: sw    ra,0(sp)
  0x3ccfc910: sw    gp,4(sp)
  0x3ccfc914: sw    tp,8(sp)
  0x3ccfc918: sw    t0,12(sp)
  0x3ccfc91c: sw    t1,16(sp)
  0x3ccfc920: sw    t2,20(sp)
  0x3ccfc924: sw    s0,24(sp)
  0x3ccfc928: sw    s1,28(sp)
  0x3ccfc92c: sw    a0,32(sp)
  0x3ccfc930: sw    a1,36(sp)
  0x3ccfc934: sw    a2,40(sp)
  0x3ccfc938: sw    a3,44(sp)
  0x3ccfc93c: sw    a4,48(sp)
  0x3ccfc940: sw    a5,52(sp)
  0x3ccfc944: sw    a6,56(sp)
  0x3ccfc948: sw    a7,60(sp)
  0x3ccfc94c: sw    s2,64(sp)
  0x3ccfc950: sw    s3,68(sp)
  0x3ccfc954: sw    s4,72(sp)
  0x3ccfc958: sw    s5,76(sp)
  0x3ccfc95c: sw    s6,80(sp)
  0x3ccfc960: sw    s7,84(sp)
  0x3ccfc964: sw    s8,88(sp)
  0x3ccfc968: sw    s9,92(sp)
  0x3ccfc96c: sw    s10,96(sp)
  0x3ccfc970: sw    s11,100(sp)
  0x3ccfc974: sw    t3,104(sp)
  0x3ccfc978: sw    t4,108(sp)
  0x3ccfc97c: sw    t5,112(sp)
  0x3ccfc980: sw    t6,116(sp)
  0x3ccfc984: lui   a0,0x3fbed
  0x3ccfc988: addi  a0,a0,-1072 # 0x3fbecbd0
  0x3ccfc98c: lui   a1,0x3ccfd
  0x3ccfc990: addi  a1,a1,-1784 # 0x3ccfc908
  0x3ccfc994: mv    a2,sp
  0x3ccfc998: lui   a3,0x3f765
  0x3ccfc99c: addi  a3,a3,1458 # 0x3f7655b2 = MacroAssembler::debug32(char*, int, int*)
  0x3ccfc9a0: jalr  a3
  0x3ccfc9a4: ebreak
  0x3ccfc9a8: mv    a0,s7
  0x3ccfc9ac: auipc t0,0x0
  0x3ccfc9b0: addi  t0,t0,40 # 0x3ccfc9d4
  0x3ccfc9b4: sw    t0,628(s7)
  0x3ccfc9b8: sw    s4,624(s7)
  0x3ccfc9bc: sw    s0,632(s7)
  0x3ccfc9c0: addi  sp,sp,-8
  0x3ccfc9c4: sw    t1,0(sp)
  0x3ccfc9c8: sw    t6,4(sp)
  0x3ccfc9cc: lui   t0,0x3f546
  0x3ccfc9d0: jalr  -1136(t0) # 0x3f545b90 = InterpreterRuntime::resolve_from_cache(JavaThread*, Bytecodes::Code)
  0x3ccfc9d4: lw    t1,0(sp)
  0x3ccfc9d8: lw    t6,4(sp)
  0x3ccfc9dc: addi  sp,sp,8
  0x3ccfc9e0: fence.i
  0x3ccfc9e4: fence ir,ir
  0x3ccfc9e8: sw    zero,624(s7)
  0x3ccfc9ec: sw    zero,632(s7)
  0x3ccfc9f0: sw    zero,628(s7)
  0x3ccfc9f4: lw    t0,4(s7)
  0x3ccfc9f8: beqz  t0,0x3ccfca04
  0x3ccfc9fc: auipc t0,0xfffdd
  0x3ccfca00: jr    -1788(t0) # 0x3ccd9300
  0x3ccfca04: lw    s6,-36(s0)
  0x3ccfca08: lw    s8,-32(s0)
  0x3ccfca0c: lhu   a4,1(s6)
  0x3ccfca10: slli  t1,a4,0x5
  0x3ccfca14: add   t1,s10,t1
  0x3ccfca18: lw    t6,20(t1)
  0x3ccfca1c: lw    a3,28(t1)
  0x3ccfca20: slli  t1,a3,0x0
  0x3ccfca24: srli  t1,t1,0x1c
  0x3ccfca28: lui   t0,0x3fe92
  0x3ccfca2c: addi  t0,t0,-1100 # 0x3fe91bb4
  0x3ccfca30: slli  t1,t1,0x3
  0x3ccfca34: add   t0,t0,t1
  0x3ccfca38: lw    ra,0(t0)
  0x3ccfca3c: mv    t5,sp
  0x3ccfca40: sw    s4,-8(s0)
  0x3ccfca44: lw    t0,52(t6)
  0x3ccfca48: jr    t0
  0x3ccfca4c: addi  sp,sp,-120
  0x3ccfca50: sw    ra,0(sp)
  0x3ccfca54: sw    gp,4(sp)
  0x3ccfca58: sw    tp,8(sp)
  0x3ccfca5c: sw    t0,12(sp)
  0x3ccfca60: sw    t1,16(sp)
  0x3ccfca64: sw    t2,20(sp)
  0x3ccfca68: sw    s0,24(sp)
  0x3ccfca6c: sw    s1,28(sp)
  0x3ccfca70: sw    a0,32(sp)
  0x3ccfca74: sw    a1,36(sp)
  0x3ccfca78: sw    a2,40(sp)
  0x3ccfca7c: sw    a3,44(sp)
  0x3ccfca80: sw    a4,48(sp)
  0x3ccfca84: sw    a5,52(sp)
  0x3ccfca88: sw    a6,56(sp)
  0x3ccfca8c: sw    a7,60(sp)
  0x3ccfca90: sw    s2,64(sp)
  0x3ccfca94: sw    s3,68(sp)
  0x3ccfca98: sw    s4,72(sp)
  0x3ccfca9c: sw    s5,76(sp)
  0x3ccfcaa0: sw    s6,80(sp)
  0x3ccfcaa4: sw    s7,84(sp)
  0x3ccfcaa8: sw    s8,88(sp)
  0x3ccfcaac: sw    s9,92(sp)
  0x3ccfcab0: sw    s10,96(sp)
  0x3ccfcab4: sw    s11,100(sp)
  0x3ccfcab8: sw    t3,104(sp)
  0x3ccfcabc: sw    t4,108(sp)
  0x3ccfcac0: sw    t5,112(sp)
  0x3ccfcac4: sw    t6,116(sp)
  0x3ccfcac8: lui   a0,0x3fbec
  0x3ccfcacc: addi  a0,a0,1524 # 0x3fbec5f4
  0x3ccfcad0: lui   a1,0x3ccfd
  0x3ccfcad4: addi  a1,a1,-1460 # 0x3ccfca4c
  0x3ccfcad8: mv    a2,sp
  0x3ccfcadc: lui   a3,0x3f765
  0x3ccfcae0: addi  a3,a3,1458 # 0x3f7655b2 = MacroAssembler::debug32(char*, int, int*)
  0x3ccfcae4: jalr  a3
  0x3ccfcae8: ebreak
  0x3ccfcaec: sw    a1,28(s1)
  0x3ccfcaee: sw    a1,28(s1)
  0x3ccfcaf0: sw    a1,28(s1)
  0x3ccfcaf2: sw    a1,28(s1)
  0x3ccfcaf4: sw    a1,28(s1)
  0x3ccfcaf6: sw    a1,28(s1)
  0x3ccfcaf8: sw    a1,28(s1)
  0x3ccfcafa: sw    a1,28(s1)
  0x3ccfcafc: sw    a1,28(s1)
  0x3ccfcafe: sw    a1,28(s1)
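The two listings are long, so a mechanical comparison can help locate where they differ structurally. The following sketch reduces each listing to its mnemonics, folds the RV64/RV32 width variants together (for example sd and sw), and reports runs that appear only in the second listing. The normalization table and function names are assumptions made for this example, and the two sample strings are short excerpts of the listings above.

```python
import difflib

# Fold RV64/RV32 width variants so that only structural differences remain.
NORMALIZE = {"sd": "store", "sw": "store", "ld": "load", "lw": "load",
             "lwu": "load", "addiw": "addi", "addw": "add"}


def mnemonics(listing):
    """Extract normalized mnemonics from lines shaped like '0xADDR: op args'."""
    ops = []
    for line in listing.splitlines():
        head, sep, rest = line.strip().partition(":")
        if not sep or not head.startswith("0x"):
            continue  # headers, BFD messages and ';;' comments
        fields = rest.split()
        if fields:
            ops.append(NORMALIZE.get(fields[0], fields[0]))
    return ops


def only_in_second(first, second):
    """Mnemonics that the diff marks as inserted or replaced in `second`."""
    a, b = mnemonics(first), mnemonics(second)
    extra = []
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            extra.extend(b[j1:j2])
    return extra


BISHENG = """
  0x000000400ac49f74: addi  s4,s4,-8
  0x000000400ac49f78: addw  a0,a0,zero
  0x000000400ac49f7c: sd    a0,0(s4)
  0x000000400ac49f80: sd    s6,-72(s0)
  0x000000400ac49f84: lhu   a4,1(s6)
"""

RV32 = """
  0x3ccfc874: addi  s4,s4,-4
  0x3ccfc878: add   a0,a0,zero
  0x3ccfc87c: sw    a0,0(s4)
  0x3ccfc880: addi  s4,s4,-4
  0x3ccfc884: sw    t0,0(s4)
  0x3ccfc888: addi  s4,s4,-4
  0x3ccfc88c: sw    a0,0(s4)
  0x3ccfc890: lui   a0,0x3fe21
  0x3ccfc894: addi  a0,a0,392 # 0x3fe21188
  0x3ccfc898: lui   t0,0x0
  0x3ccfc89c: addi  t0,t0,1 # 1
  0x3ccfc8a0: amoadd.w.aqrl zero,t0,(a0)
  0x3ccfc8a4: lw    a0,0(s4)
  0x3ccfc8a8: addi  s4,s4,4
  0x3ccfc8ac: lw    t0,0(s4)
  0x3ccfc8b0: addi  s4,s4,4
  0x3ccfc8b4: jal   ra,0x3ccdf998
  0x3ccfc8b8: sw    s6,-36(s0)
  0x3ccfc8bc: lhu   a4,1(s6)
"""

print(only_in_second(BISHENG, RV32))
```

On these excerpts the rv32 side carries an extra run between the tos-push prologue and the store of s6, including the atomic add and a jal. What that run is for is not established by this comparison.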

Next I plan to analyze and compare the bytecodes return and lconst_0.

DingliZhang commented 3 years ago

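[image] Setting a breakpoint with b bytecodeTracer.cpp:111 and debugging shows that the first and second bytecodes are Bytecodes::_invokestatic and Bytecodes::_lconst_0, whereas the results on other platforms and architectures show that the second bytecode should be return.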

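In the template interpreter, the machine code for every bytecode is generated by the function TemplateInterpreterGenerator::generate_and_dispatch in src/hotspot/share/interpreter/templateInterpreterGenerator.cpp. That function also implements the dispatch logic, that is, fetching the next bytecode: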

void TemplateInterpreterGenerator::generate_and_dispatch(Template* t, TosState tos_out) {
  if (PrintBytecodeHistogram)                                    histogram_bytecode(t);
  // ... part of the code omitted here
  // advance: this is where the next instruction is fetched
  if (t->does_dispatch()) {
#ifdef ASSERT
    // make sure execution doesn't go beyond this point if code is broken
    __ should_not_reach_here();
#endif // ASSERT
  } else {
    // dispatch to next bytecode
    __ dispatch_epilog(tos_out, step); // fetch the next bytecode
  }
}

dispatch_epilog() in turn calls dispatch_next(), located in src/hotspot/cpu/riscv32/interp_masm_riscv32.cpp:

void InterpreterMacroAssembler::dispatch_next(TosState state, int step, bool generate_poll) {
  // load next bytecode
  load_unsigned_byte(t0, Address(xbcp, step));
  add(xbcp, xbcp, step);
  dispatch_base(state, Interpreter::dispatch_table(state), true, generate_poll);
}
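The effect of the code emitted by dispatch_next() can be pictured as a loop that reads the opcode at bcp, then advances bcp by the length of the current bytecode. Below is a minimal sketch of that idea, not HotSpot code. The opcode values 184, 9 and 177 appear in this thread; the byte sequence for Object.<clinit> and its two operand bytes are assumptions for illustration.

```python
# name and total length (opcode plus operand bytes) of the opcodes used here
BYTECODES = {
    0xB8: ("invokestatic", 3),  # 184, two operand bytes
    0x09: ("lconst_0", 1),      # 9
    0xB1: ("return", 1),        # 177
}


def walk(code):
    """Mimic dispatch: report the opcode at bcp, then advance bcp by its length."""
    bcp = 0
    visited = []
    while bcp < len(code):
        name, step = BYTECODES[code[bcp]]
        visited.append((bcp, name))
        bcp += step
    return visited


# Assumed body of static void java.lang.Object.<clinit>():
# invokestatic <2 operand bytes>; return
clinit = bytes([0xB8, 0x09, 0x00, 0xB1])
print(walk(clinit))
```

With the correct step of 3 for invokestatic, the second visited bytecode sits at bci 3, matching the behaviour reported for bishengJDK and x86.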

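Combined with the previous comment, the current hypotheses are:
1. Something goes wrong when the code for the bytecodes is generated. The disassembly contains a large number of atomic instructions, which needs to be checked against the template table code (src/hotspot/cpu/riscv32/templateTable_riscv32.cpp).
2. Something goes wrong in the dispatch between bytecodes. This requires tracing the execution path of the execution engine and checking the fetch logic and the entry point of each bytecode.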

axiangyushanhaijing commented 3 years ago

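Following @DingliZhang's analysis ("Setting a breakpoint with b bytecodeTracer.cpp:111 and debugging shows that the first and second bytecodes are Bytecodes::_invokestatic and Bytecodes::_lconst_0, whereas the results on other platforms and architectures show that the second bytecode should be return."): in theory, should _lconst_0 then be called more often on rv32 than on Bisheng? I added logging in both and counted the calls; the counts are identical. [image] [image] It is not yet clear whether the call count of this instruction matters. More tracing is needed.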

DingliZhang commented 3 years ago

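> Following @DingliZhang's analysis ("Setting a breakpoint with b bytecodeTracer.cpp:111 and debugging shows that the first and second bytecodes are Bytecodes::_invokestatic and Bytecodes::_lconst_0, whereas the results on other platforms and architectures show that the second bytecode should be return."): in theory, should _lconst_0 then be called more often on rv32 than on Bisheng? I added logging in both and counted the calls; the counts are identical. [image] [image] It is not yet clear whether the call count of this instruction matters. More tracing is needed.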

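Hi @zhangxiang-plct , the TemplateTable::lconst here is called at https://github.com/openjdk-riscv/jdk11u/blob/bd072669c1e37ee9500c1729f8c8e2413d22f703/src/hotspot/share/interpreter/bytecodes.cpp#L268-L294 , that is, in the bytecode generator, and its job is to generate the assembly for that bytecode. Bytecodes::initialize is called only once, during JVM startup/initialization, and the lconst generator produces the assembly for two bytecodes, lconst_0 and lconst_1 (in the def entries of Bytecodes::initialize() the generator parameter is lconst for both, with arguments 0 and 1 respectively). So a call count of 2 on both rv32 and bishengJDK is expected.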
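The distinction between running a generator at startup and executing the code it produced can be illustrated with a toy model. This is a sketch under simplified assumptions: `lconst`, `TEMPLATES` and `run` are stand-ins invented here, and the pushed values 0 and 1 follow the bytecode names.

```python
generator_calls = {"lconst": 0}


def lconst(value):
    """Stand-in for a template generator: called at startup, returns the 'code'."""
    generator_calls["lconst"] += 1

    def generated(stack):
        stack.append(value)  # what the generated code does at run time

    return generated


# Built once during initialization, one entry per bytecode.
TEMPLATES = {"lconst_0": lconst(0), "lconst_1": lconst(1)}


def run(program):
    """Execute a sequence of bytecode names using the prebuilt templates."""
    stack = []
    for name in program:
        TEMPLATES[name](stack)
    return stack


result = run(["lconst_0"] * 5 + ["lconst_1"])
print(generator_calls["lconst"], result)
```

However many times lconst_0 executes, the generator count stays at 2, so counting generator calls says nothing about how often the bytecode runs.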

axiangyushanhaijing commented 3 years ago

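Set a breakpoint with b bytecodeTracer.cpp:111, then debugged at the assembly level: Bisheng JDK debugging log

rv32 debugging log

Comparing the assembly of the two, nothing differs before rv32 crashes. On rv32, continuing with c after "1575 IRT_END 1: x/i $pc" leads to the crash.

On Bisheng, continuing with c at this point enters the next call to trace and goes on to execute the next instruction: "1: x/i $pc => 0x4001ac7030 <BytecodePrinter::trace(methodHandle const&, unsigned char, unsigned long, unsigned long, outputStream)+406>: jal ra,0x4001ac2680 <BytecodeCounter::counter_value()> (gdb) p _code $3 = Bytecodes::_iconst_1"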


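Update, evening of 8.25: rv32, breaking in gdb at InterpreterRuntime::trace_bytecode (b interpreterRuntime.cpp:1569): [image]

Bisheng JDK: [image]

Comparing how the tos values change in the two runs: the source declares tos as intptr_t, an integer type, and the values look like addresses. The tos change for the first instruction executed on rv32 differs greatly from Bisheng. The role of tos still needs to be understood.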


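Second update, 8.25

Looking at the Bisheng JDK debugging log:

[image] From this we can infer:
Bytecodes::_invokestatic ----> bcp: 0x400cd50f98
Bytecodes::_return -----> bcp: 0x400cd50f9b
That is, bcp increased by 3 over this step; in other words, the bytecode pointer advanced by 3. Compare rv32:
Bytecodes::_invokestatic ----> bcp: 0x2c4de3f0
Bytecodes::_lconst_0 -----> bcp: 0x2c4de3f1
Here the bytecode pointer advanced by only 1. The guess is that the correct bcp offset is +3, and that something goes wrong on rv32 on the way to the second instruction, so bcp advances by 1 and is 2 short.

To verify this guess, add logging on rv32:

[image] Comparing the Bisheng and rv32 debugging logs confirms the guess: for the second instruction to be "return--177", bcp must advance by 3 during the second call (or rather, during instruction generation). Because it advances by only 1, the second instruction comes out wrong.

Guess: starting from the second instruction the bytecode offset is wrong. We need to find out how bcp is defined.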


Update, 8.26

Breakpoints:

b main
b bytecodeTracer.cpp:111
c (continue to the breakpoint)
b interp_masm_riscv32.cpp:555

Because a log statement printing the instruction length was added, there is some extra output: warning(" the code is %d length is %d",code,_lengths[code] & 0xF);

2141235] static void java.lang.Object.<clinit>()
[New Thread 1.2141321]

Thread 1 hit Breakpoint 2, BytecodePrinter::trace (this=0x3fe506f8 <std_closure>, method=..., bcp=0x2c4de3f0 "\270\t", tos=1057695360, tos2=0, st=0x3ed00628)
    at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/share/interpreter/bytecodeTracer.cpp:111
111         _code = code;
(gdb) b bytecodes.hpp:397
Breakpoint 3 at 0x3f1c86c4: file /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/share/interpreter/bytecodes.hpp, line 398.
(gdb) b interp_masm_riscv32.cpp:555
Breakpoint 4 at 0x3f51861c: file /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/cpu/riscv32/interp_masm_riscv32.cpp, line 557.
(gdb) c
Continuing.
[2141235]        1     0  invokestatic
Thread 1 hit Breakpoint 3, Bytecodes::length_for (code=Bytecodes::_invokestatic) at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/share/interpreter/bytecodes.hpp:398
warning: Source file is more recent than executable.
398         warning(" the code is %d length is %d",code,_lengths[code] & 0xF);
(gdb) n
OpenJDK Core VM warning: 
 the code is 184 length is 3
399          return is_valid(code) ? _lengths[code] & 0xF : -1; }
(gdb) c
Continuing.
 16 <java/lang/Object.registerNatives()V> 

Thread 1 hit Breakpoint 2, BytecodePrinter::trace (this=0x3fe506f8 <std_closure>, method=..., bcp=0x2c4de3f1 "\t", tos=743302128, tos2=0, st=0x3ed00628)
    at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/share/interpreter/bytecodeTracer.cpp:111
111         _code = code;
(gdb) c
Continuing.
[2141235]        2     1  lconst_0
Thread 1 hit Breakpoint 3, Bytecodes::length_for (code=Bytecodes::_lconst_0) at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/share/interpreter/bytecodes.hpp:398
398         warning(" the code is %d length is %d",code,_lengths[code] & 0xF);
(gdb) n
OpenJDK Core VM warning: 
 the code is 9 length is 1
399          return is_valid(code) ? _lengths[code] & 0xF : -1; }
(gdb) c
Continuing.

During the gdb session, around the trace() calls that accompany instruction execution, execution never reaches the dispatch_next() function.
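The bcp values and byte dumps in the gdb output above can be read off mechanically. The sketch below is only one reading of those numbers, not a confirmed root cause. It relies on gdb printing bcp as a C string (octal escapes, cut off at the first NUL byte); the Bisheng addresses are copied from the earlier comment, and the variable names are mine.

```python
OPCODES = {0xB8: "invokestatic", 0x09: "lconst_0", 0xB1: "return"}

# (bcp, bytes shown by gdb) at the two rv32 breakpoint hits; Python byte
# literals understand the same octal and \t escapes that gdb prints.
rv32_first = (0x2C4DE3F0, b"\270\t")
rv32_second = (0x2C4DE3F1, b"\t")

# bcp values reported for Bisheng (invokestatic, then return)
bisheng_first, bisheng_second = 0x400CD50F98, 0x400CD50F9B

rv32_delta = rv32_second[0] - rv32_first[0]
bisheng_delta = bisheng_second - bisheng_first

# Opcode the tracer saw on the second hit, and the byte that the first dump
# shows at that same offset.
second_opcode = rv32_second[1][0]
byte_after_invokestatic = rv32_first[1][rv32_delta]

print(rv32_delta, bisheng_delta)
print(OPCODES[rv32_first[1][0]], OPCODES[second_opcode])
print(second_opcode == byte_after_invokestatic)
```

If this reading holds, the lconst_0 printed on rv32 is the byte directly after the invokestatic opcode (value 9) being treated as an opcode, which fits bcp advancing by 1 where Bisheng advances by 3.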

DingliZhang commented 3 years ago

Analyzing the log printed by -XX:+TraceBytecodes:

[4360] static void java.lang.Object.<clinit>()
[4360]        1     0  invokestatic 16 <java/lang/Object.registerNatives()V> 
[4360]        2     1  lconst_0

In add(xbcp, xbcp, step), step is the stride of the bytecode instruction, determined by the format field of the table here: https://github.com/openjdk-riscv/jdk11u/blob/bd072669c1e37ee9500c1729f8c8e2413d22f703/src/hotspot/share/interpreter/bytecodes.cpp#L282-L538 The logic that computes step is implemented in the function TemplateInterpreterGenerator::generate_and_dispatch, which in practice uses Bytecodes::_lengths. Bytecodes::_lengths is computed in bytecodes.cpp here: https://github.com/openjdk-riscv/jdk11u/blob/bd072669c1e37ee9500c1729f8c8e2413d22f703/src/hotspot/share/interpreter/bytecodes.cpp#L157-L164 As can be seen, the length depends not only on format but also on the wide format.
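The way a single _lengths entry can carry both the normal and the wide length is sketched below in Python. This mirrors my recollection of upstream bytecodes.cpp (string lengths of format and wide format packed into the low and high nibble) and assumes the fork is unchanged there; the format strings in `FORMATS` are likewise quoted from memory and should be treated as assumptions. The low-nibble extraction matches the `_lengths[code] & 0xF` seen in the gdb log.

```python
def pack_length(fmt, wide_fmt=None):
    """Pack normal length (low nibble) and wide length (high nibble)."""
    length = len(fmt) if fmt is not None else 0
    wide_length = len(wide_fmt) if wide_fmt is not None else 0
    return (wide_length << 4) | (length & 0xF)


def length_for(packed):
    return packed & 0xF


def wide_length_for(packed):
    return packed >> 4


# (format, wide format): assumed values for a handful of bytecodes
FORMATS = {
    "nop": ("b", None),
    "lconst_0": ("b", None),
    "invokestatic": ("bJJ", None),
    "return": ("b", None),
    "iload": ("bi", "wbii"),
}

LENGTHS = {name: pack_length(*fmts) for name, fmts in FORMATS.items()}

for name, packed in LENGTHS.items():
    print(name, length_for(packed), wide_length_for(packed))
```

Under these assumptions invokestatic has length 3 and lconst_0 length 1, the same values the added warning printed on rv32, which suggests the table itself is intact.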

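For example, the format of the bytecode nop is b, which means that the bytecode, opcode and operands included, has a total length of only 1, that is, 1 byte. It is worth checking whether the step of the invokestatic bytecode in static void java.lang.Object.<clinit>(), the first method run during rv32 startup, is computed incorrectly.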

axiangyushanhaijing commented 3 years ago

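Following #71 , build hsdis-riscv32.so, or simply wget the prebuilt hsdis.so (needs unzipping): hsdis-riscv32.so.zip hsdis-riscv64.so.zip . Place it in the same directory as libjvm.so, for example jdk/lib/server. Adding -XX:+PrintInterpreter when running java -version then produces a disassembly that shows the machine code for each bytecode. rv32 disassembly: https://paste.ubuntu.com/p/zkfCHXZRPc/ bishengJDK disassembly: https://paste.ubuntu.com/p/ZBNQb3c5T8/

Comparing invokestatic:

BishengJDK: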

----------------------------------------------------------------------
invokestatic  184 invokestatic  [0x000000400ac49f40, 0x000000400ac4a240]  768 bytes

BFD: unrecognized disassembler option: 
  0x000000400ac49f40: addi    s4,s4,-8
  0x000000400ac49f44: sd  a0,0(s4)
  0x000000400ac49f48: j   0x000000400ac49f80
  0x000000400ac49f4c: addi    s4,s4,-8
  0x000000400ac49f50: fsw fa0,0(s4)
  0x000000400ac49f54: j   0x000000400ac49f80
  0x000000400ac49f58: addi    s4,s4,-16
  0x000000400ac49f5c: fsd fa0,0(s4)
  0x000000400ac49f60: j   0x000000400ac49f80
  0x000000400ac49f64: addi    s4,s4,-16
  0x000000400ac49f68: sd  zero,8(s4)
  0x000000400ac49f6c: sd  a0,0(s4)
  0x000000400ac49f70: j   0x000000400ac49f80
  0x000000400ac49f74: addi    s4,s4,-8
  0x000000400ac49f78: addw    a0,a0,zero
  0x000000400ac49f7c: sd  a0,0(s4)
  0x000000400ac49f80: sd  s6,-72(s0)
  0x000000400ac49f84: lhu a4,1(s6)
  0x000000400ac49f88: slli    t1,a4,0x5
  0x000000400ac49f8c: add t1,s10,t1
  0x000000400ac49f90: addi    s1,t1,40
  0x000000400ac49f94: mv  s1,s1
  0x000000400ac49f98: fence
  0x000000400ac49f9c: lwu s1,0(s1)
  0x000000400ac49fa0: fence   ir,iorw
  0x000000400ac49fa4: slli    s1,s1,0x28
  0x000000400ac49fa8: srli    s1,s1,0x38
  0x000000400ac49fac: addiw   t0,zero,184
  0x000000400ac49fb0: beq s1,t0,0x000000400ac4a108
  0x000000400ac49fb4: addiw   s1,zero,184
  0x000000400ac49fb8: mv  a1,s1
  0x000000400ac49fbc: sd  s6,-72(s0)
  0x000000400ac49fc0: ld  t0,-16(s0)
  0x000000400ac49fc4: beqz    t0,0x000000400ac4a088
  0x000000400ac49fc8: addi    sp,sp,-240
  0x000000400ac49fcc: sd  ra,0(sp)
  0x000000400ac49fd0: sd  gp,8(sp)
  0x000000400ac49fd4: sd  tp,16(sp)
  0x000000400ac49fd8: sd  t0,24(sp)
  0x000000400ac49fdc: sd  t1,32(sp)
  0x000000400ac49fe0: sd  t2,40(sp)
  0x000000400ac49fe4: sd  s0,48(sp)
  0x000000400ac49fe8: sd  s1,56(sp)
  0x000000400ac49fec: sd  a0,64(sp)
  0x000000400ac49ff0: sd  a1,72(sp)
  0x000000400ac49ff4: sd  a2,80(sp)
  0x000000400ac49ff8: sd  a3,88(sp)
  0x000000400ac49ffc: sd  a4,96(sp)
  0x000000400ac4a000: sd  a5,104(sp)
  0x000000400ac4a004: sd  a6,112(sp)
  0x000000400ac4a008: sd  a7,120(sp)
  0x000000400ac4a00c: sd  s2,128(sp)
  0x000000400ac4a010: sd  s3,136(sp)
  0x000000400ac4a014: sd  s4,144(sp)
  0x000000400ac4a018: sd  s5,152(sp)
  0x000000400ac4a01c: sd  s6,160(sp)
  0x000000400ac4a020: sd  s7,168(sp)
  0x000000400ac4a024: sd  s8,176(sp)
  0x000000400ac4a028: sd  s9,184(sp)
  0x000000400ac4a02c: sd  s10,192(sp)
  0x000000400ac4a030: sd  s11,200(sp)
  0x000000400ac4a034: sd  t3,208(sp)
  0x000000400ac4a038: sd  t4,216(sp)
  0x000000400ac4a03c: sd  t5,224(sp)
  0x000000400ac4a040: sd  t6,232(sp)
  0x000000400ac4a044: lui a0,0x2001
  0x000000400ac4a048: addiw   a0,a0,649
  0x000000400ac4a04c: slli    a0,a0,0xd
  0x000000400ac4a050: addi    a0,a0,784 # 0x0000000002001310
  0x000000400ac4a054: lui a1,0x2005
  0x000000400ac4a058: addiw   a1,a1,1573
  0x000000400ac4a05c: slli    a1,a1,0xd
  0x000000400ac4a060: addi    a1,a1,-56 # 0x0000000002004fc8
  0x000000400ac4a064: mv  a2,sp
 ;; 0x4002039fea
  0x000000400ac4a068: lui a3,0x400
  0x000000400ac4a06c: addi    a3,a3,515 # 0x0000000000400203
  0x000000400ac4a070: slli    a3,a3,0xb
  0x000000400ac4a074: addi    a3,a3,1279
  0x000000400ac4a078: slli    a3,a3,0x5
  0x000000400ac4a07c: addi    a3,a3,10
  0x000000400ac4a080: jalr    a3
  0x000000400ac4a084: ebreak
  0x000000400ac4a088: mv  a0,s7
  0x000000400ac4a08c: auipc   t0,0x0
  0x000000400ac4a090: addi    t0,t0,56 # 0x000000400ac4a0c4
  0x000000400ac4a094: sd  t0,896(s7)
  0x000000400ac4a098: sd  s4,888(s7)
  0x000000400ac4a09c: sd  s0,904(s7)
  0x000000400ac4a0a0: addi    sp,sp,-16
  0x000000400ac4a0a4: sd  t1,0(sp)
  0x000000400ac4a0a8: sd  t6,8(sp)
 ;; 0x4001e0d218
  0x000000400ac4a0ac: lui t0,0x400
  0x000000400ac4a0b0: addi    t0,t0,480 # 0x00000000004001e0
  0x000000400ac4a0b4: slli    t0,t0,0xb
  0x000000400ac4a0b8: addi    t0,t0,1680
  0x000000400ac4a0bc: slli    t0,t0,0x5
  0x000000400ac4a0c0: jalr    24(t0)
  0x000000400ac4a0c4: ld  t1,0(sp)
  0x000000400ac4a0c8: ld  t6,8(sp)
  0x000000400ac4a0cc: addi    sp,sp,16
  0x000000400ac4a0d0: fence.i
  0x000000400ac4a0d4: fence   ir,ir
  0x000000400ac4a0d8: sd  zero,888(s7)
  0x000000400ac4a0dc: sd  zero,904(s7)
  0x000000400ac4a0e0: sd  zero,896(s7)
  0x000000400ac4a0e4: ld  t0,8(s7)
  0x000000400ac4a0e8: beqz    t0,0x000000400ac4a0f4
  0x000000400ac4a0ec: auipc   t0,0xfffde
  0x000000400ac4a0f0: jr  724(t0) # 0x000000400ac283c0
  0x000000400ac4a0f4: ld  s6,-72(s0)
  0x000000400ac4a0f8: ld  s8,-64(s0)
  0x000000400ac4a0fc: lhu a4,1(s6)
  0x000000400ac4a100: slli    t1,a4,0x5
  0x000000400ac4a104: add t1,s10,t1
  0x000000400ac4a108: ld  t6,48(t1)
  0x000000400ac4a10c: lwu a3,64(t1)
  0x000000400ac4a110: slli    t1,a3,0x20
  0x000000400ac4a114: srli    t1,t1,0x3c
 ;; 0x4002827698
  0x000000400ac4a118: lui t0,0x400
  0x000000400ac4a11c: addi    t0,t0,642 # 0x0000000000400282
  0x000000400ac4a120: slli    t0,t0,0xb
  0x000000400ac4a124: addi    t0,t0,948
  0x000000400ac4a128: slli    t0,t0,0x5
  0x000000400ac4a12c: addi    t0,t0,24
  0x000000400ac4a130: slli    t1,t1,0x3
  0x000000400ac4a134: add t0,t0,t1
  0x000000400ac4a138: ld  ra,0(t0)
  0x000000400ac4a13c: mv  t5,sp
  0x000000400ac4a140: sd  s4,-16(s0)
  0x000000400ac4a144: ld  t0,88(t6)
  0x000000400ac4a148: jr  t0
  0x000000400ac4a14c: addi    sp,sp,-240
  0x000000400ac4a150: sd  ra,0(sp)
  0x000000400ac4a154: sd  gp,8(sp)
  0x000000400ac4a158: sd  tp,16(sp)
  0x000000400ac4a15c: sd  t0,24(sp)
  0x000000400ac4a160: sd  t1,32(sp)
  0x000000400ac4a164: sd  t2,40(sp)
  0x000000400ac4a168: sd  s0,48(sp)
  0x000000400ac4a16c: sd  s1,56(sp)
  0x000000400ac4a170: sd  a0,64(sp)
  0x000000400ac4a174: sd  a1,72(sp)
  0x000000400ac4a178: sd  a2,80(sp)
  0x000000400ac4a17c: sd  a3,88(sp)
  0x000000400ac4a180: sd  a4,96(sp)
  0x000000400ac4a184: sd  a5,104(sp)
  0x000000400ac4a188: sd  a6,112(sp)
  0x000000400ac4a18c: sd  a7,120(sp)
  0x000000400ac4a190: sd  s2,128(sp)
  0x000000400ac4a194: sd  s3,136(sp)
  0x000000400ac4a198: sd  s4,144(sp)
  0x000000400ac4a19c: sd  s5,152(sp)
  0x000000400ac4a1a0: sd  s6,160(sp)
  0x000000400ac4a1a4: sd  s7,168(sp)
  0x000000400ac4a1a8: sd  s8,176(sp)
  0x000000400ac4a1ac: sd  s9,184(sp)
  0x000000400ac4a1b0: sd  s10,192(sp)
  0x000000400ac4a1b4: sd  s11,200(sp)
  0x000000400ac4a1b8: sd  t3,208(sp)
  0x000000400ac4a1bc: sd  t4,216(sp)
  0x000000400ac4a1c0: sd  t5,224(sp)
  0x000000400ac4a1c4: sd  t6,232(sp)
  0x000000400ac4a1c8: lui a0,0x2001
  0x000000400ac4a1cc: addiw   a0,a0,649
  0x000000400ac4a1d0: slli    a0,a0,0xd
  0x000000400ac4a1d4: addi    a0,a0,-784 # 0x0000000002000cf0
  0x000000400ac4a1d8: lui a1,0x2005
  0x000000400ac4a1dc: addiw   a1,a1,1573
  0x000000400ac4a1e0: slli    a1,a1,0xd
  0x000000400ac4a1e4: addi    a1,a1,332 # 0x000000000200514c
  0x000000400ac4a1e8: mv  a2,sp
 ;; 0x4002039fea
  0x000000400ac4a1ec: lui a3,0x400
  0x000000400ac4a1f0: addi    a3,a3,515 # 0x0000000000400203
  0x000000400ac4a1f4: slli    a3,a3,0xb
  0x000000400ac4a1f8: addi    a3,a3,1279
  0x000000400ac4a1fc: slli    a3,a3,0x5
  0x000000400ac4a200: addi    a3,a3,10
  0x000000400ac4a204: jalr    a3
  0x000000400ac4a208: ebreak
  0x000000400ac4a20c: nop
  0x000000400ac4a210: sw  a1,28(s1)
  0x000000400ac4a212: sw  a1,28(s1)
  0x000000400ac4a214: sw  a1,28(s1)
  0x000000400ac4a216: sw  a1,28(s1)
  0x000000400ac4a218: sw  a1,28(s1)
  0x000000400ac4a21a: sw  a1,28(s1)
  0x000000400ac4a21c: sw  a1,28(s1)
  0x000000400ac4a21e: sw  a1,28(s1)
  0x000000400ac4a220: sw  a1,28(s1)
  0x000000400ac4a222: sw  a1,28(s1)
  0x000000400ac4a224: sw  a1,28(s1)
  0x000000400ac4a226: sw  a1,28(s1)
  0x000000400ac4a228: sw  a1,28(s1)
  0x000000400ac4a22a: sw  a1,28(s1)
  0x000000400ac4a22c: sw  a1,28(s1)
  0x000000400ac4a22e: sw  a1,28(s1)
  0x000000400ac4a230: sw  a1,28(s1)
  0x000000400ac4a232: sw  a1,28(s1)
  0x000000400ac4a234: sw  a1,28(s1)
  0x000000400ac4a236: sw  a1,28(s1)
  0x000000400ac4a238: sw  a1,28(s1)
  0x000000400ac4a23a: sw  a1,28(s1)
  0x000000400ac4a23c: sw  a1,28(s1)
  0x000000400ac4a23e: sw  a1,28(s1)

rv32:

----------------------------------------------------------------------
invokestatic  184 invokestatic  [0x3ccfc840, 0x3ccfcb00]  704 bytes

BFD: unrecognized disassembler option: 
  0x3ccfc840: addi    s4,s4,-4
  0x3ccfc844: sw  a0,0(s4)
  0x3ccfc848: j   0x3ccfc880
  0x3ccfc84c: addi    s4,s4,-4
  0x3ccfc850: fsw fa0,0(s4)
  0x3ccfc854: j   0x3ccfc880
  0x3ccfc858: addi    s4,s4,-8
  0x3ccfc85c: fsd fa0,0(s4)
  0x3ccfc860: j   0x3ccfc880
  0x3ccfc864: addi    s4,s4,-8
  0x3ccfc868: sw  zero,4(s4)
  0x3ccfc86c: sw  a0,0(s4)
  0x3ccfc870: j   0x3ccfc880
  0x3ccfc874: addi    s4,s4,-4
  0x3ccfc878: add a0,a0,zero
  0x3ccfc87c: sw  a0,0(s4)
  0x3ccfc880: addi    s4,s4,-4
  0x3ccfc884: sw  t0,0(s4)
  0x3ccfc888: addi    s4,s4,-4
  0x3ccfc88c: sw  a0,0(s4)
  0x3ccfc890: lui a0,0x3fe21
  0x3ccfc894: addi    a0,a0,392 # 0x3fe21188
  0x3ccfc898: lui t0,0x0
  0x3ccfc89c: addi    t0,t0,1 # 1
  0x3ccfc8a0: amoadd.w.aqrl   zero,t0,(a0)
  0x3ccfc8a4: lw  a0,0(s4)
  0x3ccfc8a8: addi    s4,s4,4
  0x3ccfc8ac: lw  t0,0(s4)
  0x3ccfc8b0: addi    s4,s4,4
  0x3ccfc8b4: jal ra,0x3ccdf998
  0x3ccfc8b8: sw  s6,-36(s0)
  0x3ccfc8bc: lhu a4,1(s6)
  0x3ccfc8c0: slli    t1,a4,0x5
  0x3ccfc8c4: add t1,s10,t1
  0x3ccfc8c8: addi    s1,t1,16
  0x3ccfc8cc: mv  s1,s1
  0x3ccfc8d0: fence
  0x3ccfc8d4: lw  s1,0(s1)
  0x3ccfc8d8: fence   ir,iorw
  0x3ccfc8dc: slli    s1,s1,0x8
  0x3ccfc8e0: srli    s1,s1,0x18
  0x3ccfc8e4: lui t0,0x0
  0x3ccfc8e8: addi    t0,t0,184 # 0x000000b8
  0x3ccfc8ec: beq s1,t0,0x3ccfca18
  0x3ccfc8f0: lui s1,0x0
  0x3ccfc8f4: addi    s1,s1,184 # 0x000000b8
  0x3ccfc8f8: mv  a1,s1
  0x3ccfc8fc: sw  s6,-36(s0)
  0x3ccfc900: lw  t0,-8(s0)
  0x3ccfc904: beqz    t0,0x3ccfc9a8
  0x3ccfc908: addi    sp,sp,-120
  0x3ccfc90c: sw  ra,0(sp)
  0x3ccfc910: sw  gp,4(sp)
  0x3ccfc914: sw  tp,8(sp)
  0x3ccfc918: sw  t0,12(sp)
  0x3ccfc91c: sw  t1,16(sp)
  0x3ccfc920: sw  t2,20(sp)
  0x3ccfc924: sw  s0,24(sp)
  0x3ccfc928: sw  s1,28(sp)
  0x3ccfc92c: sw  a0,32(sp)
  0x3ccfc930: sw  a1,36(sp)
  0x3ccfc934: sw  a2,40(sp)
  0x3ccfc938: sw  a3,44(sp)
  0x3ccfc93c: sw  a4,48(sp)
  0x3ccfc940: sw  a5,52(sp)
  0x3ccfc944: sw  a6,56(sp)
  0x3ccfc948: sw  a7,60(sp)
  0x3ccfc94c: sw  s2,64(sp)
  0x3ccfc950: sw  s3,68(sp)
  0x3ccfc954: sw  s4,72(sp)
  0x3ccfc958: sw  s5,76(sp)
  0x3ccfc95c: sw  s6,80(sp)
  0x3ccfc960: sw  s7,84(sp)
  0x3ccfc964: sw  s8,88(sp)
  0x3ccfc968: sw  s9,92(sp)
  0x3ccfc96c: sw  s10,96(sp)
  0x3ccfc970: sw  s11,100(sp)
  0x3ccfc974: sw  t3,104(sp)
  0x3ccfc978: sw  t4,108(sp)
  0x3ccfc97c: sw  t5,112(sp)
  0x3ccfc980: sw  t6,116(sp)
  0x3ccfc984: lui a0,0x3fbed
  0x3ccfc988: addi    a0,a0,-1072 # 0x3fbecbd0
  0x3ccfc98c: lui a1,0x3ccfd
  0x3ccfc990: addi    a1,a1,-1784 # 0x3ccfc908
  0x3ccfc994: mv  a2,sp
  0x3ccfc998: lui a3,0x3f765
  0x3ccfc99c: addi    a3,a3,1458 # 0x3f7655b2 = MacroAssembler::debug32(char*, int, int*)
  0x3ccfc9a0: jalr    a3
  0x3ccfc9a4: ebreak
  0x3ccfc9a8: mv  a0,s7
  0x3ccfc9ac: auipc   t0,0x0
  0x3ccfc9b0: addi    t0,t0,40 # 0x3ccfc9d4
  0x3ccfc9b4: sw  t0,628(s7)
  0x3ccfc9b8: sw  s4,624(s7)
  0x3ccfc9bc: sw  s0,632(s7)
  0x3ccfc9c0: addi    sp,sp,-8
  0x3ccfc9c4: sw  t1,0(sp)
  0x3ccfc9c8: sw  t6,4(sp)
  0x3ccfc9cc: lui t0,0x3f546
  0x3ccfc9d0: jalr    -1136(t0) # 0x3f545b90 = InterpreterRuntime::resolve_from_cache(JavaThread*, Bytecodes::Code)
  0x3ccfc9d4: lw  t1,0(sp)
  0x3ccfc9d8: lw  t6,4(sp)
  0x3ccfc9dc: addi    sp,sp,8
  0x3ccfc9e0: fence.i
  0x3ccfc9e4: fence   ir,ir
  0x3ccfc9e8: sw  zero,624(s7)
  0x3ccfc9ec: sw  zero,632(s7)
  0x3ccfc9f0: sw  zero,628(s7)
  0x3ccfc9f4: lw  t0,4(s7)
  0x3ccfc9f8: beqz    t0,0x3ccfca04
  0x3ccfc9fc: auipc   t0,0xfffdd
  0x3ccfca00: jr  -1788(t0) # 0x3ccd9300
  0x3ccfca04: lw  s6,-36(s0)
  0x3ccfca08: lw  s8,-32(s0)
  0x3ccfca0c: lhu a4,1(s6)
  0x3ccfca10: slli    t1,a4,0x5
  0x3ccfca14: add t1,s10,t1
  0x3ccfca18: lw  t6,20(t1)
  0x3ccfca1c: lw  a3,28(t1)
  0x3ccfca20: slli    t1,a3,0x0
  0x3ccfca24: srli    t1,t1,0x1c
  0x3ccfca28: lui t0,0x3fe92
  0x3ccfca2c: addi    t0,t0,-1100 # 0x3fe91bb4
  0x3ccfca30: slli    t1,t1,0x3
  0x3ccfca34: add t0,t0,t1
  0x3ccfca38: lw  ra,0(t0)
  0x3ccfca3c: mv  t5,sp
  0x3ccfca40: sw  s4,-8(s0)
  0x3ccfca44: lw  t0,52(t6)
  0x3ccfca48: jr  t0
  0x3ccfca4c: addi    sp,sp,-120
  0x3ccfca50: sw  ra,0(sp)
  0x3ccfca54: sw  gp,4(sp)
  0x3ccfca58: sw  tp,8(sp)
  0x3ccfca5c: sw  t0,12(sp)
  0x3ccfca60: sw  t1,16(sp)
  0x3ccfca64: sw  t2,20(sp)
  0x3ccfca68: sw  s0,24(sp)
  0x3ccfca6c: sw  s1,28(sp)
  0x3ccfca70: sw  a0,32(sp)
  0x3ccfca74: sw  a1,36(sp)
  0x3ccfca78: sw  a2,40(sp)
  0x3ccfca7c: sw  a3,44(sp)
  0x3ccfca80: sw  a4,48(sp)
  0x3ccfca84: sw  a5,52(sp)
  0x3ccfca88: sw  a6,56(sp)
  0x3ccfca8c: sw  a7,60(sp)
  0x3ccfca90: sw  s2,64(sp)
  0x3ccfca94: sw  s3,68(sp)
  0x3ccfca98: sw  s4,72(sp)
  0x3ccfca9c: sw  s5,76(sp)
  0x3ccfcaa0: sw  s6,80(sp)
  0x3ccfcaa4: sw  s7,84(sp)
  0x3ccfcaa8: sw  s8,88(sp)
  0x3ccfcaac: sw  s9,92(sp)
  0x3ccfcab0: sw  s10,96(sp)
  0x3ccfcab4: sw  s11,100(sp)
  0x3ccfcab8: sw  t3,104(sp)
  0x3ccfcabc: sw  t4,108(sp)
  0x3ccfcac0: sw  t5,112(sp)
  0x3ccfcac4: sw  t6,116(sp)
  0x3ccfcac8: lui a0,0x3fbec
  0x3ccfcacc: addi    a0,a0,1524 # 0x3fbec5f4
  0x3ccfcad0: lui a1,0x3ccfd
  0x3ccfcad4: addi    a1,a1,-1460 # 0x3ccfca4c
  0x3ccfcad8: mv  a2,sp
  0x3ccfcadc: lui a3,0x3f765
  0x3ccfcae0: addi    a3,a3,1458 # 0x3f7655b2 = MacroAssembler::debug32(char*, int, int*)
  0x3ccfcae4: jalr    a3
  0x3ccfcae8: ebreak
  0x3ccfcaec: sw  a1,28(s1)
  0x3ccfcaee: sw  a1,28(s1)
  0x3ccfcaf0: sw  a1,28(s1)
  0x3ccfcaf2: sw  a1,28(s1)
  0x3ccfcaf4: sw  a1,28(s1)
  0x3ccfcaf6: sw  a1,28(s1)
  0x3ccfcaf8: sw  a1,28(s1)
  0x3ccfcafa: sw  a1,28(s1)
  0x3ccfcafc: sw  a1,28(s1)
  0x3ccfcafe: sw  a1,28(s1)

Next, I plan to analyze and compare the return and lconst_0 bytecodes.

I compared the two instruction sequences. The main difference is that rv32 has this extra block of instructions: image Part of it involves atomic instructions.

DingliZhang commented 3 years ago

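https://github.com/openjdk-riscv/jdk11u/issues/161#issuecomment-902401817 When I previously used hsdis to disassemble the generated bytecode templates for comparison with BishengJDK, I did not control the variables properly: I additionally passed -XX:+TraceBytecodes when generating the rv32 output. With that flag, trace_bytecode(t); in TemplateInterpreterGenerator::generate_and_dispatch(Template* t, TosState tos_out) emits extra instructions, which is why atomic instructions showed up in the rv32 disassembly. The command used this time: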

qemu32 /home/dingli/isrc-jdk11u/jdk11u/build/linux-riscv32-normal-core-slowdebug/jdk/bin/java -XX:+PrintInterpreter  -version >  PrintInterpreter.log 2>&1

rv32 disassembly: https://paste.ubuntu.com/p/z9W7CwQ6j3/ bishengJDK disassembly: https://paste.ubuntu.com/p/ZBNQb3c5T8/
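When comparing interpreter templates produced by two runs, a quick way to check that only the intended variable changed is to drop the addresses and diff the instruction text. The following is a minimal sketch, not the procedure used in this thread: the helper names normalize and extra_instructions are made up for illustration, and the sample lines are copied from the two rv32 invokestatic listings in this issue (one generated with -XX:+TraceBytecodes, one without).

```python
import difflib
import re

# Matches lines such as "  0x3ccfc840: addi    s4,s4,-4" and captures the instruction text.
LINE_RE = re.compile(r"^\s*0x[0-9a-fA-F]+:\s+(.*?)\s*$")

def normalize(listing):
    """Return instruction texts without addresses or trailing '# ...' annotations."""
    out = []
    for line in listing.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip headers, blank lines and 'BFD:' noise
        text = m.group(1).split("#", 1)[0]
        out.append(" ".join(text.split()))  # collapse runs of whitespace
    return out

def extra_instructions(with_flag, without_flag):
    """Instructions that appear only in the first listing, in order."""
    a, b = normalize(without_flag), normalize(with_flag)
    return [d[2:] for d in difflib.ndiff(a, b) if d.startswith("+ ")]

WITH_TRACE = """
  0x3ccfc87c: sw  a0,0(s4)
  0x3ccfc880: addi    s4,s4,-4
  0x3ccfc884: sw  t0,0(s4)
  0x3ccfc888: addi    s4,s4,-4
  0x3ccfc88c: sw  a0,0(s4)
  0x3ccfc890: lui a0,0x3fe21
  0x3ccfc894: addi    a0,a0,392 # 0x3fe21188
  0x3ccfc898: lui t0,0x0
  0x3ccfc89c: addi    t0,t0,1 # 1
  0x3ccfc8a0: amoadd.w.aqrl   zero,t0,(a0)
  0x3ccfc8a4: lw  a0,0(s4)
  0x3ccfc8a8: addi    s4,s4,4
  0x3ccfc8ac: lw  t0,0(s4)
  0x3ccfc8b0: addi    s4,s4,4
  0x3ccfc8b4: jal ra,0x3ccdf998
  0x3ccfc8b8: sw  s6,-36(s0)
  0x3ccfc8bc: lhu a4,1(s6)
"""

WITHOUT_TRACE = """
  0x3ccf087c: sw    a0,0(s4)
  0x3ccf0880: sw    s6,-36(s0)
  0x3ccf0884: lhu   a4,1(s6)
"""

extras = extra_instructions(WITH_TRACE, WITHOUT_TRACE)
for insn in extras:
    print(insn)
```

On these samples the extra lines include the amoadd.w.aqrl and a jal, that is, the block that is present only when -XX:+TraceBytecodes is passed.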

axiangyushanhaijing commented 3 years ago

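I compared the two disassemblies. The main differences are: image image In summary, the bisheng disassembly always has an extra slli step after each lui. Apart from that, there is no meaningful difference between the two, and for now it is not clear whether this slli is what causes the difference in behavior.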
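One way to see where the extra slli comes from is to evaluate the immediate-loading sequences from the two dumps by hand. lui fills only bits 31:12 (sign-extended on RV64) and addi adds a sign-extended 12-bit immediate, so a lui/addi pair can build at most a 32-bit value. The RV64 listing targets addresses such as 0x4002039fea, which do not fit in 32 bits, so its sequence interleaves shifts. The sketch below is a simplified evaluator written for this note; eval_seq and its tuple encoding are illustrative and not part of HotSpot, while the operands are copied from the listings above.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def eval_seq(seq, xlen):
    """Evaluate lui/addi/slli steps on a single register of width xlen (32 or 64)."""
    mask = (1 << xlen) - 1
    reg = 0
    for op, imm in seq:
        if op == "lui":
            # 20-bit immediate goes to bits 31:12, then is sign-extended to xlen
            reg = sign_extend(imm << 12, 32) & mask
        elif op == "addi":
            reg = (reg + sign_extend(imm, 12)) & mask
        elif op == "slli":
            reg = (reg << imm) & mask
        else:
            raise ValueError(f"unsupported op: {op}")
    return reg

# RV64 listing, annotated ";; 0x4002039fea"
RV64_CALL_TARGET = [("lui", 0x400), ("addi", 515), ("slli", 0xB),
                    ("addi", 1279), ("slli", 0x5), ("addi", 10)]
# RV32 listing, annotated "# 0x3f7655b2 = MacroAssembler::debug32(...)"
RV32_DEBUG32 = [("lui", 0x3F765), ("addi", 1458)]
# RV32 listing with a negative addi, annotated "# 0x3fbecbd0"
RV32_NEG_ADDI = [("lui", 0x3FBED), ("addi", -1072)]

rv64_value = eval_seq(RV64_CALL_TARGET, 64)
rv32_value = eval_seq(RV32_DEBUG32, 32)
rv32_neg_value = eval_seq(RV32_NEG_ADDI, 32)
print(hex(rv64_value), hex(rv32_value), hex(rv32_neg_value))
```

Both sequences reproduce the addresses shown in the disassembly annotations, which suggests the extra slli steps are simply how the wider RV64 address is assembled rather than a behavioral difference in themselves.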

DingliZhang commented 3 years ago

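I wrote up an analysis of the disassembled JVM bytecode templates on RV32: https://zhuanlan.zhihu.com/p/407297763 It also explains the problem mentioned in the August 26 update of https://github.com/openjdk-riscv/jdk11u/issues/161#issuecomment-904569837, namely that the dispatch_next() function could not be reached.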

axiangyushanhaijing commented 3 years ago

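Adding the detailed debugging output for invokestatic. It confirms the explanation in @DingliZhang's Zhihu article: for invokestatic, t->does_dispatch() is true, so generate_and_dispatch does not emit dispatch code for it.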


(gdb) 
Continuing.

Thread 1 hit Breakpoint 2, TemplateInterpreterGenerator::generate_and_dispatch (this=0x3f0b2b20, t=0x3fe8975c <TemplateTable::_template_table+3680>, tos_out=ilgl)
    at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/share/interpreter/templateInterpreterGenerator.cpp:369
369       if (PrintBytecodeHistogram)                                    histogram_bytecode(t);
1: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
2: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
3: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
4: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
(gdb) si
0x3f94e458      369       if (PrintBytecodeHistogram)                                    histogram_bytecode(t);
1: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
2: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
3: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
4: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
(gdb) n
372       if (CountBytecodes || TraceBytecodes || StopInterpreterAt > 0) count_bytecode();
1: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
2: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
3: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
4: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
(gdb) 
373       if (PrintBytecodePairHistogram)                                histogram_bytecode_pair(t);
1: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
2: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
3: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
4: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
(gdb) n
374       if (TraceBytecodes)                                            trace_bytecode(t);
1: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
2: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
3: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
4: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
(gdb) p *t
$1 = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
(gdb) n
375       if (StopInterpreterAt > 0)                                     stop_interpreter_at();
1: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
2: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
3: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
4: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
(gdb) 
376       __ verify_FPU(1, t->tos_in());
1: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
2: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
3: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
4: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
(gdb) 
378       int step = 0;
1: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
2: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
3: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
4: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
(gdb) 
379       if (!t->does_dispatch()) {
1: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
2: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
3: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
4: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
(gdb) 
392       t->generate(_masm);
1: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
2: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
3: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
4: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
(gdb) p t->does_dispatch()
$2 = true
(gdb) s
Template::generate (this=0x3fe8975c <TemplateTable::_template_table+3680>, masm=0x3ed1a1d8) at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/share/interpreter/templateTable.cpp:60
60        TemplateTable::_desc = this;
(gdb) p masm
$3 = (InterpreterMacroAssembler *) 0x3ed1a1d8
(gdb) p *masm
$4 = {<MacroAssembler> = {<Assembler> = {<AbstractAssembler> = {<ResourceObj> = {<AllocatedObj> = {_vptr.AllocatedObj = 0x3fdef298 <vtable for InterpreterMacroAssembler+8>}, _allocation_t = {3241041446, 0}}, 
        _code_section = 0x3f0b2838, _oop_recorder = 0x3f0b28a4, _short_branch_delta = 0}, static branch_range = 1048576}, static zero_words_block_size = 8}, <No data fields>}
(gdb) n
61        TemplateTable::_masm = masm;
(gdb) 
63        _gen(_arg);
(gdb) s
TemplateTable::invokestatic (byte_no=1) at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/cpu/riscv32/templateTable_riscv32.cpp:3428
3428      transition(vtos, vtos);
(gdb) n
3429      assert(byte_no == f1_byte, "use this arugment");
(gdb) 
3431      prepare_invoke(byte_no, xmethod);  // get f1 Method*
(gdb) n
3433      __ profile_call(x10);
(gdb) s
InterpreterMacroAssembler::profile_call (this=0x3ed1a1d8, mdp=0xa) at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/cpu/riscv32/interp_masm_riscv32.cpp:1131
1131      if (ProfileInterpreter) {
(gdb) n
1144    }
(gdb) 
TemplateTable::invokestatic (byte_no=1) at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/cpu/riscv32/templateTable_riscv32.cpp:3434
3434      __ profile_arguments_type(x10, xmethod, x14, false);
(gdb) s
InterpreterMacroAssembler::profile_arguments_type (this=0x3ed1a1d8, mdp=0xa, callee=0x1f, tmp=0xe, is_virtual=false) at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/cpu/riscv32/interp_masm_riscv32.cpp:1687
1687      if (!ProfileInterpreter) {
(gdb) n
1688        return;
(gdb) 
1813    }
(gdb) 
TemplateTable::invokestatic (byte_no=1) at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/cpu/riscv32/templateTable_riscv32.cpp:3435
3435      __ jump_from_interpreted(xmethod);
(gdb) s
InterpreterMacroAssembler::jump_from_interpreted (this=0x3ed1a1d8, method=0x1f) at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/cpu/riscv32/interp_masm_riscv32.cpp:470
470       prepare_to_jump_from_interpreted();
(gdb) n
471       if (JvmtiExport::can_post_interpreter_events()) {
(gdb) 
483       lw(t0, Address(method, Method::from_interpreted_offset()));
(gdb) 
484       jr(t0);
(gdb) 
485     }
(gdb) 
TemplateTable::invokestatic (byte_no=1) at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/cpu/riscv32/templateTable_riscv32.cpp:3436
3436    }
(gdb) 
Template::generate (this=0x3fe8975c <TemplateTable::_template_table+3680>, masm=0x3ed1a1d8) at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/share/interpreter/templateTable.cpp:64
64        masm->flush();
(gdb) s
AbstractAssembler::flush (this=0x3ed1a1d8) at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/share/asm/assembler.cpp:108
108       ICache::invalidate_range(addr_at(0), offset());
(gdb) s
AbstractAssembler::addr_at (this=0x3ed1a1d8, pos=0) at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/share/asm/assembler.hpp:214
214       address addr_at(int pos) const { return code_section()->start() + pos; }
(gdb) p pos
$5 = 0
(gdb) p code_section()
$6 = (CodeSection *) 0x3f0b2838
(gdb) n
AbstractAssembler::flush (this=0x3ed1a1d8) at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/share/asm/assembler.cpp:109
109     }
(gdb) 
Template::generate (this=0x3fe8975c <TemplateTable::_template_table+3680>, masm=0x3ed1a1d8) at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/share/interpreter/templateTable.cpp:65
65      }
(gdb) 
TemplateInterpreterGenerator::generate_and_dispatch (this=0x3f0b2b20, t=0x3fe8975c <TemplateTable::_template_table+3680>, tos_out=ilgl)
    at /home/zhangxiang/jdk/rv32gdev/zx-jdk11u/src/hotspot/share/interpreter/templateInterpreterGenerator.cpp:394
394       if (t->does_dispatch()) {
1: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
2: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
3: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
4: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
(gdb) n
397         __ should_not_reach_here();
1: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
2: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
3: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
4: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
(gdb) n
403     }
1: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
2: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
3: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
4: *t = {_flags = 7, _tos_in = vtos, _tos_out = vtos, _gen = 0x3f96dfa8 <TemplateTable::invokestatic(int)>, _arg = 1}
(gdb) 
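The gdb output shows _flags = 7 for the invokestatic template. As a reading aid, the sketch below decodes that value, assuming the bit order of the Flags enum in upstream HotSpot's templateTable.hpp (uses_bcp_bit = 0, does_dispatch_bit = 1, calls_vm_bit = 2, wide_bit = 3). That layout is an assumption here and should be checked against the tree being debugged; decode_template_flags is a hypothetical helper.

```python
# Assumed bit positions, following the Flags enum in upstream HotSpot's templateTable.hpp.
FLAG_BITS = {
    "uses_bcp": 0,
    "does_dispatch": 1,
    "calls_vm": 2,
    "wide": 3,
}

def decode_template_flags(flags):
    """Map a Template::_flags value to a dict of booleans, one per assumed bit."""
    return {name: bool(flags & (1 << bit)) for name, bit in FLAG_BITS.items()}

decoded = decode_template_flags(7)
print(decoded)
```

Under that assumption, 7 means the template uses the bcp, dispatches on its own, and calls into the VM, which agrees with p t->does_dispatch() printing true in the session above.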
zifeihan commented 3 years ago

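@DingliZhang @zhangxiang-plct Looking at the implementation of the invokestatic instruction, I found that it calls TemplateTable::prepare_invoke. That function calls TemplateInterpreterGenerator::generate_return_entry_for to obtain the return_entry address and pushes it. When the called method returns, the return_entry address is popped, control enters that entry, and the dispatch jump to the next bytecode is performed there.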

zifeihan commented 3 years ago

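Today I narrowed down the scope of the problem. When printing the executed bytecodes we saw that invokestatic was executed, but the instruction may not have actually made the call (it may have run only halfway, without reaching the real invokestatic Method registerNatives:()V). How can we confirm this? The target is a C++ native method, so if it were called we would be able to debug into it. With a breakpoint set on bisheng, execution does enter the jni_RegisterNatives method in jni.cpp, but on the 32-bit jdk11u build it never enters this method.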

zifeihan commented 3 years ago

It can be reproduced as follows: run java -XX:+TraceBytecodes -version with a breakpoint on the jni_RegisterNatives function in jni.cpp. Normally, after the output line [13567] 1 0 invokestatic 17 <java/lang/Object.registerNatives()V>, execution should hit the jni_RegisterNatives breakpoint.

On bisheng-jdk11, start and debug with the following command (adjust the tool paths as needed):

/usr/local/plct/qemu/bin/qemu-riscv64 -L /usr/local/plct/riscv/sysroot -g 33334 ./java -XX:+TraceBytecodes -version

image

On jdk11u-32g, start and debug with the following command (adjust the tool paths as needed):

/usr/local/plct/qemu/bin/qemu-riscv32 -L /usr/local/plct/riscv32/sysroot -g 33334 ./java -XX:+TraceBytecodes  -version

Problem: on jdk11u-32g, execution never enters the registerNatives method.

zifeihan commented 3 years ago

Approach: similar to https://github.com/openjdk-riscv/jdk11u/issues/161#issuecomment-927888113. From the implementation of invokestatic we can see that it calls prepare_invoke to generate part of its machine code, and prepare_invoke always ends up calling resolve_cache_and_index. That function emits the machine code that calls InterpreterRuntime::resolve_from_cache, which resolves the constant pool entry based on the operand of the invokestatic instruction. Debugging so far shows that RV32 never enters InterpreterRuntime::resolve_from_cache, so we can further conclude that something goes wrong before the call to InterpreterRuntime::resolve_from_cache.

RV64: after logging [99827] 1 0 invokestatic 16 <java/lang/Object.registerNatives()V>, RV64 calls InterpreterRuntime::resolve_from_cache, as expected. image

RV32: after logging [25169] 1 0 invokestatic 16 <java/lang/Object.registerNatives()V>, RV32 does not call InterpreterRuntime::resolve_from_cache, which is not expected.

axiangyushanhaijing commented 3 years ago

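See also a debugging record about resolve_ldc, where the error message led us to that instruction as the one that fails. The findings of @zifeihan and @DingliZhang point to very similar locations, only reached by different methods. This raises a question: which instruction's generated code goes wrong first?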

axiangyushanhaijing commented 3 years ago

Following @zifeihan's approach, I summarized the current code generation path for the invokestatic instruction: TemplateTable::invokestatic->prepare_invoke->TemplateTable::load_invoke_cp_cache_entry->TemplateTable::resolve_cache_and_index. At the end of this path, resolve_cache_and_index contains address entry = CAST_FROM_FN_PTR(address, InterpreterRuntime::resolve_from_cache), which takes the address of the resolve_from_cache function. Comparing rv32 with bisheng JDK, rv32 indeed cannot enter IRT_ENTRY(void, InterpreterRuntime::resolve_from_cache(JavaThread* thread, Bytecodes::Code bytecode)) (src/hotspot/share/interpreter/interpreterRuntime.cpp). Following @zifeihan's hypothesis, I went through the variable relationships along this path and manually reassigned the variables according to a 1:2 correspondence.

image

After re-examining the whole debugging path, there was no obvious change.


Update, September 30: while debugging with a gdb breakpoint: Thread 1 hit Breakpoint 3, TemplateTable::invokestatic (byte_no=1) at /jdk11u/src/hotspot/cpu/riscv32/templateTable_riscv32.cpp:3435, then stepping into prepare_invoke(byte_no, xmethod)

After stepping in, the variables clearly differ between bisheng JDK and rv32: image

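This comparison comes from several debugging runs. The variable values on bisheng JDK are stable every time, whereas on rv32 the value of code is random and the other variables differ as well. This raises a question: why does code differ like this? Was the bytecode already generated incorrectly at an earlier point?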

zifeihan commented 3 years ago

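The code below is the part to focus on; we need to understand what this assembly does. According to the annotations, it emits a call to InterpreterRuntime::resolve_from_cache, but execution never reaches it, so something must go wrong at some step before the call to InterpreterRuntime::resolve_from_cache. I will keep debugging.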

----------------------------------------------------------------------
invokestatic  184 invokestatic  [0x3ccf0840, 0x3ccf0ac0]  640 bytes

BFD: unrecognized disassembler option: 
  0x3ccf0840: addi  s4,s4,-4
  0x3ccf0844: sw    a0,0(s4)
  0x3ccf0848: j 0x3ccf0880
  0x3ccf084c: addi  s4,s4,-4
  0x3ccf0850: fsw   fa0,0(s4)
  0x3ccf0854: j 0x3ccf0880
  0x3ccf0858: addi  s4,s4,-8
  0x3ccf085c: fsd   fa0,0(s4)
  0x3ccf0860: j 0x3ccf0880
  0x3ccf0864: addi  s4,s4,-8
  0x3ccf0868: sw    zero,4(s4)
  0x3ccf086c: sw    a0,0(s4)
  0x3ccf0870: j 0x3ccf0880
  0x3ccf0874: addi  s4,s4,-4
  0x3ccf0878: add   a0,a0,zero
  0x3ccf087c: sw    a0,0(s4)

  0x3ccf0880: sw    s6,-36(s0)
  0x3ccf0884: lhu   a4,1(s6)
  0x3ccf0888: slli  t1,a4,0x5
  0x3ccf088c: add   t1,s10,t1
  0x3ccf0890: addi  s1,t1,16
  0x3ccf0894: mv    s1,s1
  0x3ccf0898: fence
  0x3ccf089c: lw    s1,0(s1)
  0x3ccf08a0: fence ir,iorw
  0x3ccf08a4: slli  s1,s1,0x8
  0x3ccf08a8: srli  s1,s1,0x18
  0x3ccf08ac: lui   t0,0x0
  0x3ccf08b0: addi  t0,t0,184 # 0x000000b8
  0x3ccf08b4: beq   s1,t0,0x3ccf09e0
  0x3ccf08b8: lui   s1,0x0
  0x3ccf08bc: addi  s1,s1,184 # 0x000000b8
  0x3ccf08c0: mv    a1,s1
  0x3ccf08c4: sw    s6,-36(s0)
  0x3ccf08c8: lw    t0,-8(s0)
  0x3ccf08cc: beqz  t0,0x3ccf0970
  0x3ccf08d0: addi  sp,sp,-120
  0x3ccf08d4: sw    ra,0(sp)
  0x3ccf08d8: sw    gp,4(sp)
  0x3ccf08dc: sw    tp,8(sp)
  0x3ccf08e0: sw    t0,12(sp)
  0x3ccf08e4: sw    t1,16(sp)
  0x3ccf08e8: sw    t2,20(sp)
  0x3ccf08ec: sw    s0,24(sp)
  0x3ccf08f0: sw    s1,28(sp)
  0x3ccf08f4: sw    a0,32(sp)
  0x3ccf08f8: sw    a1,36(sp)
  0x3ccf08fc: sw    a2,40(sp)
  0x3ccf0900: sw    a3,44(sp)
  0x3ccf0904: sw    a4,48(sp)
  0x3ccf0908: sw    a5,52(sp)
  0x3ccf090c: sw    a6,56(sp)
  0x3ccf0910: sw    a7,60(sp)
  0x3ccf0914: sw    s2,64(sp)
  0x3ccf0918: sw    s3,68(sp)
  0x3ccf091c: sw    s4,72(sp)
  0x3ccf0920: sw    s5,76(sp)
  0x3ccf0924: sw    s6,80(sp)
  0x3ccf0928: sw    s7,84(sp)
  0x3ccf092c: sw    s8,88(sp)
  0x3ccf0930: sw    s9,92(sp)
  0x3ccf0934: sw    s10,96(sp)
  0x3ccf0938: sw    s11,100(sp)
  0x3ccf093c: sw    t3,104(sp)
  0x3ccf0940: sw    t4,108(sp)
  0x3ccf0944: sw    t5,112(sp)
  0x3ccf0948: sw    t6,116(sp)
  0x3ccf094c: lui   a0,0x3fbeb
  0x3ccf0950: addi  a0,a0,-1040 # 0x3fbeabf0
  0x3ccf0954: lui   a1,0x3ccf1
  0x3ccf0958: addi  a1,a1,-1840 # 0x3ccf08d0
  0x3ccf095c: mv    a2,sp
  0x3ccf0960: lui   a3,0x3f763
  0x3ccf0964: addi  a3,a3,1486 # 0x3f7635ce = MacroAssembler::debug32(char*, int, int*)
  0x3ccf0968: jalr  a3
  0x3ccf096c: ebreak
  0x3ccf0970: mv    a0,s7
  0x3ccf0974: auipc t0,0x0
  0x3ccf0978: addi  t0,t0,40 # 0x3ccf099c
  0x3ccf097c: sw    t0,628(s7)
  0x3ccf0980: sw    s4,624(s7)
  0x3ccf0984: sw    s0,632(s7)
  0x3ccf0988: addi  sp,sp,-8
  0x3ccf098c: sw    t1,0(sp)
  0x3ccf0990: sw    t6,4(sp)
  0x3ccf0994: lui   t0,0x3f544
  0x3ccf0998: jalr  -1108(t0) # 0x3f543bac = InterpreterRuntime::resolve_from_cache(JavaThread*, Bytecodes::Code)
  0x3ccf099c: lw    t1,0(sp)
  0x3ccf09a0: lw    t6,4(sp)
  0x3ccf09a4: addi  sp,sp,8
  0x3ccf09a8: fence.i
  0x3ccf09ac: fence ir,ir
  0x3ccf09b0: sw    zero,624(s7)
  0x3ccf09b4: sw    zero,632(s7)
  0x3ccf09b8: sw    zero,628(s7)
  0x3ccf09bc: lw    t0,4(s7)
  0x3ccf09c0: beqz  t0,0x3ccf09cc
  0x3ccf09c4: auipc t0,0xfffe1
  0x3ccf09c8: jr    -1732(t0) # 0x3ccd1300
  0x3ccf09cc: lw    s6,-36(s0)
  0x3ccf09d0: lw    s8,-32(s0)
  0x3ccf09d4: lhu   a4,1(s6)
  0x3ccf09d8: slli  t1,a4,0x5
  0x3ccf09dc: add   t1,s10,t1
  0x3ccf09e0: lw    t6,20(t1)
  0x3ccf09e4: lw    a3,28(t1)
  0x3ccf09e8: slli  t1,a3,0x0
  0x3ccf09ec: srli  t1,t1,0x1c
  0x3ccf09f0: lui   t0,0x3fe90
  0x3ccf09f4: addi  t0,t0,-1100 # 0x3fe8fbb4
  0x3ccf09f8: slli  t1,t1,0x3
  0x3ccf09fc: add   t0,t0,t1
  0x3ccf0a00: lw    ra,0(t0)
  0x3ccf0a04: mv    t5,sp
  0x3ccf0a08: sw    s4,-8(s0)
  0x3ccf0a0c: lw    t0,52(t6)
  0x3ccf0a10: jr    t0
  0x3ccf0a14: addi  sp,sp,-120
  0x3ccf0a18: sw    ra,0(sp)
  0x3ccf0a1c: sw    gp,4(sp)
  0x3ccf0a20: sw    tp,8(sp)
  0x3ccf0a24: sw    t0,12(sp)
  0x3ccf0a28: sw    t1,16(sp)
  0x3ccf0a2c: sw    t2,20(sp)
  0x3ccf0a30: sw    s0,24(sp)
  0x3ccf0a34: sw    s1,28(sp)
  0x3ccf0a38: sw    a0,32(sp)
  0x3ccf0a3c: sw    a1,36(sp)
  0x3ccf0a40: sw    a2,40(sp)
  0x3ccf0a44: sw    a3,44(sp)
  0x3ccf0a48: sw    a4,48(sp)
  0x3ccf0a4c: sw    a5,52(sp)
  0x3ccf0a50: sw    a6,56(sp)
  0x3ccf0a54: sw    a7,60(sp)
  0x3ccf0a58: sw    s2,64(sp)
  0x3ccf0a5c: sw    s3,68(sp)
  0x3ccf0a60: sw    s4,72(sp)
  0x3ccf0a64: sw    s5,76(sp)
  0x3ccf0a68: sw    s6,80(sp)
  0x3ccf0a6c: sw    s7,84(sp)
  0x3ccf0a70: sw    s8,88(sp)
  0x3ccf0a74: sw    s9,92(sp)
  0x3ccf0a78: sw    s10,96(sp)
  0x3ccf0a7c: sw    s11,100(sp)
  0x3ccf0a80: sw    t3,104(sp)
  0x3ccf0a84: sw    t4,108(sp)
  0x3ccf0a88: sw    t5,112(sp)
  0x3ccf0a8c: sw    t6,116(sp)
  0x3ccf0a90: lui   a0,0x3fbea
  0x3ccf0a94: addi  a0,a0,1556 # 0x3fbea614
  0x3ccf0a98: lui   a1,0x3ccf1
  0x3ccf0a9c: addi  a1,a1,-1516 # 0x3ccf0a14
  0x3ccf0aa0: mv    a2,sp
  0x3ccf0aa4: lui   a3,0x3f763
  0x3ccf0aa8: addi  a3,a3,1486 # 0x3f7635ce = MacroAssembler::debug32(char*, int, int*)
  0x3ccf0aac: jalr  a3
  0x3ccf0ab0: ebreak
  0x3ccf0ab4: sw    a1,28(s1)
  0x3ccf0ab6: sw    a1,28(s1)
  0x3ccf0ab8: sw    a1,28(s1)
  0x3ccf0aba: sw    a1,28(s1)
  0x3ccf0abc: sw    a1,28(s1)
  0x3ccf0abe: sw    a1,28(s1)

----------------------------------------------------------------------
zifeihan commented 3 years ago

I'd like to report an issue. At bytecodeTracer.cpp:111 we can print the bytecode that is currently being executed. The output looks like this:

[31382] static void java.lang.Object.<clinit>()
[31382]        1     0  invokestatic 16 <java/lang/Object.registerNatives()V> 
[31382]        2     1  lconst_0

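The output shows that the invokestatic bytecode was executed, but locating the problem another way suggests that the machine code for invokestatic never actually ran. How can we tell? We can comment out the implementation of the invokestatic function in templateTable_riscv32.cpp. Normally, with that code commented out, the bytecode dispatch cannot happen and the VM fails right after invokestatic executes (verified on bishengJDK). In this project, however, commenting out the code still produces the same crash as before. We therefore suspect that the machine code fragment generated for invokestatic is never reached, and that something goes wrong before the invokestatic implementation logic is called.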

zifeihan commented 3 years ago

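Analysis suggests that the assembly in the zerolocals routine may jump to an incorrect memory location. I wrote a document describing the situation; please help verify it. Document: https://zhuanlan.zhihu.com/p/420382264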

axiangyushanhaijing commented 3 years ago

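Following @zifeihan's idea, I tried modifying some of the left-shift instructions around the return-value handling and call flow of the invoke instructions, including several size-related changes. With these changes, the instructions of the first two methods now execute successfully.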

Patch: https://paste.ubuntu.com/p/SQKX9DRMpK/

rv32 (images: 8d0f452539ebb4b5bb379710d9808b7, 5127d330a0cbfa57b84ce4e617ee1f1)

bishengJDK (image: 9abf15761424d12e93989b8aa3ddcad)

The specific reasons for the changes and the call logic are still being written up.

axiangyushanhaijing commented 3 years ago

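Today we narrowed down the scope of the investigation. When printing the executed bytecodes, we saw that invokestatic was executed, but the instruction may not have actually completed the call (it may have run only partway, without reaching the real invokestatic Method registerNatives:()V). How can we tell? The target is a C++ native method, so if the call reaches it, we can debug that method. I set a breakpoint on bisheng and execution did enter the jni_RegisterNatives method in jni.cpp, but on the 32-bit jdk11u build it never entered this method.

After applying the changes above (https://paste.ubuntu.com/p/SQKX9DRMpK/), breakpoint debugging on jni_RegisterNatives now does enter the method, which shows that the invokestatic instruction really performs the call.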

axiangyushanhaijing commented 3 years ago

Reason for the code change


void InterpreterMacroAssembler::get_cache_and_index_at_bcp(Register cache,
                                                           Register index,
                                                           int bcp_offset,
                                                           size_t index_size) {
  assert_different_registers(cache, index);
  assert_different_registers(cache, xcpool);
  get_cache_index_at_bcp(index, bcp_offset, index_size);
  assert(sizeof(ConstantPoolCacheEntry) == 4 * wordSize, "adjust code below");
  // convert from field index to ConstantPoolCacheEntry
  // riscv64 already has the cache in xcpool so there is no need to
  // install it in cache. instead we pre-add the indexed offset to
  // xcpool and return it in cache. All clients of this method need to
  // be modified accordingly.
  slli(cache, index, 5);
  add(cache, xcpool, cache);

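On riscv64, we pre-add the indexed offset to xcpool and return it in cache. A ConstantPoolCacheEntry on riscv64 is 4 * wordSize, i.e. 32 bytes, so index is shifted left by 5. On riscv32, wordSize is 4, so a ConstantPoolCacheEntry is 16 bytes, and the shift applied to index must be changed to 4 accordingly.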
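The arithmetic behind the shift amount can be checked with a small standalone sketch. It assumes only what the assert in the excerpt states, namely that sizeof(ConstantPoolCacheEntry) is 4 * wordSize. The helper names `log2_exact`, `entry_size`, and `entry_shift` are hypothetical and are not part of HotSpot.

```cpp
#include <cstddef>

// Hypothetical helper (not HotSpot code): log2 of n, exact only when n is a
// power of two. Written as a single return so it is valid C++11 constexpr.
constexpr unsigned log2_exact(std::size_t n) {
  return n <= 1 ? 0u : 1u + log2_exact(n / 2);
}

// Assumption taken from the assert in get_cache_and_index_at_bcp:
// sizeof(ConstantPoolCacheEntry) == 4 * wordSize.
constexpr std::size_t entry_size(std::size_t word_size) {
  return 4 * word_size;
}

// Shift that turns an entry index into a byte offset from the cache base.
constexpr unsigned entry_shift(std::size_t word_size) {
  return log2_exact(entry_size(word_size));
}

// riscv64: wordSize 8, 32-byte entries, so the index is shifted left by 5.
static_assert(entry_shift(8) == 5, "riscv64 shift");
// riscv32: wordSize 4, 16-byte entries, so the index is shifted left by 4.
static_assert(entry_shift(4) == 4, "riscv32 shift");
```

Deriving the shift from the entry size, rather than hard-coding 5, is one way to keep the scaling consistent with the size assert when the same code is ported between word sizes. Whether the port should do this or simply use a literal 4 is a design choice left to the patch.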