ptitSeb / box64

Box64 - Linux Userspace x86_64 Emulator with a twist, targeted at ARM64 Linux devices
https://box86.org
MIT License
3.89k stars 286 forks source link

RISC-V Vector Extension Seems Not Working Well #1818

Closed Coekjan closed 2 months ago

Coekjan commented 2 months ago

Description

I encountered a weird bug when running lstopo (https://archlinux.org/packages/extra/x86_64/hwloc/) with box64 for testing. It raised SIGILL like this:

86920|SIGILL @0x7040b8b85618 (???(0x7040b8b85618)) (x64pc=0x3f020a7073/"/lib/libcairo.so.2 + 0xa7073", rsp=0x7040bb5fa390, stack=0x7040bae00000:0x7040bb600000 own=(nil) fp=0x7040bb5fae30), for accessing 0x7040b8b85618 (code=1/prot=7), db=0x7040b9dfef40(0x7040b8b84f80:0x7040b8b85b48/0x3f020a6f06:0x3f020a7103//lib/libcairo.so.2 + 0xa6f06:clean, hash:89047ac9/89047ac9) handler=(nil)
RAX:0x000000003744f690 RCX:0x0000000000000000 RDX:0x000000003732fa20 RBX:0x0000000037315650 
RSP:0x00007040bb5fa390 RBP:0x00007040bb5fae48 RSI:0x0000000000000042 RDI:0x00007040bb5fa340 
 R8:0x00007040bb5fa440  R9:0x000000003732f830 R10:0x000000003744f6d0 R11:0x0000000000000000 
R12:0x0000000000000001 R13:0x00000000374498d0 R14:0x000000003732fa20 R15:0x00007040bb5fa600 
ES:0x002b CS:0x0033 SS:0x002b DS:0x002b FS:0x0043 GS:0x0053 
RSP-0x20:0x00000000374497e0 RSP-0x18:0x000000003744ad90 RSP-0x10:0x00007040bb5fa600 RSP-0x08:0x0000003f020a6f06
RSP+0x00:0x0000000100000f00 RSP+0x08:0x000000003744f6d0 RSP+0x10:0x000000000000000d RSP+0x18:0x00007040bb5fae30 opcode=27 F5 05 02 27 38 05 0F (0F 84 B7 07 00)
[1]    86920 illegal hardware instruction (core dumped)  ./box64 lstopo

I found that, if I turned off rv64_vector manually, this SIGILL would not occur. So it must be something wrong in box64 with rv64_vector enabled.

Investigation

I did some investigation about this issue, and found that, if I disabled this translation via inserting return 0 after case 0x6C, this bug would not occur.

https://github.com/ptitSeb/box64/blob/fc7f83c60b82e77712165dd600501af2c59b8d3b/src/dynarec/rv64/dynarec_rv64_660f_vector.c#L180-L203

Therefore, I think this instruction (PUNPCKLQDQ) is not processed correctly. However, GDB told me that illegal instruction did not come from this process, but from

https://github.com/ptitSeb/box64/blob/fc7f83c60b82e77712165dd600501af2c59b8d3b/src/dynarec/rv64/dynarec_rv64_helper.c#L1837

Weird...

Want Help

I am not familiar with both x64 mmx and riscv vector extension, and failed to find out where the bug is in the past 2 days, so I am sending this issue and waiting for help/fix.

I checked the history of these code lines, and found PR #1755, #1811 & #1816 are related.

ptitSeb commented 2 months ago

@ksco I assume this is issue is because some Vector state is not globaly handled correcly? leaking SEW state maybe?

ksco commented 2 months ago

Are you testing the latest box64? I'm on x86_64 ArchLinux, it works for me, logs:

[ksco@dwarf build]$ QEMU_LD_PREFIX=/usr/riscv64-linux-gnu/ QEMU_CPU=rv64,v=true,vlen=128,vext_spec=v1.0 INTERPRETER=qemu-riscv64-static BOX64_LD_LIBRARY_PATH=../x64lib/:/usr/lib ./box64 /usr/bin/lstopo
Dynarec for RISC-V With extension: I M A F D C Zba Zbb Zbc Zbs Vector (vlen: 128) PageSize:4096 Running on Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz with 16 Cores
Will use Hardware counter measured at 2.3 GHz
Params database has 14 entries
Box64 with Dynarec v0.3.1 fc7f83c6 built on Sep 12 2024 13:54:05
BOX64: Didn't detect 48bits of address space, considering it's 39bits
Counted 32 Env var
BOX64 LIB PATH: BOX64 BIN PATH: ./:bin/:/home/ksco/.vscode-server/bin/4849ca9bdf9666755eb463db297b69e5385090e3/bin/remote-cli/:/usr/local/sbin/:/usr/local/bin/:/usr/bin/:/usr/bin/site_perl/:/usr/bin/vendor_perl/:/usr/bin/core_perl/
Looking for /usr/bin/lstopo
Rename process to "lstopo"
Using emulated /usr/lib/libhwloc.so.15
Using native(wrapped) libm.so.6
Error initializing native libncursesw.so.6 (last dlerror is libncursesw.so.6: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libncursesw.so.6
Error initializing native libcairo.so.2 (last dlerror is libcairo.so.2: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libcairo.so.2
Error initializing native libX11.so.6 (last dlerror is libX11.so.6: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libX11.so.6
Using native(wrapped) libc.so.6
Using native(wrapped) ld-linux-x86-64.so.2
Using native(wrapped) libpthread.so.0
Using native(wrapped) libdl.so.2
Using native(wrapped) libutil.so.1
Using native(wrapped) libresolv.so.2
Using native(wrapped) librt.so.1
Using native(wrapped) libbsd.so.0
Error initializing native libxcb.so.1 (last dlerror is libxcb.so.1: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libxcb.so.1
Error initializing native libXau.so.6 (last dlerror is libXau.so.6: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libXau.so.6
Error initializing native libXdmcp.so.6 (last dlerror is libXdmcp.so.6: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libXdmcp.so.6
Error initializing native libz.so.1 (last dlerror is libz.so.1: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libz.so.1
Error initializing native libpng16.so.16 (last dlerror is libpng16.so.16: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libpng16.so.16
Error initializing native libfontconfig.so.1 (last dlerror is libfontconfig.so.1: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libfontconfig.so.1
Error initializing native libfreetype.so.6 (last dlerror is libfreetype.so.6: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libfreetype.so.6
Error initializing native libXext.so.6 (last dlerror is libXext.so.6: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libXext.so.6
Error initializing native libXrender.so.1 (last dlerror is libXrender.so.1: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libXrender.so.1
Error initializing native libxcb-render.so.0 (last dlerror is libxcb-render.so.0: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libxcb-render.so.0
Error initializing native libxcb-shm.so.0 (last dlerror is libxcb-shm.so.0: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libxcb-shm.so.0
Using emulated /usr/lib/libpixman-1.so.0
Error initializing native libbz2.so.1 (last dlerror is libbz2.so.1: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libbz2.so.1.0
Error initializing native libharfbuzz.so.0 (last dlerror is libharfbuzz.so.0: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libharfbuzz.so.0
Using emulated /usr/lib/libbrotlidec.so.1
Using emulated /usr/lib/libbrotlicommon.so.1
Error initializing native libglib-2.0.so.0 (last dlerror is libglib-2.0.so.0: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libglib-2.0.so.0
Using emulated /usr/lib/libgraphite2.so.3
Using emulated /usr/lib/libpcre2-8.so.0
Error: Global Symbol __isoc23_strtoll_l not found, cannot apply R_X86_64_GLOB_DAT @0x3f1414c8f8 ((nil)) in /usr/lib/libglib-2.0.so.0
Error: Global Symbol __isoc23_strtoull_l not found, cannot apply R_X86_64_GLOB_DAT @0x3f1414cdc0 ((nil)) in /usr/lib/libglib-2.0.so.0
Error initializing native libexpat.so.1 (last dlerror is libexpat.so.1: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libexpat.so.1
Error initializing native libudev.so.1 (last dlerror is libudev.so.0: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libudev.so.1
Error initializing native libcap.so.2 (last dlerror is libcap.so.2: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libcap.so.2
Using emulated ../x64lib/libgcc_s.so.1
Error: Global Symbol __readlinkat_chk not found, cannot apply R_X86_64_GLOB_DAT @0x3f18044c10 ((nil)) in /usr/lib/libudev.so.1
Error: Global Symbol pidfd_send_signal not found, cannot apply R_X86_64_GLOB_DAT @0x3f18044ea8 ((nil)) in /usr/lib/libudev.so.1
Using emulated /usr/lib/hwloc/hwloc_xml_libxml.so
Error initializing native libxml2.so.2 (last dlerror is libxml2.so.2: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/libxml2.so.2
Error initializing native liblzma.so.5 (last dlerror is liblzma.so.5: cannot open shared object file: No such file or directory)
Using emulated /usr/lib/liblzma.so.5
Using emulated /usr/lib/libicuuc.so.75
Using emulated /usr/lib/libicudata.so.75
Using emulated ../x64lib/libstdc++.so.6
Warning: Weak Symbol _ITM_memcpyRtWn not found, cannot apply R_X86_64_JUMP_SLOT @0x3f2124f060 (0x9c0f6)
Warning: Weak Symbol _ITM_RU1 not found, cannot apply R_X86_64_JUMP_SLOT @0x3f2124f6a0 (0x9cd76)
Warning: Weak Symbol _ZGTtdlPv not found, cannot apply R_X86_64_JUMP_SLOT @0x3f2124fb00 (0x9d636)
Warning: Weak Symbol _ITM_RU8 not found, cannot apply R_X86_64_JUMP_SLOT @0x3f2124fff8 (0x9e026)
Warning: Weak Symbol _ITM_memcpyRnWt not found, cannot apply R_X86_64_JUMP_SLOT @0x3f212504a8 (0x9e986)
Warning: Weak Symbol _ZGTtnam not found, cannot apply R_X86_64_JUMP_SLOT @0x3f21250c88 (0x9f946)
Using emulated /usr/lib/hwloc/hwloc_pci.so
Using emulated /usr/lib/libpciaccess.so.0
Machine (31GB total)
  Package L#0
    NUMANode L#0 (P#0 31GB)
    L3 L#0 (16MB)
      L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
        PU L#0 (P#0)
        PU L#1 (P#8)
      L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
        PU L#2 (P#1)
        PU L#3 (P#9)
      L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
        PU L#4 (P#2)
        PU L#5 (P#10)
      L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
        PU L#6 (P#3)
        PU L#7 (P#11)
      L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4
        PU L#8 (P#4)
        PU L#9 (P#12)
      L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5
        PU L#10 (P#5)
        PU L#11 (P#13)
      L2 L#6 (256KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6
        PU L#12 (P#6)
        PU L#13 (P#14)
      L2 L#7 (256KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7
        PU L#14 (P#7)
        PU L#15 (P#15)
  HostBridge
    PCIBridge
      PCI 01:00.0 (NVMExp)
        Block(Disk) "nvme0n1"
    PCI 00:02.0 (VGA)
    PCI 00:17.0 (SATA)
    PCIBridge
      PCI 02:00.0 (Ethernet)
        Net "enp2s0"
    PCIBridge
      PCI 03:00.0 (Network)
[ksco@dwarf build]$ 
Coekjan commented 2 months ago

Are you testing the latest box64? I'm on x86_64 ArchLinux, it works for me, logs:

I am testing on a near-latest version (6b6d20732e6e93066095261f5c3c7a2146668c99), and in GUI (KDE Plasma Wayland/Xwayland). Your post seems indicating that you are in CLI environment (so it does not open a graphic window to show the hardware resources). In addition, to run lstopo, I need to wrap some more functions. You can see them here: https://github.com/Coekjan/box64/commit/4140dd73e7626a71ce2295348ec74d3db5e61b6f

ksco commented 2 months ago

Ah okay, my ArchLinux is indeed headless.

ksco commented 2 months ago
image

So I set up X11 forward to test again, and it's still working (without any wrapping).

ksco commented 2 months ago

Are you testing this on real hardware? Do I need to? I have a SpacemiT K1 dev board.

Coekjan commented 2 months ago

Are you testing this on real hardware? Do I need to? I have a SpacemiT K1 dev board.

No, I am just testing this on my x64 laptop (with qemu-user)... And at the moment, I have no idea why you can run lstopo without any extra wrapping and why SIGILL does not occur.

Coekjan commented 2 months ago

Would its .so dependencies be different?

$ ldd $(which lstopo)
        linux-vdso.so.1 (0x00007cb3b0bea000)
        libhwloc.so.15 => /usr/lib/libhwloc.so.15 (0x00007cb3b0b18000)
        libm.so.6 => /usr/lib/libm.so.6 (0x00007cb3b0a29000)
        libncursesw.so.6 => /usr/lib/libncursesw.so.6 (0x00007cb3b09ba000)
        libcairo.so.2 => /usr/lib/libcairo.so.2 (0x00007cb3b0887000)
        libX11.so.6 => /usr/lib/libX11.so.6 (0x00007cb3b0746000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007cb3b0555000)
        libudev.so.1 => /usr/lib/libudev.so.1 (0x00007cb3b050d000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007cb3b0bec000)
        libz.so.1 => /usr/lib/libz.so.1 (0x00007cb3b04f4000)
        libpng16.so.16 => /usr/lib/libpng16.so.16 (0x00007cb3b04ba000)
        libfontconfig.so.1 => /usr/lib/libfontconfig.so.1 (0x00007cb3b046a000)
        libfreetype.so.6 => /usr/lib/libfreetype.so.6 (0x00007cb3b03a0000)
        libXext.so.6 => /usr/lib/libXext.so.6 (0x00007cb3b038b000)
        libXrender.so.1 => /usr/lib/libXrender.so.1 (0x00007cb3b037d000)
        libxcb.so.1 => /usr/lib/libxcb.so.1 (0x00007cb3b0352000)
        libxcb-render.so.0 => /usr/lib/libxcb-render.so.0 (0x00007cb3b0343000)
        libxcb-shm.so.0 => /usr/lib/libxcb-shm.so.0 (0x00007cb3b033e000)
        libpixman-1.so.0 => /usr/lib/libpixman-1.so.0 (0x00007cb3b0294000)
        libcap.so.2 => /usr/lib/libcap.so.2 (0x00007cb3b0286000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007cb3b0258000)
        libexpat.so.1 => /usr/lib/libexpat.so.1 (0x00007cb3b022f000)
        libbz2.so.1.0 => /usr/lib/libbz2.so.1.0 (0x00007cb3b021c000)
        libharfbuzz.so.0 => /usr/lib/libharfbuzz.so.0 (0x00007cb3b0102000)
        libbrotlidec.so.1 => /usr/lib/libbrotlidec.so.1 (0x00007cb3b00f1000)
        libXau.so.6 => /usr/lib/libXau.so.6 (0x00007cb3b00ec000)
        libXdmcp.so.6 => /usr/lib/libXdmcp.so.6 (0x00007cb3b00e4000)
        libglib-2.0.so.0 => /usr/lib/libglib-2.0.so.0 (0x00007cb3aff96000)
        libgraphite2.so.3 => /usr/lib/libgraphite2.so.3 (0x00007cb3aff74000)
        libbrotlicommon.so.1 => /usr/lib/libbrotlicommon.so.1 (0x00007cb3aff4f000)
        libpcre2-8.so.0 => /usr/lib/libpcre2-8.so.0 (0x00007cb3afeb0000)
ksco commented 2 months ago

No, it's the same. Anyway, can you run the following command and give me the dump.txt?

QEMU_LD_PREFIX=/usr/riscv64-linux-gnu/ QEMU_CPU=rv64,v=true,vlen=256,vext_spec=v1.0 INTERPRETER=qemu-riscv64-static BOX64_TRACE_FILE=./dump.txt BOX64_DYNAREC_DUMP=2 BOX64_LD_LIBRARY_PATH=../x64lib/:/usr/lib ./box64 /usr/bin/lstopo
Coekjan commented 2 months ago

No, it's the same. Anyway, can you run the following command and give me the dump.txt?

dump.tar.gz

ksco commented 2 months ago

Can you try again with this patch?

diff --git a/src/dynarec/rv64/dynarec_rv64_pass0.h b/src/dynarec/rv64/dynarec_rv64_pass0.h
index 782dae0b..33cd2f38 100644
--- a/src/dynarec/rv64/dynarec_rv64_pass0.h
+++ b/src/dynarec/rv64/dynarec_rv64_pass0.h
@@ -30,7 +30,7 @@
         dyn->e.olds[i].v = 0;                                                      \
     dyn->insts[ninst].f_entry = dyn->f;                                            \
     if (reset_n != -1)                                                             \
-        dyn->vector_sew = ninst ? dyn->insts[ninst - 1].vector_sew : VECTOR_SEWNA; \
+        dyn->vector_sew = (ninst && dyn->insts[ninst].pred_sz == 1) ? dyn->insts[ninst - 1].vector_sew : VECTOR_SEWNA; \
     if (ninst)                                                                     \
         dyn->insts[ninst - 1].x64.size = dyn->insts[ninst].x64.addr - dyn->insts[ninst - 1].x64.addr;
Coekjan commented 2 months ago

Can you try again with this patch?

diff --git a/src/dynarec/rv64/dynarec_rv64_pass0.h b/src/dynarec/rv64/dynarec_rv64_pass0.h
index 782dae0b..33cd2f38 100644
--- a/src/dynarec/rv64/dynarec_rv64_pass0.h
+++ b/src/dynarec/rv64/dynarec_rv64_pass0.h
@@ -30,7 +30,7 @@
         dyn->e.olds[i].v = 0;                                                      \
     dyn->insts[ninst].f_entry = dyn->f;                                            \
     if (reset_n != -1)                                                             \
-        dyn->vector_sew = ninst ? dyn->insts[ninst - 1].vector_sew : VECTOR_SEWNA; \
+        dyn->vector_sew = (ninst && dyn->insts[ninst].pred_sz == 1) ? dyn->insts[ninst - 1].vector_sew : VECTOR_SEWNA; \
     if (ninst)                                                                     \
         dyn->insts[ninst - 1].x64.size = dyn->insts[ninst].x64.addr - dyn->insts[ninst - 1].x64.addr;

SIGILL still occured with this patch, dump.tar.gz

222482|SIGILL @0x7330ebf4c048 (???(0x7330ebf4c048)) (x64pc=0x3f020a7073/"/lib/libcairo.so.2 + 0xa7073", rsp=0x7330faffa320, stack=0x7330fa800000:0x7330fb000000 own=(nil) fp=0x7330faffadc0), for accessing 0x7330ebf4c048 (code=1/prot=7), db=0x7330f974f6c0(0x7330ebf4b9b0:0x7330ebf4c578/0x3f020a6f06:0x3f020a7103//lib/libcairo.so.2 + 0xa6f06:clean, hash:89047ac9/89047ac9) handler=(nil)
RAX:0x0000000037396a60 RCX:0x0000000000000000 RDX:0x0000000037339280 RBX:0x000000003727ed90 
RSP:0x00007330faffa320 RBP:0x00007330faffadd8 RSI:0x0000000000000042 RDI:0x00007330faffa2d0 
 R8:0x00007330faffa3d0  R9:0x0000000037339090 R10:0x0000000037396aa0 R11:0x0000000000000000 
R12:0x0000000000000001 R13:0x0000000037395330 R14:0x0000000037339280 R15:0x00007330faffa590 
ES:0x002b CS:0x0033 SS:0x002b DS:0x002b FS:0x0043 GS:0x0053 
RSP-0x20:0x0000000037395240 RSP-0x18:0x0000000037396510 RSP-0x10:0x00007330faffa590 RSP-0x08:0x0000003f020a6f06
RSP+0x00:0x0000000100000f00 RSP+0x08:0x0000000037396aa0 RSP+0x10:0x000000000000000d RSP+0x18:0x00007330faffadc0 opcode=27 F5 05 02 27 38 05 0F (0F 84 B7 07 00)
ksco commented 2 months ago

Discard that, and try with this patch.. (sorry, it's so annoying I cannot reproduce this on my side.)

diff --git a/src/dynarec/dynarec_native_pass.c b/src/dynarec/dynarec_native_pass.c
index 24528fa1..6c4ba1f5 100644
--- a/src/dynarec/dynarec_native_pass.c
+++ b/src/dynarec/dynarec_native_pass.c
@@ -55,7 +55,7 @@ uintptr_t native_pass(dynarec_native_t* dyn, uintptr_t addr, int alternate, int
     #endif
     fpu_reset(dyn);
     ARCH_INIT();
-    int reset_n = -1;
+    int reset_n = -1; // -1 no reset; -2 reset to 0; else reset to the state of reset_n
     dyn->last_ip = (alternate || (dyn->insts && dyn->insts[0].pred_sz))?0:ip;  // RIP is always set at start of block unless there is a predecessor!
     int stopblock = 2+(FindElfAddress(my_context, addr)?0:1); // if block is in elf_memory, it can be extended with box64_dynarec_bigblock==2, else it needs 3
     // ok, go now
diff --git a/src/dynarec/rv64/dynarec_rv64_pass0.h b/src/dynarec/rv64/dynarec_rv64_pass0.h
index 782dae0b..9a84ab5b 100644
--- a/src/dynarec/rv64/dynarec_rv64_pass0.h
+++ b/src/dynarec/rv64/dynarec_rv64_pass0.h
@@ -29,7 +29,7 @@
     for (int i = 0; i < 16; ++i)                                                   \
         dyn->e.olds[i].v = 0;                                                      \
     dyn->insts[ninst].f_entry = dyn->f;                                            \
-    if (reset_n != -1)                                                             \
+    if (reset_n == -1)                                                             \
         dyn->vector_sew = ninst ? dyn->insts[ninst - 1].vector_sew : VECTOR_SEWNA; \
     if (ninst)                                                                     \
         dyn->insts[ninst - 1].x64.size = dyn->insts[ninst].x64.addr - dyn->insts[ninst - 1].x64.addr;
Coekjan commented 2 months ago

Discard that, and try with this patch.. (sorry, it's so annoying I cannot reproduce this on my side.)

diff --git a/src/dynarec/dynarec_native_pass.c b/src/dynarec/dynarec_native_pass.c
index 24528fa1..6c4ba1f5 100644
--- a/src/dynarec/dynarec_native_pass.c
+++ b/src/dynarec/dynarec_native_pass.c
@@ -55,7 +55,7 @@ uintptr_t native_pass(dynarec_native_t* dyn, uintptr_t addr, int alternate, int
     #endif
     fpu_reset(dyn);
     ARCH_INIT();
-    int reset_n = -1;
+    int reset_n = -1; // -1 no reset; -2 reset to 0; else reset to the state of reset_n
     dyn->last_ip = (alternate || (dyn->insts && dyn->insts[0].pred_sz))?0:ip;  // RIP is always set at start of block unless there is a predecessor!
     int stopblock = 2+(FindElfAddress(my_context, addr)?0:1); // if block is in elf_memory, it can be extended with box64_dynarec_bigblock==2, else it needs 3
     // ok, go now
diff --git a/src/dynarec/rv64/dynarec_rv64_pass0.h b/src/dynarec/rv64/dynarec_rv64_pass0.h
index 782dae0b..9a84ab5b 100644
--- a/src/dynarec/rv64/dynarec_rv64_pass0.h
+++ b/src/dynarec/rv64/dynarec_rv64_pass0.h
@@ -29,7 +29,7 @@
     for (int i = 0; i < 16; ++i)                                                   \
         dyn->e.olds[i].v = 0;                                                      \
     dyn->insts[ninst].f_entry = dyn->f;                                            \
-    if (reset_n != -1)                                                             \
+    if (reset_n == -1)                                                             \
         dyn->vector_sew = ninst ? dyn->insts[ninst - 1].vector_sew : VECTOR_SEWNA; \
     if (ninst)                                                                     \
         dyn->insts[ninst - 1].x64.size = dyn->insts[ninst].x64.addr - dyn->insts[ninst - 1].x64.addr;

This patch fixes the bug. No SIGILL now on my side! Thanks for your kind replies!!

ksco commented 2 months ago

great! can you upload the dump.txt again? I just want to make sure a few other things works as expected on the SEW transformation, thanks.

Coekjan commented 2 months ago

can you upload the dump.txt again? I just want to make sure a few other things works as expected on the SEW transformation, thanks.

dump.tar.gz

Coekjan commented 2 months ago

I think we can be glad to close this via #1819 . Thanks again to you!

ksco commented 2 months ago

@Coekjan Hi can you test if lstopo still works for you after applying the above PR?

Coekjan commented 2 months ago

@Coekjan Hi can you test if lstopo still works for you after applying the above PR?

:confused: SIGILL came back for me...

I tested with b504d07afe4f7d06b50e8347074be9a1d0451a9d (the latest commit) just now. It happened at another place:

161429|SIGILL @0x7d958c3b4910 (???(0x7d958c3b4910)) (x64pc=0x3f1700e299/"/usr/lib/libexpat.so.1/XML_SetUserData + 0x29", rsp=0x7d957fff6798, stack=0x7d957f800000:0x7d9580000000 own=(nil) fp=0x37285060), for accessing 0x7d958c3b4910 (code=1/prot=7), db=0x7d957f61d240(0x7d958c3b4840:0x7d958c3b49b0/0x3f1700e270:0x3f1700e29d//usr/lib/libexpat.so.1/XML_SetUserData:clean, hash:d12fb9d1/d12fb9d1) handler=(nil)
RAX:0x0000000000000000 RCX:0x0000000000000000 RDX:0x0000000000000000 RBX:0x0000000037287fb0 
RSP:0x00007d957fff6798 RBP:0x0000000037287f30 RSI:0x00007d957fff67e0 RDI:0x0000000037287fb0 
 R8:0x0000000000000001  R9:0x0000000000000001 R10:0x0000000000000000 R11:0x00007d957fff7bb8 
R12:0x0000000000000001 R13:0x000000003728a160 R14:0x00007d957fff67e0 R15:0x0000000000000a8c 
ES:0x002b CS:0x0033 SS:0x002b DS:0x002b FS:0x0043 GS:0x0053 
RSP-0x20:0x0000000037287fb0 RSP-0x18:0x0000000037287f30 RSP-0x10:0x0000000000000001 RSP-0x08:0x000000003728a160
RSP+0x00:0x0000003f0902d90a RSP+0x08:0x0000000000000001 RSP+0x10:0x0000000000000000 RSP+0x18:0x0000000000000000 opcode=27 F5 0B 02 93 05 05 09 (0F 11 07 C3 0F)
[1]    161429 illegal hardware instruction (core dumped)  ./box64 lstopo

Dump file here: dump.tar.gz

ksco commented 2 months ago

Okay, seems there is one thing I didn't consider when refactoring, can you try this?

diff --git a/src/dynarec/rv64/dynarec_rv64_helper.c b/src/dynarec/rv64/dynarec_rv64_helper.c
index fb1ad81d..4be3f8ae 100644
--- a/src/dynarec/rv64/dynarec_rv64_helper.c
+++ b/src/dynarec/rv64/dynarec_rv64_helper.c
@@ -2528,7 +2528,7 @@ void fpu_reset_cache(dynarec_rv64_t* dyn, int ninst, int reset_n)
     #if STEP > 1
     // for STEP 2 & 3, just need to refresh with current, and undo the changes (push & swap)
     dyn->e = dyn->insts[ninst].e;
-    dyn->vector_sew = dyn->insts[ninst].vector_sew_exit;
+    dyn->vector_sew = dyn->insts[ninst].vector_sew_entry;
     #else
     dyn->e = dyn->insts[reset_n].e;
     dyn->vector_sew = dyn->insts[reset_n].vector_sew_exit;
diff --git a/src/dynarec/rv64/dynarec_rv64_pass2.h b/src/dynarec/rv64/dynarec_rv64_pass2.h
index 909522e3..4f41f628 100644
--- a/src/dynarec/rv64/dynarec_rv64_pass2.h
+++ b/src/dynarec/rv64/dynarec_rv64_pass2.h
@@ -8,6 +8,7 @@
 #define MESSAGE(A, ...) do {} while (0)
 #define EMIT(A)     do {dyn->insts[ninst].size+=4; dyn->native_size+=4;}while(0)
 #define NEW_INST                                                                                                                                                               \
+    dyn->vector_sew = dyn->insts[ninst].vector_sew_entry;                                                                                                                      \
     if (ninst) {                                                                                                                                                               \
         dyn->insts[ninst].address = (dyn->insts[ninst - 1].address + dyn->insts[ninst - 1].size);                                                                              \
         dyn->insts_size += 1 + ((dyn->insts[ninst - 1].x64.size > (dyn->insts[ninst - 1].size / 4)) ? dyn->insts[ninst - 1].x64.size : (dyn->insts[ninst - 1].size / 4)) / 15; \
diff --git a/src/dynarec/rv64/dynarec_rv64_pass3.h b/src/dynarec/rv64/dynarec_rv64_pass3.h
index 1254dc4a..556586f2 100644
--- a/src/dynarec/rv64/dynarec_rv64_pass3.h
+++ b/src/dynarec/rv64/dynarec_rv64_pass3.h
@@ -13,6 +13,7 @@

 #define MESSAGE(A, ...)  if(box64_dynarec_dump) dynarec_log(LOG_NONE, __VA_ARGS__)
 #define NEW_INST                                                                                                  \
+    dyn->vector_sew = dyn->insts[ninst].vector_sew_entry;                                                         \
     if (box64_dynarec_dump) print_newinst(dyn, ninst);                                                            \
     if (ninst) {                                                                                                  \
         addInst(dyn->instsize, &dyn->insts_size, dyn->insts[ninst - 1].x64.size, dyn->insts[ninst - 1].size / 4); \
Coekjan commented 2 months ago

Okay, seems there is one thing I didn't consider when refactoring, can you try this?

diff --git a/src/dynarec/rv64/dynarec_rv64_helper.c b/src/dynarec/rv64/dynarec_rv64_helper.c
index fb1ad81d..4be3f8ae 100644
--- a/src/dynarec/rv64/dynarec_rv64_helper.c
+++ b/src/dynarec/rv64/dynarec_rv64_helper.c
@@ -2528,7 +2528,7 @@ void fpu_reset_cache(dynarec_rv64_t* dyn, int ninst, int reset_n)
     #if STEP > 1
     // for STEP 2 & 3, just need to refresh with current, and undo the changes (push & swap)
     dyn->e = dyn->insts[ninst].e;
-    dyn->vector_sew = dyn->insts[ninst].vector_sew_exit;
+    dyn->vector_sew = dyn->insts[ninst].vector_sew_entry;
     #else
     dyn->e = dyn->insts[reset_n].e;
     dyn->vector_sew = dyn->insts[reset_n].vector_sew_exit;
diff --git a/src/dynarec/rv64/dynarec_rv64_pass2.h b/src/dynarec/rv64/dynarec_rv64_pass2.h
index 909522e3..4f41f628 100644
--- a/src/dynarec/rv64/dynarec_rv64_pass2.h
+++ b/src/dynarec/rv64/dynarec_rv64_pass2.h
@@ -8,6 +8,7 @@
 #define MESSAGE(A, ...) do {} while (0)
 #define EMIT(A)     do {dyn->insts[ninst].size+=4; dyn->native_size+=4;}while(0)
 #define NEW_INST                                                                                                                                                               \
+    dyn->vector_sew = dyn->insts[ninst].vector_sew_entry;                                                                                                                      \
     if (ninst) {                                                                                                                                                               \
         dyn->insts[ninst].address = (dyn->insts[ninst - 1].address + dyn->insts[ninst - 1].size);                                                                              \
         dyn->insts_size += 1 + ((dyn->insts[ninst - 1].x64.size > (dyn->insts[ninst - 1].size / 4)) ? dyn->insts[ninst - 1].x64.size : (dyn->insts[ninst - 1].size / 4)) / 15; \
diff --git a/src/dynarec/rv64/dynarec_rv64_pass3.h b/src/dynarec/rv64/dynarec_rv64_pass3.h
index 1254dc4a..556586f2 100644
--- a/src/dynarec/rv64/dynarec_rv64_pass3.h
+++ b/src/dynarec/rv64/dynarec_rv64_pass3.h
@@ -13,6 +13,7 @@

 #define MESSAGE(A, ...)  if(box64_dynarec_dump) dynarec_log(LOG_NONE, __VA_ARGS__)
 #define NEW_INST                                                                                                  \
+    dyn->vector_sew = dyn->insts[ninst].vector_sew_entry;                                                         \
     if (box64_dynarec_dump) print_newinst(dyn, ninst);                                                            \
     if (ninst) {                                                                                                  \
         addInst(dyn->instsize, &dyn->insts_size, dyn->insts[ninst - 1].x64.size, dyn->insts[ninst - 1].size / 4); \

It works well now with your patch again... (This looks like magic to me :rofl: )

ksco commented 2 months ago

LOL sorry for the endless testing... it really upset me the fact that I cannot reproduce this on my side.. maybe it's something related to Wayland?

Coekjan commented 2 months ago

LOL sorry for the endless testing...

I hope it ends now :laughing: . BTW, I would suggest that when fixing something, related tests should be added (to avoid regression), though I don't know whether it is hard to do so (i.e. construct a MRE and test it in CI).

it really upset me the fact that I cannot reproduce this on my side.. maybe it's something related to Wayland?

Ah maybe... The program might walk different code-paths in wayland.

ksco commented 2 months ago

BTW, I would suggest that when fixing something, related tests should be added (to avoid regression), though I don't know whether it is hard to do so (i.e. construct a MRE and test it in CI).

Indeed, but the issues are from vector infrastructure, and they are not the same before and after the refactor, so it's not an easy task to come up with such tests... (btw this is just me making excuses for not writing tests).

Coekjan commented 2 months ago

Well, let's close this issue with #1824 and hope that we won't reopen it again 🙏