No, the +0x… is an offset within the loader library, not part of the file name.
This is a T-HEAD C906 / Allwinner D1 based board, right? I believe Debian’s LLVM is itself built with LLVM (Clang), so my guess is that it ends up using FENCE.TSO, which that core fails to implement, in violation of the spec. GCC happens not to use that instruction for its atomics, but LLVM does for some. If that’s the case, the fix is to have firmware emulate the instructions the CPU forgot to implement.
If you run Clang under GDB you should be able to see what it’s trying to execute that fails.
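For anyone who prefers not to drive GDB interactively, the same check can be scripted. The following is only a sketch: the helper names `build_gdb_command` and `show_faulting_instruction` are made up for illustration, and it assumes `gdb` is on the `PATH`. It relies on GDB's standard `-batch`, `-ex` and `--args` options and on the `x/i $pc` command, which disassembles the instruction at the current program counter.

```python
import subprocess


def build_gdb_command(program_args):
    """Build a non-interactive gdb command line (hypothetical helper).

    gdb runs the program, stops when a signal such as SIGILL arrives,
    then prints the instruction at the program counter and a backtrace.
    """
    return [
        "gdb", "-batch",
        "-ex", "run",       # start the program under gdb
        "-ex", "x/i $pc",   # disassemble the instruction at the current PC
        "-ex", "bt",        # backtrace for context
        "--args", *program_args,
    ]


def show_faulting_instruction(program_args):
    """Run gdb in batch mode and return everything it printed."""
    result = subprocess.run(
        build_gdb_command(program_args),
        capture_output=True,
        text=True,
        check=False,  # gdb's exit status is not interesting here
    )
    return result.stdout + result.stderr
```

Calling `show_faulting_instruction(["clang-14", "--version"])` on the affected board should then include the disassembled instruction at the point where the signal was raised.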
Thanks for your reply!
> This is a T-HEAD C906 / Allwinner D1 based board, right?
Right! The Sipeed Lichee RV Dock, with the Allwinner D1.
> If you run Clang under GDB you should be able to see what it’s trying to execute that fails.
I'm a gdb noob too, so ... is this how I should do it:
$ gdb -ex=r --args clang-14 --version
GNU gdb (Debian 10.1-2) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "riscv64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from clang-14...
(No debugging symbols found in clang-14)
Starting program: /usr/bin/clang-14 --version
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/riscv64-linux-gnu/libthread_db.so.1".
Program received signal SIGILL, Illegal instruction.
0x0000003ff09eaf16 in ?? () from /usr/lib/riscv64-linux-gnu/libLLVM-14.so.1
(gdb) bt
#0 0x0000003ff09eaf16 in ?? () from /usr/lib/riscv64-linux-gnu/libLLVM-14.so.1
(gdb)
Then, based on this Google hit: https://stackoverflow.com/questions/1902901/show-current-assembly-instruction-in-gdb
(gdb) layout asm
which gives
(gdb) layout asm
 >0x3ff09eaf16  fence.tso
  0x3ff09eaf1a  auipc   a0,0x49c6
  0x3ff09eaf1e  ld      a0,430(a0)
  0x3ff09eaf22  lbu     a1,0(a0)
  0x3ff09eaf26  addi    a0,s1,52
  0x3ff09eaf2a  beqz    a1,0x3ff09eaf76
  0x3ff09eaf2c  lw      a1,0(a0)
  0x3ff09eaf2e  addiw   a2,a1,-1
  0x3ff09eaf32  sw      a2,0(a0)
  0x3ff09eaf34  sext.w  a0,a1
  0x3ff09eaf38  li      a1,1
  0x3ff09eaf3a  bne     a0,a1,0x3ff09eaf46
  0x3ff09eaf3e  ld      a0,0(s1)
  0x3ff09eaf40  ld      a1,24(a0)
  0x3ff09eaf42  mv      a0,s1
  0x3ff09eaf44  jalr    a1
  0x3ff09eaf46  auipc   a1,0x49ca
  0x3ff09eaf4a  ld      a1,466(a1)
  0x3ff09eaf4e  ld      a0,8(s0)
  0x3ff09eaf50  addi    a1,a1,16
  0x3ff09eaf52  addi    a2,s0,24
  0x3ff09eaf56  sd      a1,0(s0)
  0x3ff09eaf58  beq     a0,a2,0x3ff09eaf6c
  0x3ff09eaf5c  ld      ra,24(sp)
  0x3ff09eaf5e  ld      s0,16(sp)
  0x3ff09eaf60  ld      s1,8(sp)
  0x3ff09eaf62  addi    sp,sp,32
  0x3ff09eaf64  auipc   t1,0xffe95
  0x3ff09eaf68  jr      -516(t1)
  0x3ff09eaf6c  ld      ra,24(sp)
multi-thre Thread 0x3fec70b4d0 In: L?? PC: 0x3ff09eaf16
(gdb)
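For reference, the instruction at the faulting PC can also be identified from its raw 32-bit word (for example as printed by `x/xw $pc` in GDB). The sketch below decodes the FENCE encoding as laid out in the RISC-V unprivileged spec: opcode MISC-MEM (`0x0f`), funct3 `000`, `succ` in bits 23:20, `pred` in bits 27:24 and `fm` in bits 31:28. The function names are illustrative only, and the `rd`/`rs1` fields are ignored for simplicity.

```python
MISC_MEM_OPCODE = 0x0F  # major opcode shared by FENCE and FENCE.I


def decode_fence(word):
    """Return the fm/pred/succ fields of a FENCE word, or None if it is not a FENCE."""
    if word & 0x7F != MISC_MEM_OPCODE or (word >> 12) & 0x7 != 0:
        return None  # other opcode, or funct3 != 000 (e.g. FENCE.I)
    return {
        "fm": (word >> 28) & 0xF,
        "pred": (word >> 24) & 0xF,
        "succ": (word >> 20) & 0xF,
    }


def access_set(bits):
    """Render a 4-bit pred/succ field as a subset of 'iorw' (bit 3 = i ... bit 0 = w)."""
    return "".join(ch for ch, mask in zip("iorw", (8, 4, 2, 1)) if bits & mask)


def describe(word):
    """Give a human-readable name for a 32-bit instruction word (FENCE class only)."""
    fields = decode_fence(word)
    if fields is None:
        return "not a fence"
    if fields["fm"] == 0b1000 and fields["pred"] == 0b0011 and fields["succ"] == 0b0011:
        return "fence.tso"
    if fields["fm"] == 0:
        return f"fence {access_set(fields['pred'])},{access_set(fields['succ'])}"
    return "reserved fence encoding"


print(describe(0x8330000F))  # fence.tso
print(describe(0x0330000F))  # fence rw,rw
```

The two words differ only in the `fm` field, which is why a core that rejects a non-zero `fm` traps on `fence.tso` but accepts an ordinary `fence rw,rw`.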
Thanks - that confirms it's the fence.tso issue. See #50090 for a relevant discussion of the issue. As @jrtc27 suggests, the best workaround is really for D1 systems to have a trap handler that emulates fence.tso by executing a regular fence. There are other options, but I suspect the D1 won't see much use once faster Linux-capable cores hit the market, and hopefully there'll be a silicon revision that fixes the bug.
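To make the trap-handler idea concrete, here is a small Python model of the logic such a handler would follow. It is not firmware code and not taken from any existing implementation: the function name, the `execute_fence` callback and the return convention are all assumptions made for illustration. The idea is that on an illegal-instruction trap the handler inspects the faulting word, and if it is `fence.tso` it performs the stronger `fence rw,rw` instead and resumes at the next instruction.

```python
FENCE_TSO = 0x8330000F    # fm=1000, pred=rw, succ=rw
FENCE_RW_RW = 0x0330000F  # fm=0000, pred=rw, succ=rw (at least as strong as fence.tso)


def handle_illegal_instruction(insn_word, epc, execute_fence):
    """Model of an illegal-instruction handler that emulates fence.tso.

    insn_word     -- the 32-bit word that trapped
    epc           -- address of the trapping instruction
    execute_fence -- hypothetical callback standing in for executing a real fence
    Returns (handled, resume_pc).
    """
    if insn_word == FENCE_TSO:
        execute_fence(FENCE_RW_RW)  # substitute the stronger ordinary fence
        return True, epc + 4        # fence.tso is a 4-byte instruction; skip it
    return False, epc               # anything else is left to the normal SIGILL path


# Example with the PC from the GDB session in this thread.
issued = []
handled, resume_pc = handle_illegal_instruction(FENCE_TSO, 0x3FF09EAF16, issued.append)
print(handled, hex(resume_pc))  # True 0x3ff09eaf1a
```

Replacing `fence.tso` with `fence rw,rw` is conservative: it orders strictly more accesses, so correctness is preserved at the cost of the trap overhead.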
OK. Thanks for the explanation and pointer. I guess this bug in the D1 is a typical effect of the decentralized development of RISC-V CPUs.
For further questions I'll use that other issue.
Debian on RISC-V: clang-14 (and clang-13) gives the following.
Tips how to solve? Or will I get flamed because it's a Debian thing, or a noob thing?
Hmmm, not good. Let's see what does exist:
So clang-14 is pointing to a non-existing file?
A symlink does not solve it:
Tips how to proceed?