llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

clang-14 on RISC-V 64bit: Illegal instruction / #0 0x0000003fc00dacd8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/lib/riscv64-linux-gnu/libLLVM-14.so.1+0xe6ccd8) #54480

Closed sanderjo closed 2 years ago

sanderjo commented 2 years ago

Debian on RISC-V: clang-14 (and clang-13) gives the following.

Any tips on how to solve this? Or will I get flamed because it's a Debian thing, or a noob thing?

$ clang-14 --version
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: clang-14 --version
1.      Compilation construction
#0 0x0000003fc00dacd8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/lib/riscv64-linux-gnu/libLLVM-14.so.1+0xe6ccd8)
Illegal instruction
sipeed@sipeed:~$ ll /usr/lib/riscv64-linux-gnu/libLLVM-14.so.1+0xe6ccd8
ls: cannot access '/usr/lib/riscv64-linux-gnu/libLLVM-14.so.1+0xe6ccd8': No such file or directory

Hmmm, not good. Let's see what does exist:

sipeed@sipeed:~$ ll /usr/lib/riscv64-linux-gnu/libLLVM-1*
lrwxrwxrwx 1 root root       15 Mar  6 01:50 /usr/lib/riscv64-linux-gnu/libLLVM-13.0.1.so.1 -> libLLVM-13.so.1
lrwxrwxrwx 1 root root       15 Mar  6 01:50 /usr/lib/riscv64-linux-gnu/libLLVM-13.so -> libLLVM-13.so.1
-rw-r--r-- 1 root root 82786872 Mar  6 01:50 /usr/lib/riscv64-linux-gnu/libLLVM-13.so.1
lrwxrwxrwx 1 root root       15 Mar 11 18:28 /usr/lib/riscv64-linux-gnu/libLLVM-14.0.0.so.1 -> libLLVM-14.so.1
lrwxrwxrwx 1 root root       15 Mar 11 18:28 /usr/lib/riscv64-linux-gnu/libLLVM-14.so -> libLLVM-14.so.1
-rw-r--r-- 1 root root 92293248 Mar 11 18:28 /usr/lib/riscv64-linux-gnu/libLLVM-14.so.1

So clang-14 is pointing to a non-existing file?

A symlink does not solve it:

sipeed@sipeed:~$ sudo ln -s /usr/lib/riscv64-linux-gnu/libLLVM-14.so.1 /usr/lib/riscv64-linux-gnu/libLLVM-14.so.1+0xe6ccd8

sipeed@sipeed:~$ file /usr/lib/riscv64-linux-gnu/libLLVM-14.so.1+0xe6ccd8
/usr/lib/riscv64-linux-gnu/libLLVM-14.so.1+0xe6ccd8: symbolic link to /usr/lib/riscv64-linux-gnu/libLLVM-14.so.1

sipeed@sipeed:~$ clang-14 --version
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: clang-14 --version
1.      Compilation construction
#0 0x0000003fdd756cd8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/lib/riscv64-linux-gnu/libLLVM-14.so.1+0xe6ccd8)
Illegal instruction

Any tips on how to proceed?

sipeed@sipeed:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux bookworm/sid
Release:        unstable
Codename:       sid
sipeed@sipeed:~$ uname -a
Linux sipeed 5.4.61 #217 PREEMPT Thu Dec 30 06:50:31 UTC 2021 riscv64 GNU/Linux
sipeed@sipeed:~$
llvmbot commented 2 years ago

@llvm/issue-subscribers-backend-risc-v

jrtc27 commented 2 years ago

No, the +0x… is an offset within the loaded library, not part of the file name.
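
To make that concrete, here is a minimal shell sketch that uses only values from the report above (the variable names are illustrative). It splits a frame string into the library path and the offset, then subtracts the offset from the runtime address printed in each of the two runs.

```shell
#!/bin/sh
# Frame text as printed in the stack dump: "<library path>+<hex offset>".
frame='/usr/lib/riscv64-linux-gnu/libLLVM-14.so.1+0xe6ccd8'

lib=${frame%+0x*}   # drop the trailing "+0x..." to get the file path
off=${frame##*+}    # drop everything up to the last "+" to get the offset

echo "library: $lib"
echo "offset:  $off"

# Runtime address minus offset gives the address the library was mapped at.
base1=$(printf '0x%x' $((0x3fc00dacd8 - $off)))
base2=$(printf '0x%x' $((0x3fdd756cd8 - $off)))
echo "load base, first run:  $base1"
echo "load base, second run: $base2"
```

The two load bases differ between runs and are both page-aligned, which is what address-space randomization would produce, while the offset stays the same. Note also that this frame is `llvm::sys::PrintStackTrace`, the crash reporter itself, so the offset does not point at the instruction that actually faulted.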

This is a T-HEAD C906 / Allwinner D1 based board, right? I believe Debian’s LLVM is built with LLVM and so my guess is it ends up using FENCE.TSO, which that core fails to implement, in violation of the spec. GCC happens to not use that instruction for its atomics, but LLVM does for some. If that’s the case, it needs fixing by having firmware emulate the instructions the CPU forgot to implement.

If you run Clang under GDB you should be able to see what it’s trying to execute that fails.
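
One way to do this non-interactively, as a rough sketch: put the GDB commands in a small script and run GDB in batch mode. The temporary file and the selection of commands are assumptions made for illustration; `run`, `x/i $pc`, `bt` and `info sharedlibrary` are standard GDB commands.

```shell
#!/bin/sh
# Write a small GDB command script (the file location is arbitrary).
gdb_script=$(mktemp)
cat > "$gdb_script" <<'EOF'
run
x/i $pc
bt
info sharedlibrary
EOF

# Run only where both tools are installed; otherwise just show the command.
if command -v gdb >/dev/null 2>&1 && command -v clang-14 >/dev/null 2>&1; then
    gdb -batch -x "$gdb_script" --args clang-14 --version
else
    echo "would run: gdb -batch -x $gdb_script --args clang-14 --version"
fi
```

When the program stops on SIGILL, `x/i $pc` prints the instruction at the program counter, which should be the one the CPU refused to execute, and `info sharedlibrary` lists where each library is mapped.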

sanderjo commented 2 years ago

Thanks for your reply!

> This is a T-HEAD C906 / Allwinner D1 based board, right?

Right! The Sipeed Lichee RV Dock, with the Allwinner D1.

> If you run Clang under GDB you should be able to see what it’s trying to execute that fails.

I'm a gdb noob too, so ... is this how I should do it:

$ gdb -ex=r --args clang-14 --version
GNU gdb (Debian 10.1-2) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "riscv64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from clang-14...
(No debugging symbols found in clang-14)
Starting program: /usr/bin/clang-14 --version
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/riscv64-linux-gnu/libthread_db.so.1".

Program received signal SIGILL, Illegal instruction.
0x0000003ff09eaf16 in ?? () from /usr/lib/riscv64-linux-gnu/libLLVM-14.so.1
(gdb) bt
#0  0x0000003ff09eaf16 in ?? () from /usr/lib/riscv64-linux-gnu/libLLVM-14.so.1
(gdb)

Then, based on a Google hit (https://stackoverflow.com/questions/1902901/show-current-assembly-instruction-in-gdb):

(gdb) layout asm

which gives

(gdb) layout asm
  >0x3ff09eaf16        fence.tso
   0x3ff09eaf1a        auipc   a0,0x49c6
   0x3ff09eaf1e        ld      a0,430(a0)
   0x3ff09eaf22        lbu     a1,0(a0)
   0x3ff09eaf26        addi    a0,s1,52
   0x3ff09eaf2a        beqz    a1,0x3ff09eaf76
   0x3ff09eaf2c        lw      a1,0(a0)
   0x3ff09eaf2e        addiw   a2,a1,-1
   0x3ff09eaf32        sw      a2,0(a0)
   0x3ff09eaf34        sext.w  a0,a1
   0x3ff09eaf38        li      a1,1
   0x3ff09eaf3a        bne     a0,a1,0x3ff09eaf46
   0x3ff09eaf3e        ld      a0,0(s1)
   0x3ff09eaf40        ld      a1,24(a0)
   0x3ff09eaf42        mv      a0,s1
   0x3ff09eaf44        jalr    a1
   0x3ff09eaf46        auipc   a1,0x49ca
   0x3ff09eaf4a        ld      a1,466(a1)
   0x3ff09eaf4e        ld      a0,8(s0)
   0x3ff09eaf50        addi    a1,a1,16
   0x3ff09eaf52        addi    a2,s0,24
   0x3ff09eaf56        sd      a1,0(s0)
   0x3ff09eaf58        beq     a0,a2,0x3ff09eaf6c
   0x3ff09eaf5c        ld      ra,24(sp)
   0x3ff09eaf5e        ld      s0,16(sp)
   0x3ff09eaf60        ld      s1,8(sp)
   0x3ff09eaf62        addi    sp,sp,32
   0x3ff09eaf64        auipc   t1,0xffe95
   0x3ff09eaf68        jr      -516(t1)
   0x3ff09eaf6c        ld      ra,24(sp)
multi-thre Thread 0x3fec70b4d0 In:    L??   PC: 0x3ff09eaf16
(gdb)
asb commented 2 years ago

Thanks - that confirms it's the fence.tso issue. See #50090 for a relevant discussion of the issue. As @jrtc27 suggests, the best workaround is really for D1 systems to have a trap handler that executes a plain fence in place of the unimplemented fence.tso. There are other options, but I suspect the D1 won't see much use once faster Linux-capable cores hit the market, and hopefully there'll be a silicon revision that fixes the bug.
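
For context on why a plain fence is a workable substitute: in the RISC-V base ISA, `fence.tso` is encoded as a `FENCE` instruction whose `fm` field is `1000` and whose predecessor and successor sets are both `RW`, so it differs from `fence rw,rw` only in those top four bits. The sketch below decodes both instruction words with shell arithmetic. The two constants are derived from the ISA manual's FENCE layout rather than taken from this thread, the function name is made up for illustration, and the script only shows the encoding; it is not firmware code.

```shell
#!/bin/sh
# Print the FENCE fields of a 32-bit instruction word:
# fm = bits 31:28, pred = bits 27:24, succ = bits 23:20, opcode = bits 6:0.
decode_fence() {
    insn=$1
    printf 'fm=%x pred=%x succ=%x opcode=%02x\n' \
        $(( ($insn >> 28) & 0xf )) \
        $(( ($insn >> 24) & 0xf )) \
        $(( ($insn >> 20) & 0xf )) \
        $(( $insn & 0x7f ))
}

fence_tso=$(decode_fence 0x8330000f)    # fence.tso
fence_rw_rw=$(decode_fence 0x0330000f)  # fence rw,rw

echo "fence.tso:   $fence_tso"
echo "fence rw,rw: $fence_rw_rw"
```

A handler along the lines suggested here would presumably recognise the word `0x8330000f` on an illegal-instruction trap, execute a full `fence rw,rw` (which orders at least as much as `fence.tso`), and resume at the next instruction.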

sanderjo commented 2 years ago

OK. Thanks for the explanation and pointer. I guess this bug in the D1 is a typical effect of the decentralized development of RISC-V CPUs.

For further questions I'll use that other issue.