Microsemi / switchtec-kernel

A kernel module for the Microsemi PCIe switch
GNU General Public License v2.0

Installing compiled switchtec-kernel module hangs the whole system #89

Open Kajtek opened 4 years ago

Kajtek commented 4 years ago

I compiled the latest switchtec-kernel from GitHub, but after running insmod switchtec.ko the whole system hangs immediately and the machine has to be restarted.

Setup information:

user@host$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.4 LTS"

Kernel:

user@host$ uname -a
Linux host 5.3.0-62-generic #56~18.04.1-Ubuntu SMP Wed Jun 24 16:17:03 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Compiler:

user@host$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.5.0-3ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04) 

Commands:

user@host$ git clone https://github.com/Microsemi/switchtec-kernel.git
Cloning into 'switchtec-kernel'...
remote: Enumerating objects: 66, done.
remote: Counting objects: 100% (66/66), done.
remote: Compressing objects: 100% (47/47), done.
remote: Total 1825 (delta 36), reused 45 (delta 18), pack-reused 1759
Receiving objects: 100% (1825/1825), 454.96 KiB | 8.92 MiB/s, done.
Resolving deltas: 100% (1183/1183), done.
user@host$ cd switchtec-kernel/
user@host$ make
  VER   1.4
make -C /lib/modules/5.3.0-62-generic/build M=$PWD modules
make[1]: Entering directory '/usr/src/linux-headers-5.3.0-62-generic'
  CC [M]  /home/user/switchtec-kernel/switchtec.o
  CC [M]  /home/user/switchtec-kernel/ntb_hw_switchtec.o
  Building modules, stage 2.
  MODPOST 2 modules
  CC      /home/user/switchtec-kernel/ntb_hw_switchtec.mod.o
  LD [M]  /home/user/switchtec-kernel/ntb_hw_switchtec.ko
  CC      /home/user/switchtec-kernel/switchtec.mod.o
  LD [M] /home/user/switchtec-kernel/switchtec.ko
make[1]: Leaving directory '/usr/src/linux-headers-5.3.0-62-generic'
user@host$ sudo insmod switchtec.ko # After this the whole system does not respond to anything. Restart is required.

After the restart, syslog contains no information about this event.

I also tried the official upstream version of the module, since it is available, but it does not behave as described on the GitHub page.

Commands for upstream version:

user@host$ sudo modprobe switchtec
user@host$ lsmod | grep switchtec
switchtec              36864  0
user@host$ ls /dev/switchtec*
ls: cannot access '/dev/switchtec*': No such file or directory

After loading the module there are no /dev/switchtec# devices. Do you know why?

I attached the lspci -vv log and the dmesg log taken after the modprobe command.

dmesg.log lspci.log

lsgunth commented 4 years ago

Looks like you have a Gen4 switch and support for that was only added very recently in the upstream kernel (v5.6).

I have no idea how the switchtec module can hard crash your system without a kernel panic or anything. Kind of has to be a hardware bug for that to happen.
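
For anyone hitting the same symptom, the v5.6 requirement mentioned above can be checked with a short shell sketch. This is only an illustration, not part of the driver or its tooling: kernel_at_least is a hypothetical helper name, and it assumes GNU sort with the -V (version sort) option. Distribution kernels sometimes backport drivers, so the result is a hint rather than proof.

```shell
# Hypothetical helper (not part of switchtec-kernel): succeeds if the
# kernel release in $1 is at least the version in $2.
# Assumes GNU sort, which provides -V (version sort).
kernel_at_least() {
    have=${1%%-*}   # drop the distro suffix: 5.3.0-62-generic -> 5.3.0
    want=$2
    # The wanted version must sort first (or be equal) to count as satisfied.
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

if kernel_at_least "$(uname -r)" 5.6; then
    echo "kernel is v5.6 or newer: the in-tree switchtec driver should know Gen4 switches"
else
    echo "kernel is older than v5.6: the in-tree switchtec driver likely lacks Gen4 support"
fi
```

On the 5.3.0-62-generic kernel from the report, the second branch is taken, which matches the missing /dev/switchtec# devices after modprobe.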