Closed gvarlet closed 1 year ago
I have done the exact same procedure with kernel 5.08 and it works.
Please reproduce/investigate this problem. 13MD05-90_02_05 shall be usable with 5.11/5.12.
For info:
Linux dua-MEN-F026L00 5.11.0-38-generic #42
also allows the drivers to load.
Hi, I have been able to reproduce the error:
I am investigating possible solutions to this one. It seems that the kernel modules are not being loaded because they are not signed. The first thing I tried was to disable Secure Boot, both in the BIOS and with the mokutil tool: sudo mokutil --disable-validation
Another possible solution was to edit the .config file and change the values CONFIG_MODULE_SIG=y CONFIG_MODULE_SIG_ALL=y to CONFIG_MODULE_SIG=n CONFIG_MODULE_SIG_ALL=n. I did this as well, but it seems to have no effect and loading the modules keeps failing. The better solution seems to be signing the modules, which is the approach I am taking now.
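Before going down the signing route, it may help to confirm which signature options the running kernel was actually built with. This is only a minimal sketch: the helper name `module_sig_settings` is hypothetical, and the path in the usage comment assumes the distribution ships the kernel config as /boot/config-$(uname -r), as Ubuntu normally does.

```shell
# Print the module-signature related options found in a kernel config file.
# Assumed usage (path is an assumption, see above):
#   module_sig_settings /boot/config-"$(uname -r)"
module_sig_settings() {
    # Matches set options (CONFIG_MODULE_SIG=y) as well as
    # "# CONFIG_MODULE_SIG_FORCE is not set" style lines.
    grep -E '^(# )?CONFIG_MODULE_SIG(_ALL|_FORCE)?[= ]' "$1"
}
```

The Secure Boot state itself can be queried with `mokutil --sb-state`. If CONFIG_MODULE_SIG_FORCE is not set and Secure Boot is off, an unsigned module would normally only taint the kernel rather than be rejected, so a load failure in that situation points to a cause other than signing.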
We have booted the 5.11.0 mainline kernel and it loads the modules correctly. Knowing this, we are now comparing this version with the failing versions to find the root cause of the error.
Dear all,
After some days investigating the cause of the error, I have found a solution. First of all, after talking with @mad-jsanjuan, we have determined that this issue is closely related to #246. It is reproducible not only with MDIS modules but with any module compiled against the kernel 5.11.0-41 headers or later (it is also reproducible with the latest Ubuntu 22.04 kernel).
Additionally, this bug is also reproducible using VMs.
For this example, @mad-jsanjuan and I have used a simple kernel module example that provides 2 basic kernel modules. You can find them here: https://github.com/dwmkerr/linux-kernel-module.
In short, the issue is caused by the MDIS build system, which executes the kernel headers' Makefile and thereby modifies the .config. It seems that in these kernel versions (and in newer ones) several kernel configuration entries differ, because after running the MDIS build system we get a different configuration in .config.
Take a look:
--- .config.old 2021-11-10 10:56:15.000000000 +0100
+++ .config 2022-11-23 11:42:51.603885663 +0100
@@ -1,10 +1,10 @@
#
# Automatically generated file; DO NOT EDIT.
-# Linux/x86 5.11.0-41-generic Kernel Configuration
+# Linux/x86 5.11.22 Kernel Configuration
#
-CONFIG_CC_VERSION_TEXT="gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0"
+CONFIG_CC_VERSION_TEXT="gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0"
CONFIG_CC_IS_GCC=y
-CONFIG_GCC_VERSION=90300
+CONFIG_GCC_VERSION=90400
CONFIG_LD_VERSION=234000000
CONFIG_CLANG_VERSION=0
CONFIG_LLD_VERSION=0
@@ -4783,7 +4783,6 @@
#
# PCI GPIO expanders
#
-CONFIG_GPIO_AAEON=m
CONFIG_GPIO_AMD8111=m
CONFIG_GPIO_ML_IOH=m
CONFIG_GPIO_PCI_IDIO_16=m
@@ -4930,7 +4929,6 @@
#
# Native drivers
#
-CONFIG_SENSORS_AAEON=m
CONFIG_SENSORS_ABITUGURU=m
CONFIG_SENSORS_ABITUGURU3=m
CONFIG_SENSORS_AD7314=m
@@ -5255,7 +5253,6 @@
CONFIG_INTEL_MEI_WDT=m
CONFIG_NI903X_WDT=m
CONFIG_NIC7018_WDT=m
-CONFIG_AAEON_IWMI_WDT=m
CONFIG_MEN_A21_WDT=m
CONFIG_XEN_WDT=m
@@ -5420,7 +5417,6 @@
CONFIG_MFD_WM8350_I2C=y
CONFIG_MFD_WM8994=m
CONFIG_MFD_WCD934X=m
-CONFIG_MFD_AAEON=m
CONFIG_RAVE_SP_CORE=m
CONFIG_MFD_INTEL_M10_BMC=m
# end of Multifunction device drivers
@@ -7925,7 +7921,6 @@
# LED drivers
#
CONFIG_LEDS_88PM860X=m
-CONFIG_LEDS_AAEON=m
CONFIG_LEDS_APU=m
CONFIG_LEDS_AS3645A=m
CONFIG_LEDS_LM3530=m
@@ -9769,7 +9764,6 @@
#
# Ubuntu Supplied Third-Party Device Drivers
#
-CONFIG_UBUNTU_ODM_DRIVERS=y
CONFIG_HIO=m
CONFIG_UBUNTU_HOST=m
# end of Ubuntu Supplied Third-Party Device Drivers
@@ -10742,8 +10736,6 @@
# CONFIG_DEBUG_INFO_SPLIT is not set
CONFIG_DEBUG_INFO_DWARF4=y
CONFIG_DEBUG_INFO_BTF=y
-CONFIG_PAHOLE_HAS_SPLIT_BTF=y
-CONFIG_DEBUG_INFO_BTF_MODULES=y
CONFIG_GDB_SCRIPTS=y
CONFIG_FRAME_WARN=1024
# CONFIG_STRIP_ASM_SYMS is not set
To be concrete, the line that is causing the issue is:
-CONFIG_DEBUG_INFO_BTF_MODULES=y
So the problem is that the kernel and our modules are built with different kernel configurations. For this reason, neither our MDIS modules nor any other module compiled after .config changed can be loaded; they always report the same error.
In older kernel versions (e.g. kernel 5.4.0-132) the .config file does not change, so the kernel and the modules are built with the same kernel configuration.
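To check quickly whether a headers tree has drifted in this way, the set of enabled symbols in two config files can be compared. The sketch below is only an illustration: the function name `config_symbols_lost` is hypothetical, and it only reports symbols that are set in the first file and no longer set in the second (it ignores value changes such as CONFIG_GCC_VERSION).

```shell
# List CONFIG_ symbols that are set in the first config file but not in the second.
# Example, run inside the kernel headers directory after the MDIS build:
#   config_symbols_lost .config.old .config
config_symbols_lost() {
    old_syms=$(mktemp) || return 1
    new_syms=$(mktemp) || return 1
    # Keep only the names of options that are actually set (CONFIG_FOO=...).
    sed -n 's/^\(CONFIG_[A-Za-z0-9_]*\)=.*/\1/p' "$1" | LC_ALL=C sort -u > "$old_syms"
    sed -n 's/^\(CONFIG_[A-Za-z0-9_]*\)=.*/\1/p' "$2" | LC_ALL=C sort -u > "$new_syms"
    # comm -23 prints the lines that appear only in the first (sorted) file.
    LC_ALL=C comm -23 "$old_syms" "$new_syms"
    rm -f "$old_syms" "$new_syms"
}
```

If CONFIG_DEBUG_INFO_BTF_MODULES shows up in the output, the headers are in the state described above and modules built against them are likely to be rejected.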
Steps to reproduce:
sudo apt-get purge linux-headers-5.11.0-41
sudo apt-get install linux-headers-5.11.0-41
At this point, we have clean kernel headers for the same version we are running.
Then we clone and compile one of the modules from the example mentioned above, for example the babel module.
cd babel && make
sudo insmod babel.ko
At this point, we can check that the module is properly loaded:
[ 234.966269] babel: loading out-of-tree module taints kernel.
[ 234.966347] babel: module verification failed: signature and/or required key missing - tainting kernel
[ 234.966851] babel: module loaded at 0x00000000685fc906
[ 234.966860] babel: registered correctly with major number 236
[ 234.966900] babel: device class registered correctly
[ 234.967045] babel: device class created correctly
Note that at this point the kernel prints the same lines mentioned in previous comments, complaining about loading out-of-tree modules.
Next, we build and install the MDIS drivers:
sudo make clean && sudo make && sudo make install
As you can see below, the kernel's Makefile is executed (this is the part where everything goes wrong):
men@men-MEN-F026L00:~/MDIS/13MD05-90$ sudo make
Getting Compiler/Linker settings from Linux Kernel Makefile
SYNC include/config/auto.conf.cmd
HOSTCC scripts/basic/fixdep
HOSTCC scripts/kconfig/conf.o
HOSTCC scripts/kconfig/confdata.o
HOSTCC scripts/kconfig/expr.o
LEX scripts/kconfig/lexer.lex.c
YACC scripts/kconfig/parser.tab.[ch]
HOSTCC scripts/kconfig/lexer.lex.o
HOSTCC scripts/kconfig/parser.tab.o
HOSTCC scripts/kconfig/preprocess.o
HOSTCC scripts/kconfig/symbol.o
HOSTCC scripts/kconfig/util.o
HOSTLD scripts/kconfig/conf
Cleaning .kernelsubdirs
++++++++ Preparing non-debug version of module men_mdis_kernel +++++++++++
Directory OBJ/nodbg/men_mdis_kernel created
....
Once the MDIS drivers are compiled and installed, we reboot the test setup to get a "fresh" system. Then we rebuild the babel module and try to load it again:
make clean && make
men@men-MEN-F026L00:~/software/linux-kernel-module/babel$ sudo insmod babel.ko
insmod: ERROR: could not insert module babel.ko: Invalid module format
The dmesg output:
men@men-MEN-F026L00:~/software/linux-kernel-module/babel$ dmesg
....
[ 4437.919492] module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 1, loc 000000008aa8b00b, val ffffffffc0a7d270
As you can see, we get the same error in dmesg.
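When repeating this test on several machines, the check for this specific failure can be scripted. A minimal sketch, assuming the message text is exactly the one shown above; the helper name `has_invalid_relocation` is made up for illustration.

```shell
# Read kernel log text on stdin and succeed (exit 0) if it contains the
# relocation error reported in this issue.
# Example: dmesg | has_invalid_relocation && echo "module was rejected"
has_invalid_relocation() {
    grep -q 'Skipping invalid relocation target'
}
```

This only tells us that the symptom is present. It does not say which module triggered it, so it is best run right after a single insmod or modprobe.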
To fix that, we first reinstalled the kernel headers (as in steps 1 and 2) to get "fresh" kernel headers. Then we removed the line that includes the kernel headers' Makefile from the kernelsettings.mak file (in the installed copy, located in /opt/menlinux/):
--- a/MDISforLinux/BUILD/MDIS/TPL/kernelsettings.mak
+++ b/MDISforLinux/BUILD/MDIS/TPL/kernelsettings.mak
@@ -28,8 +28,6 @@
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
-include Makefile
-
KERNEL_SETTINGS_FILE ?= /dev/null
.DEFAULT_GOAL := getsettings_for_mdis
After that, we recompile the MDIS modules, checking that the kernel's Makefile is not invoked this time.
sudo make clean && sudo make && sudo make install
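Instead of watching the build output by eye, one option is to wrap the build in a small guard that checks whether the headers' .config was touched. This is a sketch under assumptions: `run_with_config_guard` is a hypothetical helper, and the path /usr/src/linux-headers-$(uname -r)/.config in the usage comment is where Ubuntu normally installs the headers, which may differ on other setups.

```shell
# Run a command and report if it modified the given kernel .config file.
# Assumed usage (path is an assumption, see above):
#   run_with_config_guard /usr/src/linux-headers-"$(uname -r)"/.config make
# Returns 0 if the file is unchanged, 1 if the command or checksum failed,
# 2 if the command succeeded but the file content changed.
run_with_config_guard() {
    cfg=$1
    shift
    before=$(cksum < "$cfg") || return 1
    "$@" || return 1
    after=$(cksum < "$cfg") || return 1
    if [ "$before" != "$after" ]; then
        echo "warning: $cfg was modified by: $*" >&2
        return 2
    fi
}
```

With the `include Makefile` line removed, the guard should return 0. A return value of 2 would mean the headers have to be reinstalled before building any module against them.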
During the compilation we get some messages notifying us that BTF will not be generated:
LD [M] /home/men/MDIS/13MD05-90/OBJ/nodbg/men_bb_chameleon_pcitbl/men_bb_chameleon_pcitbl.ko
BTF [M] /home/men/MDIS/13MD05-90/OBJ/nodbg/men_bb_chameleon_pcitbl/men_bb_chameleon_pcitbl.ko
Skipping BTF generation for /home/men/MDIS/13MD05-90/OBJ/nodbg/men_bb_chameleon_pcitbl/men_bb_chameleon_pcitbl.ko due to unavailability of vmlinux
CC [M] /home/men/MDIS/13MD05-90/OBJ/nodbg/men_bb_d203/men_bb_d203.mod.o
LD [M] /home/men/MDIS/13MD05-90/OBJ/nodbg/men_bb_d203/men_bb_d203.ko
BTF [M] /home/men/MDIS/13MD05-90/OBJ/nodbg/men_bb_d203/men_bb_d203.ko
Skipping BTF generation for /home/men/MDIS/13MD05-90/OBJ/nodbg/men_bb_d203/men_bb_d203.ko due to unavailability of vmlinux
CC [M] /home/men/MDIS/13MD05-90/OBJ/nodbg/men_bb_d203_a24/men_bb_d203_a24.mod.o
LD [M] /home/men/MDIS/13MD05-90/OBJ/nodbg/men_bb_d203_a24/men_bb_d203_a24.ko
BTF [M] /home/men/MDIS/13MD05-90/OBJ/nodbg/men_bb_d203_a24/men_bb_d203_a24.ko
Skipping BTF generation for /home/men/MDIS/13MD05-90/OBJ/nodbg/men_bb_d203_a24/men_bb_d203_a24.ko due to unavailability of vmlinux
CC [M] /home/men/MDIS/13MD05-90/OBJ/nodbg/men_bb_smb2/men_bb_smb2.mod.o
LD [M] /home/men/MDIS/13MD05-90/OBJ/nodbg/men_bb_smb2/men_bb_smb2.ko
BTF [M] /home/men/MDIS/13MD05-90/OBJ/nodbg/men_bb_smb2/men_bb_smb2.ko
After installing the modules, we reboot the test setup and, once it has booted again, we try to load the men_lx_z25 module, and it works.
men@men-MEN-F026L00:~$ sudo modprobe men_lx_z25
[sudo] password for men:
men@men-MEN-F026L00:~$
And the dmesg log:
[ 82.119492] men_oss: loading out-of-tree module taints kernel.
[ 82.119637] men_oss: module verification failed: signature and/or required key missing - tainting kernel
[ 82.122075] MEN men_oss init_module
[ 82.174991] MEN men_chameleon init_module
[ 82.300771] MEN men_chameleon_io init_module
[ 82.406422] Init MEN Chameleon PNP subsystem
We are working on a solution that may fix both issues, as they are caused by the same Makefile.
The functional tests in test setup 1 have passed (except one test that depends on a private module that is not configured):
<testsuites disabled="0" errors="0" failures="1" tests="13" time="0.0">
After a complete check, redoing what I had done three times, it looks like kernel 5.11 doesn't allow the modules to load: