ClangBuiltLinux / linux

Linux kernel source tree

Building VirtualBox kernel modules with Clang and DKMS #1104

Closed torvic9 closed 2 years ago

torvic9 commented 4 years ago

Hi,

I don't know where to report this, but...let me try :) feel free to close if it's the wrong place.

So I finally managed to build the current 5.7.10 kernel with LLVM=1 LLVM_IAS=1, using clang 11 from the release/11.x branch and the IAS-related patches I found (among other places) here: https://github.com/ClangBuiltLinux/linux/issues/1049 , which I assembled together here: https://github.com/torvic9/linux57-vd/blob/master/0006-clang-llvm-ias-support.patch

The problem is that I cannot build the VBox host modules using CC=clang LD=ld.lld. I get the following errors:

./arch/x86/include/asm/page_64.h:49:2: error: expected '(' after 'asm'
        alternative_call_2(clear_page_orig,
        ^
/arch/x86/include/asm/segment.h:266:2: error: expected '(' after 'asm'
        alternative_io ("lsl %[seg],%[p]",
        ^

./arch/x86/include/asm/special_insns.h:212:2: error: expected '(' after 'asm'
        alternative_io(".byte " __stringify(NOP_DS_PREFIX) "; clflush %P0",
        ^

and many more of the same type.

Compiling the kernel with clang10 and without LLVM_IAS results in working VBox modules.

Is this an issue with the fact that I used LLVM_IAS for the kernel? Or is it rather related to using clang11 instead of clang10?

Thanks!

nickdesaulniers commented 4 years ago

So this issue is only observed with clang-11, but not clang-10? If so, that's curious. Are you building with both explicitly suffixed? I.e., make CC=clang-11 and make CC=clang-10 as opposed to make CC=clang?

torvic9 commented 4 years ago

I'm using LLVM=1 (and additionally LLVM_IAS=1 with clang-11).

nathanchance commented 4 years ago

I have re-read this thread a couple times and I cannot follow it.

Is there any way to get a TL;DR with some commands to reproduce this issue locally? If this is a LLVM regression from 10.0.x to 11.0.x, we might have some time to get this fixed in ToT then backport to LLVM 11.0.0 before it ships. If not, we should ideally shoot for LLVM 11.0.1. We are committing to supporting the latest stable release from LLVM (patch) so it would be nice not to have a regression in LLVM 11 right off the bat :)

torvic9 commented 4 years ago

Hi Nathan,

I have not tested it for some time now, but let me try to give you a TL;DR. Please note that I'm not very knowledgeable about this topic, so I cannot exclude the possibility that the error comes from my system.

  1. build clang (here: with ThinLTO and PGO) with the build-llvm.py script using release-11.x branch
  2. build the 5.8 kernel with that toolchain (make LLVM=1)
  3. use DKMS to build the VBox modules by passing CC=clang LD=ld.lld to DKMS' make command

Result: a lot of ASM related error messages as mentioned in OP. I can post the whole build log file later this week.

Instead, when using clang-10 from the Arch repos to build the kernel (version 10.0.1 at the time of writing), the VBox modules build just fine without errors. Also builds fine with clang-10 built with the build-llvm.py script, although it's been more than a month since I last built clang-10.

Not related to IAS btw, as the errors pop up with and without using LLVM_IAS=1 with clang-11.

Hope this helps.

nathanchance commented 4 years ago

I am starting to think this might be something with DKMS because it builds fine without it? If I am holding it wrong, let me know. I would like to avoid installing virtualbox-dkms on my server.

$ cd "$(mktemp -d)"

$ git clone https://github.com/ClangBuiltLinux/tc-build
...

$ tc-build/build-llvm.py --assertions --branch "release/11.x" --build-stage1-only --projects "clang;lld" --targets X86
...

$ curl -LSs https://download.virtualbox.org/virtualbox/6.1.12/VirtualBox-6.1.12a.tar.bz2 | tar -xjf -

$ curl -LSs https://github.com/archlinux/svntogit-community/raw/20aca7aca83f45bb1505c44955a3614d461f171f/trunk/021-linux-5-8.patch | patch -d VirtualBox-6.1.12 -p1
...

$ VirtualBox-6.1.12/src/VBox/HostDrivers/linux/export_modules.sh --folder vboxmod
VirtualBox-6.1.12/src/VBox/HostDrivers/linux/export_modules.sh: 120: [: Illegal number:
VirtualBox-6.1.12/src/VBox/HostDrivers/linux/export_modules.sh: 204: [: Illegal number:

$ curl -LSs https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.8.3.tar.xz | tar -xJf -

$ PATH=${PWD}/tc-build/build/llvm/stage1/bin:${PATH} make -C linux-5.8.3 -skj"$(nproc)" LLVM=1 defconfig bzImage

$ PATH=${PWD}/tc-build/build/llvm/stage1/bin:${PATH} make -C linux-5.8.3 -skj"$(nproc)" LLVM=1 M=${PWD}/vboxmod
/tmp/tmp.l1J0MXwOsz/vboxmod/vboxdrv/SUPDrvTracer.o: warning: objtool: .text+0x7: indirect jump found in RETPOLINE build
/tmp/tmp.l1J0MXwOsz/vboxmod/vboxdrv/SUPDrvTracer.o: warning: objtool: supdrvTracerProbeFireStub() is missing an ELF size annotation
/tmp/tmp.l1J0MXwOsz/vboxmod/vboxdrv/r0drv/linux/memuserkernel-r0drv-linux.o: warning: objtool: VBoxHost_RTR0MemKernelCopyFrom()+0xe: redundant CLD
/tmp/tmp.l1J0MXwOsz/vboxmod/vboxdrv/r0drv/linux/memuserkernel-r0drv-linux.o: warning: objtool: VBoxHost_RTR0MemKernelCopyTo()+0xe: redundant CLD
WARNING: Symbol version dump "Module.symvers" is missing.
         Modules may not have dependencies or modversions.
WARNING: modpost: Symbol info of vmlinux is missing. Unresolved symbol check will be entirely skipped.

$ fd -e ko . vboxmod
vboxmod/vboxdrv/vboxdrv.ko

torvic9 commented 4 years ago

I repeated your procedure, except for the toolchain build, and effectively I don't get any errors either. So it could be DKMS, or maybe even the kernel config?

nathanchance commented 4 years ago

It certainly could be configuration related. If you want to upload that, I can try it locally.

torvic9 commented 4 years ago

I'm now trying with the default Arch config and I'll report back. I guess it's enough to build bzImage only, without the modules?

EDIT: Arch config works, too. EDIT 2: Mine works as well. I think you're right - this could be an issue with DKMS.

nickdesaulniers commented 4 years ago

@ColinIanKing I think mentioned encountering issues with DKMS when building with Clang. I don't know the details, @ColinIanKing was that ever reported/resolved?

dileks commented 4 years ago

@torvic9

Which version of DKMS?

torvic9 commented 4 years ago

pacman -Qi dkms | grep -i vers
Version                  : 2.8.3-1

@nickdesaulniers , you probably mean this: https://github.com/dell/dkms/issues/124

nickdesaulniers commented 3 years ago

It would be good to have precise steps to reproduce. I'm not really familiar with DKMS, so it's not clear to me how to reproduce this, which is necessary to even get started.

ColinIanKing commented 3 years ago

I'll try to post some examples in the next day or so. The particular example we have is zfs from the zfsutils-linux Debian or Ubuntu source package, but I'll try to find some simpler examples that don't take forever.

torvic9 commented 3 years ago

I've just tried this again, with VirtualBox 6.1.20 and DKMS. Kernel 5.12 compiled with clang 12.0.1, running on Arch.

DKMS works with the following addition to dkms.conf:

LLVM_UTILS="CC=clang CXX=clang++ LD=ld.lld AR=llvm-ar NM=llvm-nm OBJCOPY=llvm-objcopy OBJSIZE=llvm-size STRIP=llvm-strip"
echo "MAKE[0]=\"make ${LLVM_UTILS} KERNELRELEASE=${_kernver}\"" >> $srcdir/dkms.conf

($_kernver simply points to a version file which contains the name of the kernel)

However, and this is quite remarkable, it only works when the kernel is NOT compiled with LTO. With an LTO kernel, I get ld.lld: error: permission denied during the final linking phase, without any further explanation. I have no clue what's going on.
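Whether a given kernel was built with LTO can be read from its config before DKMS is run. A minimal sketch, assuming the symbol names of the 5.12-era Kconfig (CONFIG_LTO_CLANG_THIN, and CONFIG_LTO_CLANG_FULL as its full-LTO sibling); lto_mode is a hypothetical helper name, not part of dkms or kbuild:

```shell
#!/bin/sh
# Sketch: report which Clang LTO mode a kernel config selects.
# Assumption: symbol names as in the 5.12-era arch/Kconfig.
# lto_mode is a hypothetical helper, not part of dkms or kbuild.
lto_mode() {
    config_file=$1
    if grep -q '^CONFIG_LTO_CLANG_THIN=y' "$config_file"; then
        echo thin
    elif grep -q '^CONFIG_LTO_CLANG_FULL=y' "$config_file"; then
        echo full
    else
        echo none
    fi
}

# Example (the config location is an assumption and varies by distro):
# lto_mode "/lib/modules/$(uname -r)/build/.config"
```

A result of "thin" would indicate that the external module link goes through the ThinLTO cache, which is the code path that fails here.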

ColinIanKing commented 3 years ago

https://wiki.kubuntu.org/Kernel/Dev/DKMSPackaging is a good article on getting a simple hello world dkms example built

ColinIanKing commented 3 years ago

I built Ubuntu 5.11.0-3 with clang, installed the kernel in Ubuntu 21.04 Hirsute, and then tinkered with a simple hello world DKMS build using info from torvic9 (above).

https://kernel.ubuntu.com/~cking/hello-0.1.tar is a simple example DKMS demo that I put into /usr/src and then did:

sudo dkms add -m hello -v 0.1
sudo dkms build -m hello -v 0.1
sudo dkms install -m hello -v 0.1
sudo modprobe hello

So this works fine. It would be nice if dkms had the smarts to set the environment correctly for clang automatically.
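One way such "smarts" could look is to derive the make arguments from the config of the kernel being built against. This is only a sketch: toolchain_args is a hypothetical helper that dkms does not ship, and it assumes the kernel config records CONFIG_CC_IS_CLANG and CONFIG_LD_IS_LLD, as recent kernels do.

```shell
#!/bin/sh
# Sketch: derive make arguments from the target kernel's config.
# toolchain_args is a hypothetical helper; dkms does not provide it.
toolchain_args() {
    config_file=$1
    args=""
    # CONFIG_CC_IS_CLANG=y is recorded when the kernel was configured with clang.
    if grep -q '^CONFIG_CC_IS_CLANG=y' "$config_file"; then
        args="CC=clang"
    fi
    # CONFIG_LD_IS_LLD=y is recorded when the linker was ld.lld.
    if grep -q '^CONFIG_LD_IS_LLD=y' "$config_file"; then
        args="${args:+$args }LD=ld.lld"
    fi
    # Empty output means: keep the default (GCC/binutils) toolchain.
    echo "$args"
}

# Example (the config location is an assumption and varies by distro):
# make $(toolchain_args /lib/modules/"$(uname -r)"/build/.config) ...
```

The sketch deliberately returns unsuffixed tool names; a versioned toolchain such as clang-12 would still need to be selected by hand.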

dileks commented 3 years ago

That LLD ... permission denied seems to be a local or distro specific problem.

Here I am successful:

root# git diff /usr/src/virtualbox-6.1.20/dkms.conf.orig /usr/src/virtualbox-6.1.20/dkms.conf
diff --git a/usr/src/virtualbox-6.1.20/dkms.conf.orig b/usr/src/virtualbox-6.1.20/dkms.conf
index eedda93..98bb523 100644
--- a/usr/src/virtualbox-6.1.20/dkms.conf.orig
+++ b/usr/src/virtualbox-6.1.20/dkms.conf
@@ -1,6 +1,13 @@
 PACKAGE_NAME="virtualbox"
 PACKAGE_VERSION="6.1.20"
-CLEAN="rm -f *.*o"
+LLVM_VER="12"
+CC_FOR_BUILD="clang-$LLVM_VER"
+LD_FOR_BUILD="ld.lld-$LLVM_VER"
+LLVM_UTILS="AR=llvm-ar-$LLVM_VER NM=llvm-nm-$LLVM_VER OBJCOPY=llvm-objcopy-$LLVM_VER OBJDUMP=llvm-objdump-$LLVM_VER READELF=llvm-readelf-$LLVM_VER STRIP=llvm-strip-$LLVM_VER"
+LLVM_IAS="LLVM_IAS=1"
+MAKE_OPTS="CC=${CC_FOR_BUILD} LD=${LD_FOR_BUILD} ${LLVM_UTILS} ${LLVM_IAS}"
+MAKE[0]="make V=1 ${MAKE_OPTS} -C ${kernel_source_dir} M=${dkms_tree}/${PACKAGE_NAME}/${PACKAGE_VERSION}/build"
+CLEAN="make -C ${kernel_source_dir} M=${dkms_tree}/${PACKAGE_NAME}/${PACKAGE_VERSION}/build clean"
 BUILT_MODULE_NAME[0]="vboxdrv"
 BUILT_MODULE_LOCATION[0]="vboxdrv"
 DEST_MODULE_LOCATION[0]="/updates"

Building and installing the virtualbox DKMS module with the above modification (here: Clang-LTO built linux-image):

root# dkms build -m virtualbox -v 6.1.20 -k 5.12.0-1-amd64-clang12-lto

Kernel preparation unnecessary for this kernel.  Skipping...

Building module:
cleaning build area...
make -j4 KERNELRELEASE=5.12.0-1-amd64-clang12-lto V=1 CC=clang-12 LD=ld.lld-12 AR=llvm-ar-12 NM=llvm-nm-12 OBJCOPY=llvm-objcopy-12 OBJDUMP=llvm-objdump-12 READELF=llvm-readelf-12 STRIP=llvm-strip-12 LLVM_IAS=1 -C /lib/modules/5.12.0-1-amd64-clang12-lto/build M=/var/lib/dkms/virtualbox/6.1.20/build.........................
cleaning build area...

DKMS: build completed.

root# dkms install -m virtualbox -v 6.1.20 -k 5.12.0-1-amd64-clang12-lto

vboxdrv.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.12.0-1-amd64-clang12-lto/updates/dkms/

vboxnetadp.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.12.0-1-amd64-clang12-lto/updates/dkms/

vboxnetflt.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.12.0-1-amd64-clang12-lto/updates/dkms/

depmod........

DKMS: install completed.

Listing new kernel-modules:

root# LC_ALL=C ll /lib/modules/5.12.0-1-amd64-clang12-lto/updates/dkms/
total 840K
drwxr-xr-x 2 root root 4.0K Apr 30 11:16 .
drwxr-xr-x 3 root root 4.0K Apr 30 11:16 ..
-rw-r--r-- 1 root root 746K Apr 30 11:16 vboxdrv.ko
-rw-r--r-- 1 root root  21K Apr 30 11:16 vboxnetadp.ko
-rw-r--r-- 1 root root  60K Apr 30 11:16 vboxnetflt.ko

dileks commented 3 years ago

Debian's latest 5.10 kernel works, too.

root# dkms status -m virtualbox
virtualbox, 6.1.20, 5.10.0-6-amd64, x86_64: installed
virtualbox, 6.1.20, 5.12.0-1-amd64-clang12-lto, x86_64: installed

torvic9 commented 3 years ago

seems to be a local or distro specific problem.

You are probably right, but I don't know what could be the cause :( (It's not a distro clang, but one built with build-llvm.py script.)

dileks commented 3 years ago

I was able to start my FreeDOS v1.2 image in VBox-Qt-Gui.

root# modprobe vboxdrv
root# lsmod | grep vbox
vboxdrv               557056  0

dileks commented 3 years ago

@torvic9

Dunno Arch Linux or its vbox dkms.conf in detail. You can try to adapt my above modifications.

Anyway, it is good to see a Clang-LTO enabled Linux-kernel works with VirtualBox v6.1.20.

torvic9 commented 3 years ago

Yes, I'm going to add V=1 to the (otherwise very similar) make command in order to hopefully get more info. I'll report back.

dileks commented 3 years ago

Just DO it :-).

torvic9 commented 3 years ago

V=1 does nothing.

@dileks, do you know if it is possible to disable LTO for the virtualbox modules only, maybe something like -fno-lto? It seems to automatically enable it:

  AR [M]  /home/unsorted/linux512-vd-virtualbox-modules/src/vboxhost/6.1.20_OSE/build/vboxdrv/vboxdrv.o
  LTO [M] /home/unsorted/linux512-vd-virtualbox-modules/src/vboxhost/6.1.20_OSE/build/vboxdrv/vboxdrv.lto.o
ld.lld: error: Permission denied
make[3]: *** [scripts/Makefile.modpost:121: [...]

EDIT: Nevermind! KBUILD_VERBOSE and/or VBOX_LNX_VERBOSE has to be set to 1 to get verbose output.

dileks commented 3 years ago

@torvic9

Dunno if this is a good idea to build the vbox module w/o LTO.

Interesting that passing V=1 does not show full make-lines when building vbox-dkms.

Hmmm, 6.1.20_OSE is that from upstream?

torvic9 commented 3 years ago

V=1 gets abbreviated to V=, probably a configuration/setup issue on my end.

The actual error message is:

  ld.lld -m elf_x86_64 --thinlto-cache-dir=.thinlto-cache -mllvm -import-instr-limit=5   
-r -o /home/unsorted/linux512-vd-virtualbox-modules/src/vboxhost/6.1.20_OSE/build/vboxdrv/vboxdrv.lto.o
 --whole-archive /home/unsorted/linux512-vd-virtualbox-modules/src/vboxhost/6.1.20_OSE/build/vboxdrv/vboxdrv.o
ld.lld: error: Permission denied

(yes, upstream via Arch - and I just saw that we have a new 6.1.22 release now)

EDIT: I'm giving this a break, as the LTO error is most certainly an issue on my side and not with either clang, dkms or vbox. Thanks for your help!

ColinIanKing commented 3 years ago

Perhaps the /home/unsorted/linux512-vd-virtualbox-modules/src/vboxhost/6.1.20_OSE/build/vboxdrv/vboxdrv.o obj file can't be overwritten. Try manually removing it before kicking off a build.

torvic9 commented 3 years ago

The build directory is auto-cleaned on each try. Could it be this maybe: --thinlto-cache-dir=.thinlto-cache ? I cannot find such a directory anywhere.

nathanchance commented 3 years ago

What folder are you running the dkms command in? It is possible that the ThinLTO cache is being created there, rather than in the build folder, and the permissions might not be right. You can try to confirm if it is that by just removing that line from the Makefile; it is just to speed up incremental builds.

torvic9 commented 3 years ago

I'm using Arch's makepkg system, which runs dkms in $BUILDDIR, a user-chosen directory. Normally, I simply use /tmp with 777 permissions. However, I suspect the VBox build system tries to create the ThinLTO cache folder in the VBox source directory, which is not user-writable because it resides in /usr/src and is only symlinked by dkms on build. I'm now looking to add -Wl,--thinlto-cache-dir=/tmp to LDFLAGS, but I don't know how this can be done by editing the Makefiles. EDIT: I added it to EXTRA_LDFLAGS, but lld complains that it is an unknown argument. EDIT 2: It has to be used without -Wl. It now goes one step further until another permission denied error occurs...

by just removing that line from the Makefile

Which line do you mean?
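The suspicion above (the cache directory ends up somewhere the build user cannot write) can be checked before starting a build. A minimal sketch, assuming a POSIX shell; pick_cache_dir and the fallback layout are hypothetical, and only the --thinlto-cache-dir flag itself comes from the ld.lld command line quoted earlier:

```shell
#!/bin/sh
# Sketch: choose a ThinLTO cache location the current user can write to.
# pick_cache_dir is a hypothetical helper.
#   $1: directory the build would normally place .thinlto-cache in
#   $2: fallback root, used when $1 is missing or not writable
pick_cache_dir() {
    build_dir=$1
    fallback_root=$2
    if [ -d "$build_dir" ] && [ -w "$build_dir" ]; then
        echo "$build_dir/.thinlto-cache"
    else
        echo "$fallback_root/thinlto-cache"
    fi
}

# Example: print the flag that would be handed to ld.lld.
# echo "--thinlto-cache-dir=$(pick_cache_dir /usr/src/some-module /tmp)"
```

Note that the -w test reflects the effective user, so running it under sudo or fakeroot can give a different answer than the actual build sees.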

torvic9 commented 3 years ago

Please do not waste your time with my setup issue...

IMO this issue can be closed, Sedat showed that it works. This is more likely something for the dkms people. Thanks @ all.

torvic9 commented 3 years ago

Yes! I finally solved the issue by dirty-hacking the relevant scripts/Makefile.xxx kernel files, and adding --thinlto-cache-dir=/tmp/thinlto to their relevant LDFLAGS settings.

torvic9 commented 3 years ago

For future reference, especially for Arch users who want to keep using the native build system, here is a dirty kernel diff for 5.12 that works around the issue:

diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 8cd67b1..771ba34 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -186,6 +186,11 @@ endif
 part-of-module = $(if $(filter $(basename $@).o, $(real-obj-m)),y)
 quiet_modtag = $(if $(part-of-module),[M],   )

+# Make sure that the ThinLTO cache is located in a user-writable location
+ifdef CONFIG_LTO_CLANG_THIN
+EXTRA_LDFLAGS += --thinlto-cache-dir=/tmp/thinlto
+endif # CONFIG_LTO_CLANG_THIN
+
 modkern_cflags =                                          \
    $(if $(part-of-module),                           \
        $(KBUILD_CFLAGS_MODULE) $(CFLAGS_MODULE), \
diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
index 735e11e..619e6cd 100644
--- a/scripts/Makefile.modfinal
+++ b/scripts/Makefile.modfinal
@@ -35,6 +35,11 @@ ifdef CONFIG_LTO_CLANG
 # avoid a second slow LTO link
 prelink-ext := .lto

+# Make sure that the ThinLTO cache is located in a user-writable location
+ifdef CONFIG_LTO_CLANG_THIN
+KBUILD_LDFLAGS += --thinlto-cache-dir=/tmp/thinlto
+endif # CONFIG_LTO_CLANG_THIN
+
 # ELF processing was skipped earlier because we didn't have native code,
 # so let's now process the prelinked binary before we link the module.

-- 

--> Link to full patch file

dileks commented 3 years ago

Normally, you have a .thinlto directory in your kernel build-directory.

torvic9 commented 3 years ago

That directory is owned by root on Arch and not writable by the user. It's in /lib/modules/-kernelname- and symlinked to /usr/src. The alternative would be to copy the build dir to another location and make dkms point to it, but that's IMO more complicated (indeed, I haven't been successful with that).

(Note: I mean a kernel that is already properly installed with pacman)

dileks commented 3 years ago

Maybe that .thinlto dir location should be user-configurable via kbuild/kconfig?

Alternatively, you use sudo dkms ...?

torvic9 commented 3 years ago

Arch uses fakeroot dkms, for the Arch default kernel: https://github.com/archlinux/svntogit-community/blob/packages/virtualbox-host-modules-arch/trunk/PKGBUILD

Maybe that .thinlto dir location should be user-configurable via kbuild/kconfig?

Yes. I will create a patch to make it configurable.

dileks commented 3 years ago

fakeroot is an alternative (which I no longer use in my kernel build-script).

torvic9 commented 3 years ago

Here is a new diff. I really don't know much about kernel internals, but it seems to work in a first test:

diff --git a/Makefile b/Makefile
index 78b0941..70d4e0d 100644
--- a/Makefile
+++ b/Makefile
@@ -904,7 +904,8 @@ endif
 ifdef CONFIG_LTO_CLANG
 ifdef CONFIG_LTO_CLANG_THIN
 CC_FLAGS_LTO   := -flto=thin -fsplit-lto-unit
-KBUILD_LDFLAGS += --thinlto-cache-dir=$(extmod-prefix).thinlto-cache
+export thinlto-dir = $(if $(CONFIG_LTO_CLANG_THIN_CACHEDIR),$(CONFIG_LTO_CLANG_THIN_CACHEDIR)/)
+KBUILD_LDFLAGS += --thinlto-cache-dir=$(thinlto-dir)$(extmod-prefix).thinlto-cache
 else
 CC_FLAGS_LTO   := -flto
 endif
diff --git a/arch/Kconfig b/arch/Kconfig
index ecfd352..ae54c50 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -690,6 +690,16 @@ config LTO_CLANG_THIN
        https://clang.llvm.org/docs/ThinLTO.html

      If unsure, say Y.
+
+config LTO_CLANG_THIN_CACHEDIR
+   string "Clang ThinLTO cache directory"
+   depends on LTO_CLANG_THIN
+   default ""
+   help
+     This option allows users to choose a directory that stores
+     Clang's ThinLTO cache.
+     Leave empty for default.
+
 endchoice

 config HAVE_ARCH_WITHIN_STACK_FRAMES
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 8cd67b1..bed63db 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -186,6 +186,10 @@ endif
 part-of-module = $(if $(filter $(basename $@).o, $(real-obj-m)),y)
 quiet_modtag = $(if $(part-of-module),[M],   )

+ifdef CONFIG_LTO_CLANG_THIN
+KBUILD_LDFLAGS += --thinlto-cache-dir=$(thinlto-dir)$(extmod-prefix).thinlto-cache
+endif
+
 modkern_cflags =                                          \
    $(if $(part-of-module),                           \
        $(KBUILD_CFLAGS_MODULE) $(CFLAGS_MODULE), \
diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
index 735e11e..da0995c 100644
--- a/scripts/Makefile.modfinal
+++ b/scripts/Makefile.modfinal
@@ -35,6 +35,10 @@ ifdef CONFIG_LTO_CLANG
 # avoid a second slow LTO link
 prelink-ext := .lto

+ifdef CONFIG_LTO_CLANG_THIN
+KBUILD_LDFLAGS += --thinlto-cache-dir=$(thinlto-dir)$(extmod-prefix).thinlto-cache
+endif
+
 # ELF processing was skipped earlier because we didn't have native code,
# so let's now process the prelinked binary before we link the module.

torvic9 commented 3 years ago

I think this can be closed for now. For me, the problem is solved with the above patch. (PS: thanks @nickdesaulniers for the invitation to ClangBuiltLinux, but I do not have enough knowledge to help you with your project.)
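For readers who do not speak Makefile, the way the patch composes the cache path can be mirrored in shell. This is only an illustration: thinlto_cache_flag is a hypothetical helper, and the quote stripping is an assumption about how kconfig string values may reach make in kernels of that era, not something the patch itself does.

```shell
#!/bin/sh
# Sketch: mirror how the patch builds the --thinlto-cache-dir flag.
# thinlto_cache_flag is a hypothetical helper.
#   $1: value of CONFIG_LTO_CLANG_THIN_CACHEDIR (may be empty)
#   $2: value of $(extmod-prefix): empty for in-tree builds, "<dir>/" for M=<dir>
thinlto_cache_flag() {
    cachedir=$1
    extmod_prefix=$2
    # Assumption: the string value may arrive wrapped in double quotes,
    # so strip one leading and one trailing quote before the emptiness test.
    cachedir=${cachedir#\"}
    cachedir=${cachedir%\"}
    # Same shape as $(if $(CONFIG_...),$(CONFIG_...)/): append the slash
    # only when a directory was configured.
    thinlto_dir=${cachedir:+$cachedir/}
    echo "--thinlto-cache-dir=${thinlto_dir}${extmod_prefix}.thinlto-cache"
}

# Examples:
# thinlto_cache_flag "" ""            # default: cache next to the build
# thinlto_cache_flag "/tmp" "vboxmod/" # configured directory, external module
```

With an empty option the result matches the flag the unpatched 5.12 Makefile passes, so the default behaviour is meant to stay unchanged.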

nathanchance commented 3 years ago

I wonder if it is worth upstreaming that diff?

StatusCode404 commented 3 years ago

Hi All, I'm experiencing a similar problem and thought it was a Vbox issue and filed a bug with them.... https://www.virtualbox.org/ticket/20425

So who rightfully owns this problem? Is it Virtualbox?

@torvic9 is proposing to close this for now, but what is the procedure to fix this for the masses who are now trying Clang/llvm thin lto'ed kernels? Perhaps someone could outline in a reply the steps to resolve the problem? So just to be clear, we cannot run vbox guest vms on host kernels built with clang/llvm with thin lto. In my case, switching the default "cc" didn't fix the problem and "gcc" looks like it is hardcoded when trying to build the vboxdrv modules.

Thanks in advance if someone could write up the work-around procedure!

nathanchance commented 3 years ago

@StatusCode404 it is most likely something with DKMS: https://github.com/dell/dkms/issues/124

I have no idea what the workaround is. I was able to build the VirtualBox module without DKMS here, that might work for you.

torvic9 commented 3 years ago

Thanks in advance if someone could write up the work-around procedure!

Hi, you have to tell DKMS to use Clang instead of GCC. It is not an issue with VBox but rather with DKMS. As documented in the manpage, you can use MAKE[0] in the file dkms.conf for that. A simple example from my Arch PKGBUILD:

    LLVM_UTILS="CC=$CLPF/clang CXX=$CLPF/clang++ LD=$CLPF/ld.lld \
     AS=$CLPF/llvm-as AR=$CLPF/llvm-ar NM=$CLPF/llvm-nm \
     OBJCOPY=$CLPF/llvm-objcopy OBJSIZE=$CLPF/llvm-size \
     STRIP=$CLPF/llvm-strip"
    echo "MAKE[0]=\"make ${LLVM_UTILS} KERNELRELEASE=${_kernver}\"" >> $srcdir/dkms.conf

(In the above example, $CLPF is simply pointing to the path of the toolchain, whereas $LLVM_UTILS regroups all the different llvm/clang tools that are needed.)
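The same idea can be wrapped in a small function so that the line is generated rather than hand-edited. A minimal sketch under stated assumptions: dkms_make_line is a hypothetical helper, it passes only a reduced subset of the tools listed above, and the example paths are placeholders.

```shell
#!/bin/sh
# Sketch: emit a MAKE[0] line for dkms.conf that points at an LLVM toolchain.
# dkms_make_line is a hypothetical helper, not something dkms provides.
#   $1: directory containing the LLVM tools (the $CLPF of the example above)
#   $2: kernel release string
dkms_make_line() {
    prefix=$1
    kernver=$2
    utils="CC=$prefix/clang LD=$prefix/ld.lld AR=$prefix/llvm-ar"
    utils="$utils NM=$prefix/llvm-nm OBJCOPY=$prefix/llvm-objcopy"
    utils="$utils STRIP=$prefix/llvm-strip"
    printf 'MAKE[0]="make %s KERNELRELEASE=%s"\n' "$utils" "$kernver"
}

# Example (placeholder paths; append to the dkms.conf of the package):
# dkms_make_line /opt/llvm/bin 5.12.0 >> dkms.conf
```

Generating the line at packaging time keeps the toolchain path in one place, which helps when the same PKGBUILD is reused for several kernels.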

StatusCode404 commented 3 years ago

Thanks for the pointer!

v-fox commented 3 years ago

@nathanchance

I wonder if it is worth upstreaming that diff?

This doesn't just break VB in DKMS on Arch, it breaks all 3rd-party module packages on Open Build Service, meaning that it breaks all 3rd-party modules, like zfs and nvidia-driver, in most/all distros. I've been using a similar hack for 5.12 but it broke in 5.13. So it is very much worth upstreaming.

StatusCode404 commented 3 years ago

I agree, the fix needs to be able to cater for both gcc and clang-llvm-lld. Figure it out during execution and use the right tool.

Failing to do this will mean little adoption of clang-built Linux kernels, which seem to perform better in benchmarks.

torvic9 commented 3 years ago

@v-fox , well I can try to send the patch upstream for 5.14-rc1. Still, I think it's not really a kernel issue so I guess the chances of acceptance are small... anyone tried using KBUILD_EXTMOD instead?

As for gcc, no idea, there is currently no upstream LTO support for it unless I'm mistaken.

v-fox commented 3 years ago

well I can try to send the patch upstream for 5.14-rc1. Still, I think it's not really a kernel issue so I guess the chances of acceptance are small... anyone tried using KBUILD_EXTMOD instead?

As for gcc, no idea, there is currently no upstream LTO support for it unless I'm mistaken.

That is very much the kernel's issue. Neither using clang nor enabling LTO should make out-of-tree modules unbuildable; deficiencies should be fixed. I rebased your patch for 5.13 for myself, using '/tmp' by default, and it seems to work fine on OBS now. Putting toolchain overrides into every out-of-tree module package is still a necessity, though, but that's a separate issue. Not sure if it's possible to make out-of-tree modules somehow inherit toolchain settings from the kernel.

torvic9 commented 3 years ago

Not sure if it's possible to make out-of-tree modules somehow inherit toolchain settings from kernel.

That would be the "correct" way to go in my opinion - but rather on the DKMS side. Is there a command that can extract this information from binaries (readelf?)?

Anyway, if people think that it's worth upstreaming, then I'm going to do so once 5.14 is mainlined.
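On the question above about extracting the toolchain from an existing kernel: one low-tech option is the version banner, which on recent kernels includes the compiler's version string. A minimal sketch, assuming the banner format of /proc/version; compiler_from_banner is a hypothetical helper and the classification is deliberately crude:

```shell
#!/bin/sh
# Sketch: classify a kernel version banner by the compiler it mentions.
# compiler_from_banner is a hypothetical helper.
#   $1: banner text in the format of /proc/version
compiler_from_banner() {
    case $1 in
        *"clang version"*) echo clang ;;
        *gcc*)             echo gcc ;;
        *)                 echo unknown ;;
    esac
}

# Example on a running system:
# compiler_from_banner "$(cat /proc/version)"
```

This only covers the running kernel; for an installed but not running kernel, reading its config is likely the more reliable route.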