dslm4515 / CMLFS

Clang-Built Musl Linux From Scratch
MIT License

Unable to Compile Kernel #85

Open dslm4515 opened 1 year ago

dslm4515 commented 1 year ago

Kernel 6.3 fails to compile with LLVM (LLVM=1 LLVM_IAS=1 make):

make[3]: 'install_headers' is up to date.
  LD [M]  arch/x86/events/amd/amd-uncore.o
ld.lld: error: arch/x86/events/amd/uncore.o: invalid symbol index
make[4]: *** [scripts/Makefile.build:452: arch/x86/events/amd/amd-uncore.o] Error 1
make[3]: *** [scripts/Makefile.build:494: arch/x86/events/amd] Error 2
make[2]: *** [scripts/Makefile.build:494: arch/x86/events] Error 2
make[1]: *** [scripts/Makefile.build:494: arch/x86] Error 2
make: *** [Makefile:2025: .] Error 2

Let's try GCC & binutils (installed in /opt/gnu):

make[3]: 'install_headers' is up to date.
  LD [M]  arch/x86/events/amd/amd-uncore.o
ld: arch/x86/events/amd/uncore.o: bad reloc symbol index (0x6d >= 0x6d) for offset 0x3 in section `.text'
ld: arch/x86/events/amd/uncore.o: bad reloc symbol index (0x6d >= 0x6d) for offset 0x3 in section `.text'
ld: failed to set dynamic section sizes: bad value
make[4]: *** [scripts/Makefile.build:452: arch/x86/events/amd/amd-uncore.o] Error 1
make[3]: *** [scripts/Makefile.build:494: arch/x86/events/amd] Error 2
make[2]: *** [scripts/Makefile.build:494: arch/x86/events] Error 2
make[1]: *** [scripts/Makefile.build:494: arch/x86] Error 2
make: *** [Makefile:2025: .] Error 2
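
Both linkers appear to be reporting the same condition. Here is a hedged reading of the GNU ld message, assuming the two hex values are the symbol index the relocation refers to and the number of entries in the object's symbol table: index 0x6d points one slot past the end of a 0x6d-entry table. The helper below (`decode_reloc_err` is a made-up name for illustration) only does that arithmetic:

```shell
# Sketch: decode the "(index >= count)" pair from the ld message.
# Assumption: first value = symbol index used by the relocation,
# second value = number of entries in the symbol table.
decode_reloc_err() {
    local idx=$(( $1 )) count=$(( $2 ))
    echo "reloc wants symbol $idx; table holds $count entries (valid 0..$(( count - 1 )))"
}

decode_reloc_err 0x6d 0x6d
```

If that reading is right, uncore.o was written with a symbol table that is one entry short, which points at whatever produced or post-processed the object rather than at the linker.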

Or LLVM+elftoolchain? (LLVM=1 make)

make[3]: 'install_headers' is up to date.
  LD [M]  arch/x86/events/amd/amd-uncore.o
ld.lld: error: arch/x86/events/amd/uncore.o: invalid symbol index
make[4]: *** [scripts/Makefile.build:452: arch/x86/events/amd/amd-uncore.o] Error 1
make[3]: *** [scripts/Makefile.build:494: arch/x86/events/amd] Error 2
make[2]: *** [scripts/Makefile.build:494: arch/x86/events] Error 2
make[1]: *** [scripts/Makefile.build:494: arch/x86] Error 2
make: *** [Makefile:2025: .] Error 2

Maybe a typo or bug in kernel 6.3? Same errors for kernel 6.3.2

BUT kernel 6.1.8 compiles fine (LLVM=1 LLVM_IAS=1 make).

dslm4515 commented 1 year ago

Let's disable CONFIG_PERF_EVENTS_AMD_UNCORE in the kernel config...
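
One way to switch an option off without opening menuconfig is the kernel's own `scripts/config` helper. This is a minimal sketch, assuming it runs from the top of the kernel source tree with an existing `.config`; the `disable_opt` name is hypothetical, and the thread does not say which method was actually used:

```shell
# Sketch: disable one Kconfig option, then let Kconfig re-resolve dependencies.
# Assumes the current directory is the top of the kernel source tree.
disable_opt() {
    # scripts/config accepts option names without the CONFIG_ prefix
    scripts/config --file .config --disable "$1" &&
        make LLVM=1 LLVM_IAS=1 olddefconfig
}

# Usage (not executed here):
#   disable_opt PERF_EVENTS_AMD_UNCORE
```

Running `olddefconfig` afterwards keeps the config consistent if other options depended on the one that was disabled.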

The build gets a bit further, but hits another error:

make[3]: 'install_headers' is up to date.
  AR      arch/x86/built-in.a
  LD [M]  arch/x86/kvm/kvm.o
ld.lld: error: arch/x86/kvm/x86.o: invalid symbol index
make[3]: *** [scripts/Makefile.build:452: arch/x86/kvm/kvm.o] Error 1
make[2]: *** [scripts/Makefile.build:494: arch/x86/kvm] Error 2
make[1]: *** [scripts/Makefile.build:494: arch/x86] Error 2
make: *** [Makefile:2025: .] Error 2

dslm4515 commented 1 year ago

Let's disable virtualization, CONFIG_VIRTUALIZATION...

so far no errors...

... another obstacle:

make[3]: 'install_headers' is up to date.
  LD [M]  crypto/ecdh_generic.o
ld.lld: error: crypto/ecdh.o: invalid symbol index
make[2]: *** [scripts/Makefile.build:452: crypto/ecdh_generic.o] Error 1
make[1]: *** [scripts/Makefile.build:494: crypto] Error 2
make[1]: *** Waiting for unfinished jobs....
  LD [M]  fs/isofs/isofs.o
  LD [M]  fs/squashfs/squashfs.o
ld.lld: error: fs/isofs/inode.o: invalid symbol index
make[3]: *** [scripts/Makefile.build:452: fs/isofs/isofs.o] Error 1
make[2]: *** [scripts/Makefile.build:494: fs/isofs] Error 2
make[2]: *** Waiting for unfinished jobs....
ld.lld: error: fs/squashfs/cache.o: invalid symbol index
make[3]: *** [scripts/Makefile.build:452: fs/squashfs/squashfs.o] Error 1
make[2]: *** [scripts/Makefile.build:494: fs/squashfs] Error 2
make[1]: *** [scripts/Makefile.build:494: fs] Error 2
make: *** [Makefile:2025: .] Error 2

dslm4515 commented 1 year ago

So far ...

6.1.8 -> No errors. Success
6.1.29 -> No errors. Success
6.2.9 -> Fail
6.3.0 -> Fail
6.3.2 -> Fail
6.3.3 -> Fail
6.4-r2 -> Fail

I used the same kernel config for each kernel version, running LLVM=1 make oldconfig each time to make sure the config was compatible with the source tree.

Looks like starting with kernel version 6.2.x, the same errors pop up
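
The version sweep described above could be scripted roughly as follows. This is only a sketch under stated assumptions: each kernel is unpacked into its own directory (names like `linux-6.1.8` are placeholders), `try_kernel` is a hypothetical helper, and it uses `olddefconfig`, the non-interactive variant that takes defaults for new symbols, in place of the `oldconfig` run mentioned above:

```shell
# Sketch: build one kernel tree with a shared config and report pass/fail.
try_kernel() {
    local tree=$1 config=$2
    cp "$config" "$tree/.config" || return 1
    # First adapt the config to this tree, then attempt the full build.
    if make -C "$tree" LLVM=1 olddefconfig >/dev/null 2>&1 &&
       make -C "$tree" LLVM=1 LLVM_IAS=1 -j"$(nproc)" >"$tree/build.log" 2>&1
    then
        echo "$tree -> Success"
    else
        echo "$tree -> Fail"
    fi
}

# Usage (not executed here):
#   for t in linux-6.1.8 linux-6.2.9 linux-6.3.2; do try_kernel "$t" my.config; done
```

Keeping a `build.log` per tree makes it easier to compare where each version first fails.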

takusuman commented 1 year ago

I haven't played around with LLVM in a while, but what about using Clang/Clang++ with GNU Binutils? I know it's not ideal, but it would be a workaround until the LLVM and elftoolchain folks fix this bug.

Funny enough, Linux v6.1.8 compiles fine, per what you've said, so we could also just go with it instead of insisting on other versions. By the way, what about v5.15.103? Does it also compile fine?

dslm4515 commented 1 year ago

That's true. I have tried other combinations but not yet LLVM+Binutils. That will be my next test
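
For reference, a Clang plus GNU ld build can be requested by setting the tools individually instead of using LLVM=1. A minimal sketch, assuming `ld.bfd` is on PATH (in this thread Binutils lives under /opt/gnu, so PATH may need adjusting); `build_clang_bfd` is a made-up wrapper name:

```shell
# Sketch: compile with Clang, link with GNU ld (ld.bfd).
# Extra arguments (targets, -jN) are passed straight through to make.
build_clang_bfd() {
    make CC=clang LD=ld.bfd "$@"
}

# Usage (not executed here):
#   build_clang_bfd -j"$(nproc)"
```

The other tools (assembler, ar, objcopy and so on) keep their defaults in this form, so they may need the same treatment depending on what is installed.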

dslm4515 commented 1 year ago

Same issue with LLVM+Binutils, "invalid symbol index"

cement-drinker commented 1 year ago

is it an issue with the linker?

dslm4515 commented 1 year ago

> is it an issue with the linker?

If kernel version 6.1.29 compiles fine, I would like to say no... unless something in the kernel source changed starting with 6.2.x and later.

cement-drinker commented 1 year ago

Kernel updates are usually big, so maybe something did. Let's first try a different linker (e.g. mold), and if that doesn't work, follow Alpine Linux's approach.

cement-drinker commented 1 year ago

Update: mold requires an SSL library for libcrypto, and that doesn't need to be in a toolchain. Probably compile LibreSSL and Clang first, then compile mold, then proceed with the rest of stage2.

dslm4515 commented 1 year ago

Wow. Never heard of mold... Ha, I guess I might consider replacing LLD with mold.

Let's just say I'm not a fan of Rust, so I'm impressed mold is coded in C++.

cement-drinker commented 1 year ago

Mold might not be the reason this is happening. There are some patches in Chimera Linux for the latest kernel, so let's try those.

takusuman commented 1 year ago

> Wow. Never heard of mold... Ha, I guess I might consider replacing LLD with mold.
>
> Let's just say I'm not a fan of Rust, so I'm impressed mold is coded in C++.

I'd also propose that we just use GNU Binutils' ld(1) instead of LLVM's lld(1), like https://github.com/ClangBuiltLinux/tc-build does.

takusuman commented 1 year ago

> Mold might not be the reason this is happening. There are some patches in Chimera Linux for the latest kernel, so let's try those.

Yeah, checking for Chimera patches might also be a good idea.

dslm4515 commented 1 year ago

Looks like mold isn't supported by the kernel source:

mold: unknown linker
scripts/Kconfig.include:56: Sorry, this linker is not supported.

dslm4515 commented 1 year ago

Per kisslinux.org:

It can not be used to link the Linux kernel (due to lack of linker script support)

dslm4515 commented 1 year ago

I only see kernel 6.1.38 for Chimera's linux-lts kernel and 6.1.32 for the linux-rpi kernel ... in the github repo

takusuman commented 1 year ago

> I only see kernel 6.1.38 for Chimera's linux-lts kernel and 6.1.32 for the linux-rpi kernel ... in the github repo

Well, what linker are they using? I suggested compiling with Clang and then linking with GNU's ld.bfd, unless 6.1.38 already supports LLVM's lld natively 👀.

By the way, how is the work on using BSD's elftoolchain going? I've read about it in Musl-LFS's enhancement issues.

dslm4515 commented 1 year ago

I've already tried LLD and ld.bfd... same problem. In fact, I cannot compile recent kernels (starting with 6.2.x) with either GCC or Clang. My GCC is installed in /opt/gnu but compiles other packages just fine (so it's not broken?).

I have yet to try a 6.2.x+ kernel build on an older MLFS system.

Last I checked, Chimera Linux still used LLD.

dslm4515 commented 1 year ago

Looks like Chimera Linux uses 6.4.x for their kernel-stable... not sure how I missed it. I was only seeing their build for kernel-LTS. There are lots of patches used for the 6.4.x kernel source

dslm4515 commented 1 year ago

Successfully compiled kernel-6.4.16 with just a single patch from Chimera Linux: 0001-fix-gelf_update_symshndx-with-elftoolchain.patch

Although I wonder whether the patch really helped, or whether something changed again starting with kernel 6.4.x.
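
For anyone wanting to repeat this, applying a single patch to a kernel tree usually looks like the sketch below. It assumes the patch file was downloaded separately and is given as an absolute path; `apply_kernel_patch` and the example paths are placeholders, not the exact commands used here:

```shell
# Sketch: apply one patch from inside the kernel source tree.
apply_kernel_patch() {
    local tree=$1 patchfile=$2
    # -N: skip hunks that look already applied
    # -p1: strip the leading a/ or b/ path component
    ( cd "$tree" && patch -Np1 -i "$patchfile" )
}

# Usage (not executed here):
#   apply_kernel_patch linux-6.4.16 "$PWD/0001-fix-gelf_update_symshndx-with-elftoolchain.patch"
```

Building the same tree once with and once without the patch would answer whether it is what made the difference.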

dslm4515 commented 1 year ago

Going to compile 6.5.4 without patching ....

Looks like that chimera-linux patch fixed those warnings when compiling:

warning: objtool: gelf_update_symshndx: Invalid argument
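
To see whether a given build still emits warnings like this, the build output can be saved and filtered. A small sketch, assuming the output was captured to a file (`build.log` and `objtool_warnings` are placeholder names):

```shell
# Sketch: list each distinct objtool message found in a saved build log.
objtool_warnings() {
    grep -o 'objtool: .*' "$1" | sort -u
}

# Usage (not executed here):
#   make LLVM=1 LLVM_IAS=1 2>&1 | tee build.log
#   objtool_warnings build.log
```

An empty result means no objtool messages were logged for that build.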

dslm4515 commented 1 year ago

Kernel 6.5.4 compiled without errors, using no patches from Chimera Linux.

I am closing this unless someone needs to use kernels 6.2.x and 6.3.x

dslm4515 commented 6 months ago

So far, I have stuck with kernel 6.5.9, as it compiles fine under my previous CMLFS build (LLVM-15.0.5). I built another CMLFS system, but this time with LLVM-17.0.6 (and updated packages). Now kernel-6.5.9 won't compile.

I checked Chimera Linux. At the time of this writing, Chimera is on kernel 6.8.6. I downloaded that kernel and the patches from Chimera. The kernel appears to compile fine, but the build fails when linking vmlinux.o:

  LD      vmlinux.o
vmlinux.o: warning: objtool: elf_getdata: Invalid section descriptor
make[2]: *** [scripts/Makefile.vmlinux_o:52: vmlinux.o] Error 1
make[2]: *** Deleting file 'vmlinux.o'
make[1]: *** [/sources/chimera-6.8.6/linux-6.8.6/Makefile:1142: vmlinux_o] Error 2
make: *** [Makefile:240: __sub-make] Error 2

dslm4515 commented 6 months ago

Issue fixed: LTO turned off. Now vmlinux.o links fine... when using Chimera's default x86_64 config
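
The thread does not say how LTO was switched off. One possible way, sketched here under the assumption that the mainline Kconfig symbol names (`LTO_NONE`, `LTO_CLANG_FULL`, `LTO_CLANG_THIN`) apply to this tree, is again `scripts/config`; `disable_lto` is a hypothetical helper name:

```shell
# Sketch: select "no LTO" in an existing .config.
# Assumes the current directory is the top of the kernel source tree.
disable_lto() {
    scripts/config --file .config \
        --disable LTO_CLANG_THIN \
        --disable LTO_CLANG_FULL \
        --enable LTO_NONE &&
        make LLVM=1 olddefconfig
}

# Usage (not executed here):
#   disable_lto && make LLVM=1 LLVM_IAS=1
```

Checking the resulting `.config` for `CONFIG_LTO_NONE=y` confirms the change took effect before starting a long build.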