Frogging-Family / nvidia-all

Nvidia driver latest to 396 series AIO installer

Fails to load on 5.14.9 Clang LTO TKG Kernel #61

Open C43H66N12O12S2 opened 2 years ago

C43H66N12O12S2 commented 2 years ago

Both the DKMS and regular flavors, on both the stable branch and the Vulkan dev branch, fail to load with this message:

kernel: nvidia: Unknown symbol __x86_indirect_alt_jmp_rax (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_jmp_r8 (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_jmp_rdx (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_jmp_rcx (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_jmp_r10 (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_jmp_r9 (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_jmp_rdi (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_jmp_r11 (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_jmp_rsi (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_jmp_rbx (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_jmp_r12 (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_call_rax (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_call_r8 (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_call_rbx (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_call_r13 (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_call_r9 (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_call_rdx (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_call_rcx (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_call_r10 (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_call_r12 (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_call_r11 (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_call_r15 (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_call_r14 (err -2)
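For anyone comparing their own log against the one above, here is a minimal sketch that collects the unresolved symbols per module from this kind of output. The function name `missing_symbols` and the regular expression are illustrative assumptions based only on the line format shown in this report, not part of nvidia-all or any kernel tooling.

```python
import re
from collections import defaultdict

# Matches lines shaped like the ones in this report, e.g.
#   kernel: nvidia: Unknown symbol __x86_indirect_alt_jmp_rax (err -2)
# Assumption: the module name is the word right before ": Unknown symbol".
UNKNOWN_SYMBOL = re.compile(r"(\w+): Unknown symbol (\S+) \(err (-?\d+)\)")


def missing_symbols(log_text):
    """Map each module name to a sorted list of its unresolved symbols."""
    found = defaultdict(set)
    for line in log_text.splitlines():
        match = UNKNOWN_SYMBOL.search(line)
        if match:
            module, symbol, _err = match.groups()
            found[module].add(symbol)  # a set drops repeated lines
    return {module: sorted(symbols) for module, symbols in found.items()}


sample = """\
kernel: nvidia: Unknown symbol __x86_indirect_alt_jmp_rax (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_call_r8 (err -2)
kernel: nvidia: Unknown symbol __x86_indirect_alt_jmp_rax (err -2)
"""

print(missing_symbols(sample))
```

If every missing symbol shares the `__x86_indirect_alt_` prefix, as it does here, the problem is most likely one missing kernel feature and not a set of unrelated breakages.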

The very same package works just fine with 5.14.8 TKG (also Clang and LTO).

Building the regular version fails with a "different compilers" error, even though both the kernel and the module are built by the exact same compiler and version, so maybe that's a clue?

I'm happy to provide any additional information required.

Tk-Glitch commented 2 years ago

Clang-built kernels aren't really supported, even less so with Nvidia.

C43H66N12O12S2 commented 2 years ago

If anybody else is having this issue, 5.15-rc4 seems to work fine.

C43H66N12O12S2 commented 2 years ago

I've identified the issue. Disabling retpoline (called something like "Avoid speculative indirect branches in kernel" in Architecture options) causes this. @Tk-Glitch you might want to include this information in either nvidia-all or linux-tkg documentation.
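A quick way to check this setting on a given kernel is to look it up in the kernel config. The sketch below assumes the option described above is `CONFIG_RETPOLINE` (the symbol name is not stated in this thread) and that the config text uses the usual `OPTION=value` / `# OPTION is not set` layout. `kconfig_state` is a hypothetical helper name.

```python
import re


def kconfig_state(config_text, option="CONFIG_RETPOLINE"):
    """Return the option's value (e.g. 'y'), 'n' if explicitly not set,
    or None if the option does not appear in the config text at all."""
    set_re = re.compile(r"^" + re.escape(option) + r"=(.*)$")
    unset_line = "# " + option + " is not set"
    for raw in config_text.splitlines():
        line = raw.strip()
        if line == unset_line:
            return "n"
        match = set_re.match(line)
        if match:
            return match.group(1)
    return None


# Hand-written sample fragments, not taken from a real kernel config.
enabled = "CONFIG_X86_64=y\nCONFIG_RETPOLINE=y\n"
disabled = "CONFIG_X86_64=y\n# CONFIG_RETPOLINE is not set\n"

print(kconfig_state(enabled))
print(kconfig_state(disabled))
```

To run it against the running kernel, the config text can be read with `gzip.open("/proc/config.gz", "rt").read()`, assuming the kernel was built with `/proc/config.gz` support. Otherwise, pass in the contents of the `.config` file used for the build.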

flindeberg commented 2 years ago

@C43H66N12O12S2 Can you verify that this is still the case with the latest DKMS (i.e. DKMS >= 3.0.2)?

C43H66N12O12S2 commented 2 years ago

@flindeberg I haven't tested it, but I doubt it's fixed. To be clear, I highly doubt this is a Clang issue, and it's definitely not a DKMS issue: as I stated in the OP, normal kernel modules (without DKMS) failed with the same error. This seems to be more of a quirk of the Nvidia driver, and it depends on one setting in the kernel configuration. Changing that setting solves it for DKMS and non-DKMS builds, Clang and so on.

The reason I've left this issue open is to help anybody else who encounters the same problem. Otherwise, this issue is actually resolved.