dynup / kpatch

kpatch - live kernel patching
GNU General Public License v2.0

Kernel Warning: Unpatched return thunk in use. This should not happen! #1404

Open blitz opened 1 month ago

blitz commented 1 month ago

When livepatching Linux 6.10, I get the warning below from the kernel. This seems to be harmless.

My kernel config: kernel-config.txt
The patch: 0001-XXX-kvm-log-PIT-creation.patch.txt
GCC: 13.2.0

[ 5229.900430] livepatch_7z4lq0wn99kgss3d32qjv1hn15xzl934_0001_XXX_kvm: tainting kernel with TAINT_LIVEPATCH
[ 5229.900929] ------------[ cut here ]------------
[ 5229.900930] Unpatched return thunk in use. This should not happen!
[ 5229.900933] WARNING: CPU: 2 PID: 35461 at arch/x86/kernel/cpu/bugs.c:3023 __warn_thunk+0x2c/0x40
[ 5229.900940] Modules linked in: livepatch_7z4lq0wn99kgss3d32qjv1hn15xzl934_0001_XXX_kvm(OK+) xt_mark ccm rfcomm snd_seq_dummy snd_hrtimer snd_seq qrtr af_packet xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables cmac algif_hash algif_skcipher af_alg bnep msr snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir mt7921e snd_sof_amd_acp mt7921_common snd_sof_pci snd_sof_xtensa_dsp mt792x_lib snd_sof mt76_connac_lib mt76 snd_sof_utils snd_pci_ps snd_amd_sdw_acpi mac80211 snd_hda_codec_realtek soundwire_amd soundwire_generic_allocation soundwire_bus snd_hda_codec_generic snd_hda_scodec_component snd_soc_core snd_hda_codec_hdmi snd_usb_audio snd_compress hid_sensor_als ac97_bus snd_hda_intel uvcvideo snd_pcm_dmaengine hid_sensor_trigger industrialio_triggered_buffer snd_rpl_pci_acp6x snd_intel_dspcfg kfifo_buf snd_acp_pci hid_sensor_iio_common snd_intel_sdw_acpi snd_acp_legacy_common
[ 5229.900970]  industrialio videobuf2_vmalloc snd_hda_codec sch_fq_codel snd_usbmidi_lib cros_usbpd_charger snd_pci_acp6x uvc cfg80211 hid_multitouch hid_sensor_hub intel_rapl_msr btusb snd_ump cros_ec_sysfs snd_pci_acp5x videobuf2_memops cros_ec_chardev snd_hda_core cros_ec_debugfs cros_usbpd_notify cros_usbpd_logger snd_rawmidi btrtl videobuf2_v4l2 gpio_cros_ec btintel amd_pmf snd_rn_pci_acp3x snd_seq_device snd_hwdep nls_iso8859_1 sp5100_tco snd_acp_config videodev btbcm amdtee nls_cp437 snd_pcm watchdog ucsi_acpi snd_soc_acpi amd_sfh vfat typec_ucsi edac_mce_amd btmtk bluetooth edac_core amd_atl intel_rapl_common videobuf2_common cdc_acm snd_timer crc32_pclmul polyval_clmulni rfkill polyval_generic gf128mul fat snd ghash_clmulni_intel crc16 joydev mousedev mc rapl wmi_bmof framework_laptop(O) typec tpm_crb tiny_power_button platform_profile k10temp i2c_piix4 snd_pci_acp3x soundcore libarc4 battery ac thermal i2c_hid_acpi roles loop tpm_tis button tpm_tis_core i2c_hid tun tap amd_pmc evdev macvlan mac_hid serio_raw
[ 5229.901008]  bridge cros_ec_dev stp llc kvm_amd ccp kvm cros_ec_lpcs cros_ec fuse efi_pstore configfs nfnetlink zram efivarfs dmi_sysfs ip_tables x_tables autofs4 xfs libcrc32c crc32c_generic dm_crypt cbc encrypted_keys trusted asn1_encoder tee tpm rng_core libaescfb ecdh_generic ecc hid_generic usbhid hid input_leds led_class nvme atkbd xhci_pci libps2 xhci_pci_renesas vivaldi_fmap thunderbolt crc32c_intel nvme_core sha512_ssse3 sha256_ssse3 sha1_ssse3 xhci_hcd aesni_intel nvme_auth t10_pi crypto_simd cryptd crc64_rocksoft i8042 crc64 crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common rtc_cmos serio amdgpu video wmi backlight amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec gpu_sched drm_suballoc_helper drm_buddy drm_display_helper firmware_class cec dm_mod dax
[ 5229.901041] CPU: 2 PID: 35461 Comm: insmod Tainted: G           O  K    6.10.0 #1-NixOS
[ 5229.901043] Hardware name: Framework Laptop 13 (AMD Ryzen 7040Series)/FRANMDCP07, BIOS 03.05 03/29/2024
[ 5229.901044] RIP: 0010:__warn_thunk+0x2c/0x40
[ 5229.901046] Code: 1f 00 0f 1f 44 00 00 80 3d cb 0f dd 01 00 74 05 c3 cc cc cc cc c6 05 bd 0f dd 01 01 90 48 c7 c7 60 23 d0 85 e8 e5 cf 05 00 90 <0f> 0b 90 90 c3 cc cc cc cc 66 2e 0f 1f 84 00 00 00 00 00 90 90 90
[ 5229.901048] RSP: 0018:ffffa1644414fc98 EFLAGS: 00010286
[ 5229.901049] RAX: 0000000000000000 RBX: ffff8d17435a6ca0 RCX: 0000000000000027
[ 5229.901050] RDX: ffff8d1e9e320988 RSI: 0000000000000001 RDI: ffff8d1e9e320980
[ 5229.901051] RBP: ffffa1644414fce8 R08: 0000000000000000 R09: 0000000000000003
[ 5229.901052] R10: ffffa1644414fb40 R11: ffffffff86533c68 R12: 61c8864680b583eb
[ 5229.901052] R13: ffffa1644414fd30 R14: ffff8d1b1ddae100 R15: ffff8d1bf1cb8d38
[ 5229.901053] FS:  00007f8d2ab0e200(0000) GS:ffff8d1e9e300000(0000) knlGS:0000000000000000
[ 5229.901054] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5229.901055] CR2: 0000000000414870 CR3: 0000000377cf6000 CR4: 0000000000f50ef0
[ 5229.901055] PKRU: 55555554
[ 5229.901056] Call Trace:
[ 5229.901059]  <TASK>
[ 5229.901063]  ? __warn+0x80/0x120
[ 5229.901068]  ? __warn_thunk+0x2c/0x40
[ 5229.901069]  ? report_bug+0x164/0x190
[ 5229.901075]  ? handle_bug+0x3b/0x70
[ 5229.901078]  ? exc_invalid_op+0x17/0x70
[ 5229.901079]  ? asm_exc_invalid_op+0x1a/0x20
[ 5229.901086]  ? __warn_thunk+0x2c/0x40
[ 5229.901087]  ? __warn_thunk+0x2b/0x40
[ 5229.901088]  warn_thunk_thunk+0x1a/0x30
[ 5229.901094]  patch_init+0x8f/0xff0 [livepatch_7z4lq0wn99kgss3d32qjv1hn15xzl934_0001_XXX_kvm]
[ 5229.901096]  ? _note_14+0x19ec/0x19ec [livepatch_7z4lq0wn99kgss3d32qjv1hn15xzl934_0001_XXX_kvm]
[ 5229.901098]  ? do_one_initcall+0x58/0x320
[ 5229.901100]  ? do_init_module+0x90/0x270
[ 5229.901105]  ? init_module_from_file+0x86/0xc0
[ 5229.901107]  ? idempotent_init_module+0x120/0x2b0
[ 5229.901109]  ? __x64_sys_finit_module+0x5e/0xb0
[ 5229.901111]  ? do_syscall_64+0xb2/0x200
[ 5229.901112]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 5229.901115]  </TASK>
[ 5229.901115] ---[ end trace 0000000000000000 ]---
[ 5229.901506] livepatch: enabling patch 'livepatch_7z4lq0wn99kgss3d32qjv1hn15xzl934_0001_XXX_kvm'
[ 5229.903610] livepatch: 'livepatch_7z4lq0wn99kgss3d32qjv1hn15xzl934_0001_XXX_kvm': starting patching transition

tpressure commented 1 month ago

I have the same issue with Linux 6.9.8 and 6.10, even when I simply use the patch from the quick start guide:

$ cat meminfo-string.patch
Index: src/fs/proc/meminfo.c
===================================================================
--- src.orig/fs/proc/meminfo.c
+++ src/fs/proc/meminfo.c
@@ -95,7 +95,7 @@ static int meminfo_proc_show(struct seq_
        "Committed_AS:   %8lu kB\n"
        "VmallocTotal:   %8lu kB\n"
        "VmallocUsed:    %8lu kB\n"
-       "VmallocChunk:   %8lu kB\n"
+       "VMALLOCCHUNK:   %8lu kB\n"
 #ifdef CONFIG_MEMORY_FAILURE
        "HardwareCorrupted: %5lu kB\n"
 #endif

From a quick look at the kernel source, it seems to be related to the retpoline mitigations. I'm not sure how to fix this, though.
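
To narrow down which speculative-execution mitigations are active on an affected machine, one option is to read the status files the kernel exposes under sysfs. The following is a minimal sketch, not part of kpatch: the function name `read_mitigations` is hypothetical, and the choice of `spectre_v2`, `retbleed`, and `spec_rstack_overflow` as the relevant entries is an assumption (older kernels may not provide all of them, in which case the value is reported as `None`).

```python
from pathlib import Path

# Assumption: these sysfs entries are the ones of interest for
# return-thunk related mitigations; adjust the list as needed.
DEFAULT_NAMES = ("spectre_v2", "retbleed", "spec_rstack_overflow")
DEFAULT_BASE = "/sys/devices/system/cpu/vulnerabilities"


def read_mitigations(base=DEFAULT_BASE, names=DEFAULT_NAMES):
    """Return a dict mapping each entry name to its status string.

    Entries that are missing or unreadable map to None.
    """
    result = {}
    for name in names:
        path = Path(base) / name
        try:
            result[name] = path.read_text().strip()
        except OSError:
            # File absent (older kernel, non-Linux system) or unreadable.
            result[name] = None
    return result


if __name__ == "__main__":
    for name, status in read_mitigations().items():
        print(f"{name}: {status if status is not None else '(not available)'}")
```

Attaching this output alongside the kernel config may make it easier to compare affected systems.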

joe-lawrence commented 1 month ago

Hi @blitz, thanks for the report.

Are you running kpatch-build with any out-of-tree patches? I suspect the problem lies with the CONFIG_X86_KERNEL_IBT=y kernel config option. Unfortunately, that is not fully supported by kpatch-build at the moment. (There is a discussion of this in https://github.com/dynup/kpatch/issues/1320.)

Are you using kpatch in a production environment, or looking to learn more about it? If you can turn off CONFIG_X86_KERNEL_IBT, you may have better luck for the time being.
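
Before rebuilding, it can help to confirm whether the option is actually set in the kernel in question. The following is a minimal sketch, not part of kpatch: the helper names `config_option_state` and `load_running_config` are hypothetical, and the config locations tried (`/proc/config.gz`, which requires CONFIG_IKCONFIG_PROC, and `/boot/config-<release>`) are assumptions that do not hold on every distribution. If neither exists, the function can be pointed at the contents of a saved config file such as the attached kernel-config.txt.

```python
import gzip
import os
import platform
import re


def config_option_state(config_text, option):
    """Return the option's value (e.g. 'y' or 'm'), or 'unset'.

    Options that are absent from the config are treated as unset.
    """
    set_re = re.compile(rf"^{re.escape(option)}=(.*)$")
    unset_line = f"# {option} is not set"
    for line in config_text.splitlines():
        line = line.strip()
        match = set_re.match(line)
        if match:
            return match.group(1)
        if line == unset_line:
            return "unset"
    return "unset"


def load_running_config():
    """Return the running kernel's config text, or None if not found.

    Assumption: the config is at one of the two common locations below.
    """
    candidates = ["/proc/config.gz", f"/boot/config-{platform.release()}"]
    for path in candidates:
        if os.path.exists(path):
            opener = gzip.open if path.endswith(".gz") else open
            with opener(path, "rt") as f:
                return f.read()
    return None


if __name__ == "__main__":
    text = load_running_config()
    if text is None:
        print("kernel config not found; pass a saved config file instead")
    else:
        print("CONFIG_X86_KERNEL_IBT:",
              config_option_state(text, "CONFIG_X86_KERNEL_IBT"))
```

Matching on the full option name followed by `=` avoids false positives from longer options that merely share the same prefix.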

github-actions[bot] commented 1 week ago

This issue has been open for 30 days with no activity and no assignee. It will be closed in 7 days unless a comment is added.

blitz commented 1 week ago

Still an issue.

joe-lawrence commented 1 week ago

Hi @blitz, thanks for reopening. Did you ever get to retry this case with CONFIG_X86_KERNEL_IBT unset?