lkrg-org / lkrg

Linux Kernel Runtime Guard
https://lkrg.org

respect IRQ state #339

Closed m1lua closed 3 months ago

m1lua commented 3 months ago

Description

On loaded SMP systems, this patch introduces safe behavior while IRQs are disabled.

How Has This Been Tested?

This prevents the Oops and the false-positive "LKRG: ALERT: DETECT: Task: 'off' flag corruption for pid 5192, name kvm" alert, at least on the latest kernel.

solardiz commented 3 months ago

Thank you @m1lua. Can you please share the Oops you were getting and explain what exactly was unsafe before and how exactly this patch makes it safe?

m1lua commented 3 months ago

Sure.

Oops:

7246   │ 2,4907,12869940624,-;LKRG: ALERT: BLOCK: Module: Loading of module name tls
7247   │ 2,4908,12879097179,-;LKRG: ALERT: BLOCK: Module: Loading of module name spl
7248   │ 2,4909,12879171785,-;LKRG: ALERT: DETECT: Task: 'off' flag corruption for pid 8286, name kvm
7249   │ 2,4910,12879172473,-;LKRG: ALERT: BLOCK: Task: Killing pid 2461478442, name \x15Hx\x92\x01$&\xa6\xbd\x91\xf2\x88\xd5\x08\x16\x13r\xe6\x0c\xc9\xcc\xe46R\x1f^\xa2\x99\xfe@\xefZ\xb2\xd1\xb4\xc6\xfaj\xed\xe23\xd3\x95\xb1I\xd8\xc6\x
       │ b7ze#W\xf4\x9b\x84\x8c2\xd2K\xae\x15\xe4?\xed\x88\x9bZ\x865\xe5:\x1c\x97Z\xe30w\x95`\xad\x0f\x95\xad\x1c>`\x93\xa8\x18\x0e=\x5c\x9f\xb4\xdf\x88H\xf6\x08\xcb\xcc\xd8\xa6\xbe(\x11\x1b4\x0a+\xaa\x89\x5cMmSd|\xe4\x06R\xc3\xd5\x15y\
       │ xd2\xcf"m/\x813/W \xfe\xf8^KT|\xe4-\xdc\x14\x9aY\x9b\xd3D\xb6\xcb\x055\xc7\xd0\xe8\xca6W@\x9b\x92\xb7\xf8U \x83\xcbG\xe5*l\xa6\xc7\xf6-l\x9a\x11B\xe0'\x91\xd4P\x8aZ=\xaax[1\x86\xac\xf4\x0e~\x0d\xae\x9cM+R`$\x9a\xe37\x16g\x9f\x0
       │ 4\x16\x0e\xce\x0e\x1eo\x12\xa1+^\xd8+\x19>\xc6\x9b\xe9\xb9\x064\xb0\xc4.+\x98\xf9\x9dG)\xb5\xf5\x5c=k\x83>\xfc\xd3\x85\x7fZX\xac'U\xfc+\xfb\xdaI\xfa)K<-\xab'\xb8Lb\xdc\xd5\xa6\x14u\xb7\xa1\xc5\xf1'\xd9P_\xda\x11\xe7\xa1q\x10(\x
       │ 85\x0e\x83\xce/\xd9\xc1x28\xe8\x96L$\xbdt\x9e\xf8\x95\xee\xdf?\x93\x7f\xf1H\xc9\x05\xb1\xb30\xc1_\x0c\xc3h\xd0\xcfCmV\x85/\xc2`\xaaC\xce`4(\x07\x86\xe2l\x81\xf5\xf1\xa1K.\x89\xacm]\xf7a]\xd8\x9fv\x8c\xd1\x80X\x9c\xa6\xf20\xd8/\
       │ x8ei\x02\xb4\xf2\xa3\x16\xa0\xd2\x82\x05\x0ee~\xf3\x9do\xc3\xd3-3q\xc8]\x9b\x93\xb9\xa3\xb1\xf6\x97\xe9\xbf\xfc2\x9fC\x8a\xd4\xf5\x1d\xa7+4S\x81\xeaM\xc4\xfd\xf9\xf8\xe7AD6W\x16\xce\xf5\xdbI\xad\xf4Fz8\xca\x1d<\xf7\xdb\xd9\xbb\
       │ xd7=\xb65\xceT\xdcSNP\x05\x93\x91[\x8a\xd9hRw\x94\xea\x81<\x8c}\xcb<\x08\xdb\x80k8k\x80\xa62\x9c\xb2q\xd8\xab_\x06\x19\xe8\xe1\xea\x02gUf\x81% \xf3o\xb4Q\x9e:\x8f\xa6\xb6O"\x07X\x8b\x8f\x93-x\xceQ^\x8f\xa7\xf8\xfe\x86\x85{WD\xf
       │ 9l%w>\xddX\xe7\xa2\xaf\x0b\xe8\x8f\xe4t\x8cQ\xd5L\x88\xb4@\xf7\xcf\xc3\xb7\xa9\x8d\x9e\x96\xa7\x91\xa4f\xcdQ\x8b\xd4\x86N]\xc9\xb7l\xde\xa0\x19
7250   │ 4,4911,12879173873,-;Oops: general protection fault, probably for non-canonical address 0xd8e1a48104ea265c: 0000 [#1] PREEMPT SMP NOPTI
7251   │ 4,4912,12879174339,-;CPU: 2 PID: 8286 Comm: modprobe Tainted: P           O    T  6.10.0-rc4-secint+ #5
7252   │ 4,4913,12879174806,-;Hardware name: Supermicro Super Server/H11DSU-iN, BIOS [REDACTED]
7253   │ 4,4914,12879175278,-;RIP: 0010:_raw_spin_lock_irqsave+0x2c/0x80
7254   │ 4,4915,12879175755,-;Code: 44 00 00 55 48 89 e5 41 54 53 48 89 fb 9c 58 0f 1f 40 00 49 89 c4 fa 0f 1f 44 00 00 65 ff 05 33 22 0b 43 31 c0 ba 01 00 00 00 <f0> 0f b1 13 75 20 4c 89 e0 5b 41 5c 5d 31 d2 31 c9 31 f6 31 ff 45
7255   │ 4,4916,12879176269,-;RSP: 0018:ffffb70598833b30 EFLAGS: 00010046
7256   │ 4,4917,12879176791,-;RAX: 0000000000000000 RBX: d8e1a48104ea265c RCX: 0000000000000000
7257   │ 4,4918,12879177326,-;RDX: 0000000000000001 RSI: 0000000000000001 RDI: d8e1a48104ea265c
7258   │ 4,4919,12879177868,-;RBP: ffffb70598833b40 R08: 0000000000000000 R09: 0000000000000000
7259   │ 4,4920,12879178408,-;R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000086
7260   │ 4,4921,12879178950,-;R13: d8e1a48104ea265c R14: 0000000000000001 R15: 0000000000000000
7261   │ 4,4922,12879179496,-;FS:  0000000000000000(0000) GS:ffff8db3eec00000(0000) knlGS:0000000000000000
7262   │ 4,4923,12879180052,-;CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
7263   │ 4,4924,12879180611,-;CR2: 0000797bd36b3ccc CR3: 00008020dee96000 CR4: 00000000003506f0
7264   │ 4,4925,12879181175,-;Call Trace:
7265   │ 4,4926,12879181739,-; <TASK>
7266   │ 4,4927,12879182303,-; ? show_regs+0x6c/0x80
7267   │ 4,4928,12879182871,-; ? die_addr+0x37/0xa0
7268   │ 4,4929,12879183437,-; ? exc_general_protection+0x1d2/0x400
7269   │ 4,4930,12879184016,-; ? asm_exc_general_protection+0x27/0x30
7270   │ 4,4931,12879184592,-; ? _raw_spin_lock_irqsave+0x2c/0x80
7271   │ 4,4932,12879185168,-; do_send_sig_info+0x3b/0xc0
7272   │ 4,4933,12879185733,-; ? security_bprm_committing_creds+0x1/0x50
7273   │ 4,4934,12879186280,-; send_sig_info+0x19/0x40
7274   │ 4,4935,12879186804,-; p_set_ed_process_off+0x173/0x350 [lkrg]
7275   │ 4,4936,12879187342,-; p_security_bprm_committing_creds_entry+0x7c/0xc0 [lkrg]
7276   │ 4,4937,12879187872,-; pre_handler_kretprobe+0x3f/0xa0
7277   │ 4,4938,12879188409,-; kprobe_ftrace_handler+0x157/0x1f0
7278   │ 4,4939,12879188932,-; ? security_bprm_committing_creds+0x5/0x50
7279   │ 4,4940,12879189440,-; 0xffffffffc21910f5
7280   │ 4,4941,12879189969,-;RIP: 0010:security_bprm_committing_creds+0x1/0x50
7281   │ 4,4942,12879190535,-;Code: 48 85 c0 75 df 31 c0 5b 41 5c 5d 31 ff e9 27 bf ae 00 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 e8 <8b> 78 cf 05 55 48 89 e5 41 54 53 48 8b 05 8d c8 43 01 48 85 c0 74
7282   │ 4,4943,12879191110,-;RSP: 0018:ffffb70598833d40 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
7283   │ 4,4944,12879191676,-;RAX: 0000000000000000 RBX: ffff8d9478dfd400 RCX: 0000000000000000
7284   │ 4,4945,12879192257,-;RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8d9478e01000
7285   │ 4,4946,12879192864,-;RBP: ffffb70598833d88 R08: 0000000000000000 R09: 0000000000000000
7286   │ 4,4947,12879193453,-;R10: 0000000000000000 R11: 0000000000000000 R12: ffff8d9478e01000
7287   │ 4,4948,12879194019,-;R13: ffff8d9479dae026 R14: 0000000000000000 R15: 0000000000000000
7288   │ 4,4949,12879194589,-; ? security_bprm_committing_creds+0x5/0x50
7289   │ 4,4950,12879195205,-; ? begin_new_exec+0x6c5/0xbb0
7290   │ 4,4951,12879195825,-; ? security_bprm_committing_creds+0x5/0x50
7291   │ 4,4952,12879196388,-; ? begin_new_exec+0x6c5/0xbb0
7292   │ 4,4953,12879196948,-; load_elf_binary+0x326/0x1640
7293   │ 4,4954,12879197561,-; bprm_execve+0x24c/0x670
7294   │ 4,4955,12879198184,-; kernel_execve+0x149/0x1b0
7295   │ 4,4956,12879198761,-; call_usermodehelper_exec_async+0xd6/0x190
7296   │ 4,4957,12879199333,-; ? __pfx_call_usermodehelper_exec_async+0x10/0x10
7297   │ 4,4958,12879199912,-; osnoise_arch_unregister+0x220/0x220
7298   │ 4,4959,12879200506,-; ? __pfx_call_usermodehelper_exec_async+0x10/0x10
7299   │ 4,4960,12879201098,-; ret_from_fork_asm+0x1a/0x30
7300   │ 4,4961,12879201675,-; </TASK>
7301   │ 4,4962,12879202241,-;Modules linked in: nf_tables nfnetlink iptable_filter iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_tables x_tables vhost_net vhost vhost_iotlb tap intel_rapl_msr intel_rapl
       │ _common kvm_amd kvm crct10dif_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd ipmi_ssif rapl efi_pstore pcspkr input_leds ast i2c_algo_bit acpi_ipmi ccp k10temp 
       │ ipmi_si ipmi_devintf ipmi_msghandler mac_hid btrfs blake2b_generic xor raid6_pq hid_generic usbmouse usbkbd usbhid uas hid usb_storage dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c mpt3sas xhci_pci xhci_pci_r
       │ enesas nvme raid_class ahci crc32_pclmul scsi_transport_sas bnx2 xhci_hcd libahci nvme_core i2c_piix4 nvme_auth [last unloaded: spl(O)]
7302   │ 0,4963,12879204897,-;Dumping ftrace buffer:
7303   │ 0,4964,12879205581,-;   (ftrace buffer empty)
7304   │ 4,4965,12879206256,-;---[ end trace 0000000000000000 ]---
7305   │ 4,4966,12879302252,-;RIP: 0010:_raw_spin_lock_irqsave+0x2c/0x80
7306   │ 4,4967,12879303005,-;Code: 44 00 00 55 48 89 e5 41 54 53 48 89 fb 9c 58 0f 1f 40 00 49 89 c4 fa 0f 1f 44 00 00 65 ff 05 33 22 0b 43 31 c0 ba 01 00 00 00 <f0> 0f b1 13 75 20 4c 89 e0 5b 41 5c 5d 31 d2 31 c9 31 f6 31 ff 45
7307   │ 4,4968,12879303749,-;RSP: 0018:ffffb70598833b30 EFLAGS: 00010046
7308   │ 4,4969,12879304481,-;RAX: 0000000000000000 RBX: d8e1a48104ea265c RCX: 0000000000000000
7309   │ 4,4970,12879305213,-;RDX: 0000000000000001 RSI: 0000000000000001 RDI: d8e1a48104ea265c
7310   │ 4,4971,12879305950,-;RBP: ffffb70598833b40 R08: 0000000000000000 R09: 0000000000000000
7311   │ 4,4972,12879306687,-;R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000086
7312   │ 4,4973,12879307430,-;R13: d8e1a48104ea265c R14: 0000000000000001 R15: 0000000000000000
7313   │ 4,4974,12879308171,-;FS:  0000000000000000(0000) GS:ffff8db3eec00000(0000) knlGS:0000000000000000
7314   │ 4,4975,12879308920,-;CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
7315   │ 4,4976,12879309668,-;CR2: 0000797bd36b3ccc CR3: 00008020dee96000 CR4: 00000000003506f0
7316   │ 6,4977,12879310428,-;note: modprobe[8286] exited with irqs disabled
7317   │ 6,4978,12879311264,-;note: modprobe[8286] exited with preempt_count 3
7318   │ 2,4979,12879345699,-;LKRG: ALERT: BLOCK: Module: Loading of module name spl
7319   │ 2,4980,12879791635,-;LKRG: ALERT: BLOCK: Module: Loading of module name tls
7320   │ 2,4981,12889073689,-;LKRG: ALERT: BLOCK: Module: Loading of module name spl
7321   │ 2,4982,12889153458,-;LKRG: ALERT: BLOCK: Module: Loading of module name spl
7322   │ 2,4983,12889234070,-;LKRG: ALERT: BLOCK: Module: Loading of module name spl
7323   │ 2,4984,12889543041,-;LKRG: ALERT: BLOCK: Module: Loading of module name tls
7324   │ 0,4985,12915286555,-;watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [kworker/u77:3:8151]
7325   │ 4,4986,12915288121,-;Modules linked in: nf_tables nfnetlink iptable_filter iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_tables x_tables vhost_net vhost vhost_iotlb tap intel_rapl_msr intel_rapl
       │ _common kvm_amd kvm crct10dif_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd ipmi_ssif rapl efi_pstore pcspkr input_leds ast i2c_algo_bit acpi_ipmi ccp k10temp 
       │ ipmi_si ipmi_devintf ipmi_msghandler mac_hid btrfs blake2b_generic xor raid6_pq hid_generic usbmouse usbkbd usbhid uas hid usb_storage dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c mpt3sas xhci_pci xhci_pci_r
       │ enesas nvme raid_class ahci crc32_pclmul scsi_transport_sas bnx2 xhci_hcd libahci nvme_core i2c_piix4 nvme_auth [last unloaded: spl(O)]
7326   │ 4,4987,12915292056,-;CPU: 12 PID: 8151 Comm: kworker/u77:3 Tainted: P      D    O    T  6.10.0-rc4-secint+ #5
7327   │ 4,4988,12915292958,-;Hardware name: Supermicro Super Server/H11DSU-iN, BIOS [REDACTED]
7328   │ 4,4989,12915293867,-;Workqueue: events_unbound p_check_integrity [lkrg]
7329   │ 4,4990,12915294744,-;RIP: 0010:queued_read_lock_slowpath+0x55/0x150
7330   │ 4,4991,12915295604,-;Code: 00 4c 8d 63 04 31 c0 ba 01 00 00 00 f0 0f b1 53 04 0f 85 fe 00 00 00 f0 81 03 00 02 00 00 8b 03 84 c0 74 08 f3 90 8b 03 84 c0 <75> f8 4c 89 e7 c6 07 00 0f 1f 00 66 90 5b 41 5c 5d 31 c0 31 d2 31
7331   │ 4,4992,12915296474,-;RSP: 0018:ffffb7059862bd58 EFLAGS: 00000286
7332   │ 4,4993,12915297389,-;RAX: 00000000000002ff RBX: ffffffffc2239220 RCX: 0000000000000000
7333   │ 4,4994,12915298284,-;RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffffc2239220
7334   │ 4,4995,12915299176,-;RBP: ffffb7059862bd68 R08: 0000000000000000 R09: 0000000000000000
7335   │ 4,4996,12915300073,-;R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffc2239224
7336   │ 4,4997,12915300915,-;R13: ffff8d94a0aaa298 R14: ffff8d94a0aa8000 R15: 0000000000000001
7337   │ 4,4998,12915301749,-;FS:  0000000000000000(0000) GS:ffff8db3ee200000(0000) knlGS:0000000000000000
7338   │ 4,4999,12915302592,-;CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
7339   │ 4,5000,12915303427,-;CR2: 000055abe2000f60 CR3: 000080203eb1a000 CR4: 00000000003506f0
7340   │ 4,5001,12915304252,-;Call Trace:
7341   │ 4,5002,12915305082,-; <IRQ>
7342   │ 4,5003,12915305892,-; ? show_regs+0x6c/0x80
7343   │ 4,5004,12915306704,-; ? watchdog_timer_fn+0x218/0x2a0
7344   │ 4,5005,12915307506,-; ? __pfx_watchdog_timer_fn+0x10/0x10
7345   │ 4,5006,12915308301,-; ? __hrtimer_run_queues+0x108/0x280
7346   │ 4,5007,12915309105,-; ? clockevents_program_event+0xba/0x150
7347   │ 4,5008,12915309911,-; ? hrtimer_interrupt+0xf8/0x250
7348   │ 4,5009,12915310715,-; ? __sysvec_apic_timer_interrupt+0x59/0x150
7349   │ 4,5010,12915311519,-; ? sysvec_apic_timer_interrupt+0x9b/0xc0
7350   │ 4,5011,12915312290,-; </IRQ>
7351   │ 4,5012,12915313027,-; <TASK>
7352   │ 4,5013,12915313756,-; ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
7353   │ 4,5014,12915314503,-; ? queued_read_lock_slowpath+0x55/0x150
7354   │ 4,5015,12915315240,-; _raw_read_lock+0x2f/0x40
7355   │ 4,5016,12915315971,-; p_ed_enforce_validation_paranoid+0xba/0x3a0 [lkrg]
7356   │ 4,5017,12915316727,-; p_check_integrity+0x432/0x1a10 [lkrg]
7357   │ 4,5018,12915317473,-; process_one_work+0x184/0x3e0
7358   │ 4,5019,12915318211,-; worker_thread+0x2e4/0x410
7359   │ 4,5020,12915318932,-; ? __pfx_worker_thread+0x10/0x10
7360   │ 4,5021,12915319658,-; kthread+0xe7/0x120
7361   │ 4,5022,12915320384,-; ? __pfx_kthread+0x10/0x10
7362   │ 4,5023,12915321115,-; ret_from_fork+0x47/0x70
7363   │ 4,5024,12915321837,-; ? __pfx_kthread+0x10/0x10
7364   │ 4,5025,12915322556,-; ret_from_fork_asm+0x1a/0x30
7365   │ 4,5026,12915323279,-; </TASK>
7366   │ 0,5027,12935280555,-;watchdog: BUG: soft lockup - CPU#8 stuck for 26s! [kworker/u77:0:967]
7367   │ 4,5028,12935282294,-;Modules linked in: nf_tables nfnetlink iptable_filter iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_tables x_tables vhost_net vhost vhost_iotlb tap intel_rapl_msr intel_rapl
       │ _common kvm_amd kvm crct10dif_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd ipmi_ssif rapl efi_pstore pcspkr input_leds ast i2c_algo_bit acpi_ipmi ccp k10temp 
       │ ipmi_si ipmi_devintf ipmi_msghandler mac_hid btrfs blake2b_generic xor raid6_pq hid_generic usbmouse usbkbd usbhid uas hid usb_storage dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c mpt3sas xhci_pci xhci_pci_r
       │ enesas nvme raid_class ahci crc32_pclmul scsi_transport_sas bnx2 xhci_hcd libahci nvme_core i2c_piix4 nvme_auth [last unloaded: spl(O)]
7368   │ 4,5029,12935286462,-;CPU: 8 PID: 967 Comm: kworker/u77:0 Tainted: P      D    O L  T  6.10.0-rc4-secint+ #5
7369   │ 4,5030,12935287347,-;Hardware name: Supermicro Super Server/H11DSU-iN, BIOS [REDACTED]
7370   │ 4,5031,12935288200,-;Workqueue: events_unbound p_check_integrity [lkrg]
7371   │ 4,5032,12935289066,-;RIP: 0010:native_queued_spin_lock_slowpath+0x79/0x2d0
7372   │ 4,5033,12935289923,-;Code: 00 00 f0 0f ba 2b 08 0f 92 c2 8b 03 0f b6 d2 c1 e2 08 30 e4 09 d0 3d ff 00 00 00 77 66 85 c0 74 10 0f b6 03 84 c0 74 09 f3 90 <0f> b6 03 84 c0 75 f7 b8 01 00 00 00 66 89 03 5b 41 5c 41 5d 41 5e
7373   │ 4,5034,12935290808,-;RSP: 0018:ffffb70598583d28 EFLAGS: 00000202
7374   │ 4,5035,12935291693,-;RAX: 0000000000000001 RBX: ffffffffc2239224 RCX: 0000000000000000
7375   │ 4,5036,12935292537,-;RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffffc2239224
7376   │ 4,5037,12935293357,-;RBP: ffffb70598583d48 R08: 0000000000000000 R09: 0000000000000000
7377   │ 4,5038,12935294173,-;R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffc2239224
7378   │ 4,5039,12935294980,-;R13: ffff8d94a0aaa298 R14: ffff8d94a0aa8000 R15: 0000000000000001
7379   │ 4,5040,12935295778,-;FS:  0000000000000000(0000) GS:ffff8db3ee600000(0000) knlGS:0000000000000000
7380   │ 4,5041,12935296585,-;CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
7381   │ 4,5042,12935297393,-;CR2: 0000561ac430b9c0 CR3: 000080203eb1a000 CR4: 00000000003506f0
7382   │ 4,5043,12935298199,-;Call Trace:
7383   │ 4,5044,12935298994,-; <IRQ>
7384   │ 4,5045,12935299776,-; ? show_regs+0x6c/0x80
7385   │ 4,5046,12935300562,-; ? watchdog_timer_fn+0x218/0x2a0
7386   │ 4,5047,12935301350,-; ? __pfx_watchdog_timer_fn+0x10/0x10
7387   │ 4,5048,12935302125,-; ? __hrtimer_run_queues+0x108/0x280
7388   │ 4,5049,12935302904,-; ? clockevents_program_event+0xba/0x150
7389   │ 4,5050,12935303683,-; ? hrtimer_interrupt+0xf8/0x250
7390   │ 4,5051,12935304458,-; ? __sysvec_apic_timer_interrupt+0x59/0x150
7391   │ 4,5052,12935305248,-; ? sysvec_apic_timer_interrupt+0x9b/0xc0
7392   │ 4,5053,12935306032,-; </IRQ>
7393   │ 4,5054,12935306804,-; <TASK>
7394   │ 4,5055,12935307545,-; ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
7395   │ 4,5056,12935308280,-; ? native_queued_spin_lock_slowpath+0x79/0x2d0
7396   │ 4,5057,12935308984,-; queued_read_lock_slowpath+0x14a/0x150
7397   │ 4,5058,12935309674,-; _raw_read_lock+0x2f/0x40
7398   │ 4,5059,12935310349,-; p_ed_enforce_validation_paranoid+0xba/0x3a0 [lkrg]
7399   │ 4,5060,12935311033,-; p_check_integrity+0x432/0x1a10 [lkrg]
7400   │ 4,5061,12935311708,-; process_one_work+0x184/0x3e0
7401   │ 4,5062,12935312354,-; worker_thread+0x2e4/0x410
7402   │ 4,5063,12935312981,-; ? __pfx_worker_thread+0x10/0x10
7403   │ 4,5064,12935313585,-; kthread+0xe7/0x120
7404   │ 4,5065,12935314187,-; ? __pfx_kthread+0x10/0x10
7405   │ 4,5066,12935314785,-; ret_from_fork+0x47/0x70
7406   │ 4,5067,12935315382,-; ? __pfx_kthread+0x10/0x10
7407   │ 4,5068,12935315973,-; ret_from_fork_asm+0x1a/0x30
7408   │ 4,5069,12935316565,-; </TASK>
7409   │ 0,5070,12943286553,-;watchdog: BUG: soft lockup - CPU#12 stuck for 49s! [kworker/u77:3:8151]
7410   │ 4,5071,12943288190,-;Modules linked in: nf_tables nfnetlink iptable_filter iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_tables x_tables vhost_net vhost vhost_iotlb tap intel_rapl_msr intel_rapl
       │ _common kvm_amd kvm crct10dif_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd ipmi_ssif rapl efi_pstore pcspkr input_leds ast i2c_algo_bit acpi_ipmi ccp k10temp 
       │ ipmi_si ipmi_devintf ipmi_msghandler mac_hid btrfs blake2b_generic xor raid6_pq hid_generic usbmouse usbkbd usbhid uas hid usb_storage dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c mpt3sas xhci_pci xhci_pci_r
       │ enesas nvme raid_class ahci crc32_pclmul scsi_transport_sas bnx2 xhci_hcd libahci nvme_core i2c_piix4 nvme_auth [last unloaded: spl(O)]
7411   │ 4,5072,12943291797,-;CPU: 12 PID: 8151 Comm: kworker/u77:3 Tainted: P      D    O L  T  6.10.0-rc4-secint+ #5
7412   │ 4,5073,12943292648,-;Hardware name: Supermicro Super Server/H11DSU-iN, BIOS [REDACTED]
7413   │ 4,5074,12943293489,-;Workqueue: events_unbound p_check_integrity [lkrg]
7414   │ 4,5075,12943294349,-;RIP: 0010:queued_read_lock_slowpath+0x51/0x150
7415   │ 4,5076,12943295194,-;Code: 0f 1f 44 00 00 4c 8d 63 04 31 c0 ba 01 00 00 00 f0 0f b1 53 04 0f 85 fe 00 00 00 f0 81 03 00 02 00 00 8b 03 84 c0 74 08 f3 90 <8b> 03 84 c0 75 f8 4c 89 e7 c6 07 00 0f 1f 00 66 90 5b 41 5c 5d 31
7416   │ 4,5077,12943296059,-;RSP: 0018:ffffb7059862bd58 EFLAGS: 00000286
7417   │ 4,5078,12943296926,-;RAX: 00000000000002ff RBX: ffffffffc2239220 RCX: 0000000000000000
7418   │ 4,5079,12943297787,-;RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffffc2239220
7419   │ 4,5080,12943298646,-;RBP: ffffb7059862bd68 R08: 0000000000000000 R09: 0000000000000000
7420   │ 4,5081,12943299520,-;R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffc2239224
7421   │ 4,5082,12943300391,-;R13: ffff8d94a0aaa298 R14: ffff8d94a0aa8000 R15: 0000000000000001
7422   │ 4,5083,12943301237,-;FS:  0000000000000000(0000) GS:ffff8db3ee200000(0000) knlGS:0000000000000000
7423   │ 4,5084,12943302114,-;CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
7424   │ 4,5085,12943302981,-;CR2: 000055abe2000f60 CR3: 000080203eb1a000 CR4: 00000000003506f0
7425   │ 4,5086,12943303845,-;Call Trace:
7426   │ 4,5087,12943304709,-; <IRQ>
7427   │ 4,5088,12943305567,-; ? show_regs+0x6c/0x80
7428   │ 4,5089,12943306432,-; ? watchdog_timer_fn+0x218/0x2a0
7429   │ 4,5090,12943307303,-; ? __pfx_watchdog_timer_fn+0x10/0x10
7430   │ 4,5091,12943308161,-; ? __hrtimer_run_queues+0x108/0x280
7431   │ 4,5092,12943309009,-; ? clockevents_program_event+0xba/0x150
7432   │ 4,5093,12943309858,-; ? hrtimer_interrupt+0xf8/0x250
7433   │ 4,5094,12943310705,-; ? __sysvec_apic_timer_interrupt+0x59/0x150
7434   │ 4,5095,12943311564,-; ? sysvec_apic_timer_interrupt+0x9b/0xc0
7435   │ 4,5096,12943312409,-; </IRQ>
7436   │ 4,5097,12943313168,-; <TASK>
7437   │ 4,5098,12943313896,-; ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
7438   │ 4,5099,12943314612,-; ? queued_read_lock_slowpath+0x51/0x150
7439   │ 4,5100,12943315300,-; _raw_read_lock+0x2f/0x40
7440   │ 4,5101,12943315983,-; p_ed_enforce_validation_paranoid+0xba/0x3a0 [lkrg]
7441   │ 4,5102,12943316679,-; p_check_integrity+0x432/0x1a10 [lkrg]
7442   │ 4,5103,12943317370,-; process_one_work+0x184/0x3e0
7443   │ 4,5104,12943318032,-; worker_thread+0x2e4/0x410
7444   │ 4,5105,12943318664,-; ? __pfx_worker_thread+0x10/0x10
7445   │ 4,5106,12943319270,-; kthread+0xe7/0x120
7446   │ 4,5107,12943319872,-; ? __pfx_kthread+0x10/0x10
7447   │ 4,5108,12943320468,-; ret_from_fork+0x47/0x70
7448   │ 4,5109,12943321061,-; ? __pfx_kthread+0x10/0x10
7449   │ 4,5110,12943321646,-; ret_from_fork_asm+0x1a/0x30
7450   │ 4,5111,12943322232,-; </TASK>
7451   │ 0,5112,12947289554,-;watchdog: BUG: soft lockup - CPU#14 stuck for 23s! [kworker/u77:2:7105]
7452   │ 4,5113,12947291441,-;Modules linked in: nf_tables nfnetlink iptable_filter iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_tables x_tables vhost_net vhost vhost_iotlb tap intel_rapl_msr intel_rapl
       │ _common kvm_amd kvm crct10dif_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd ipmi_ssif rapl efi_pstore pcspkr input_leds ast i2c_algo_bit acpi_ipmi ccp k10temp 
       │ ipmi_si ipmi_devintf ipmi_msghandler mac_hid btrfs blake2b_generic xor raid6_pq hid_generic usbmouse usbkbd usbhid uas hid usb_storage dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c mpt3sas xhci_pci xhci_pci_r
       │ enesas nvme raid_class ahci crc32_pclmul scsi_transport_sas bnx2 xhci_hcd libahci nvme_core i2c_piix4 nvme_auth [last unloaded: spl(O)]
7453   │ 4,5114,12947295484,-;CPU: 14 PID: 7105 Comm: kworker/u77:2 Tainted: P      D    O L  T  6.10.0-rc4-secint+ #5
7454   │ 4,5115,12947296398,-;Hardware name: Supermicro Super Server/H11DSU-iN, BIOS [REDACTED]
7455   │ 4,5116,12947297309,-;Workqueue: events_unbound p_check_integrity [lkrg]
7456   │ 4,5117,12947298243,-;RIP: 0010:native_queued_spin_lock_slowpath+0x224/0x2d0
7457   │ 4,5118,12947299149,-;Code: c6 01 41 c1 e5 10 41 c1 e6 12 45 09 f5 44 89 e8 c1 e8 10 66 87 43 02 89 c1 c1 e1 10 81 f9 ff ff 00 00 77 37 31 c9 eb 02 f3 90 <8b> 03 66 85 c0 75 f7 89 c6 66 31 f6 44 39 ee 74 78 c6 03 01 48 85
7458   │ 4,5119,12947300080,-;RSP: 0018:ffffb7059889bd28 EFLAGS: 00000202
7459   │ 4,5120,12947300994,-;RAX: 0000000000400101 RBX: ffffffffc2239224 RCX: 0000000000000000
7460   │ 4,5121,12947301919,-;RDX: ffff8db3ee037880 RSI: 0000000000000101 RDI: ffffffffc2239224
7461   │ 4,5122,12947302817,-;RBP: ffffb7059889bd48 R08: 0000000000000000 R09: 0000000000000000
7462   │ 4,5123,12947303747,-;R10: 0000000000000000 R11: 0000000000000000 R12: ffff8db3ee037880
7463   │ 4,5124,12947304667,-;R13: 00000000003c0000 R14: 00000000003c0000 R15: 0000000000000001
7464   │ 4,5125,12947305581,-;FS:  0000000000000000(0000) GS:ffff8db3ee000000(0000) knlGS:0000000000000000
7465   │ 4,5126,12947306496,-;CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
7466   │ 4,5127,12947307431,-;CR2: 00007f643dd82110 CR3: 000080203eb1a000 CR4: 00000000003506f0
7467   │ 4,5128,12947308356,-;Call Trace:
7468   │ 4,5129,12947309295,-; <IRQ>
7469   │ 4,5130,12947310208,-; ? show_regs+0x6c/0x80
7470   │ 4,5131,12947311147,-; ? watchdog_timer_fn+0x218/0x2a0
7471   │ 4,5132,12947312076,-; ? __pfx_watchdog_timer_fn+0x10/0x10
7472   │ 4,5133,12947312997,-; ? __hrtimer_run_queues+0x108/0x280
7473   │ 4,5134,12947313906,-; ? clockevents_program_event+0xba/0x150
7474   │ 4,5135,12947314808,-; ? hrtimer_interrupt+0xf8/0x250
7475   │ 4,5136,12947315712,-; ? __sysvec_apic_timer_interrupt+0x59/0x150
7476   │ 4,5137,12947316619,-; ? sysvec_apic_timer_interrupt+0x9b/0xc0
7477   │ 4,5138,12947317510,-; </IRQ>
7478   │ 4,5139,12947318382,-; <TASK>
7479   │ 4,5140,12947319153,-; ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
7480   │ 4,5141,12947319912,-; ? native_queued_spin_lock_slowpath+0x224/0x2d0
7481   │ 4,5142,12947320646,-; queued_read_lock_slowpath+0x14a/0x150
7482   │ 4,5143,12947321359,-; _raw_read_lock+0x2f/0x40
7483   │ 4,5144,12947322058,-; p_ed_enforce_validation_paranoid+0xba/0x3a0 [lkrg]
7484   │ 4,5145,12947322754,-; p_check_integrity+0x432/0x1a10 [lkrg]
7485   │ 4,5146,12947323436,-; process_one_work+0x184/0x3e0
7486   │ 4,5147,12947324090,-; worker_thread+0x2e4/0x410
7487   │ 4,5148,12947324723,-; ? __pfx_worker_thread+0x10/0x10
7488   │ 4,5149,12947325329,-; kthread+0xe7/0x120
7489   │ 4,5150,12947325937,-; ? __pfx_kthread+0x10/0x10
7490   │ 4,5151,12947326526,-; ret_from_fork+0x47/0x70
7491   │ 4,5152,12947327115,-; ? __pfx_kthread+0x10/0x10
7492   │ 4,5153,12947327697,-; ret_from_fork_asm+0x1a/0x30
7493   │ 4,5154,12947328285,-; </TASK>
7494   │ 3,5155,12954233794,-;rcu: INFO: rcu_preempt self-detected stall on CPU
7495   │ 3,5156,12954235580,-;rcu: \x0912-....: (59926 ticks this GP) idle=5954/1/0x4000000000000000 softirq=73774/73774 fqs=14882
7496   │ 3,5157,12954236587,-;rcu: \x09         hardirqs   softirqs   csw/system
7497   │ 3,5158,12954237471,-;rcu: \x09 number:        0        679            0
7498   │ 3,5159,12954238312,-;rcu: \x09cputime:        0          0        29962   ==> 29998(ms)
7499   │ 3,5160,12954239122,-;rcu: \x09(t=60005 jiffies g=384821 q=1901 ncpus=16)
7500   │ 4,5161,12954239917,-;CPU: 12 PID: 8151 Comm: kworker/u77:3 Tainted: P      D    O L  T  6.10.0-rc4-secint+ #5
7501   │ 4,5162,12954240715,-;Hardware name: Supermicro Super Server/H11DSU-iN, BIOS [REDACTED]
7502   │ 4,5163,12954241491,-;Workqueue: events_unbound p_check_integrity [lkrg]
7503   │ 4,5164,12954242274,-;RIP: 0010:queued_read_lock_slowpath+0x55/0x150
7504   │ 4,5165,12954243038,-;Code: 00 4c 8d 63 04 31 c0 ba 01 00 00 00 f0 0f b1 53 04 0f 85 fe 00 00 00 f0 81 03 00 02 00 00 8b 03 84 c0 74 08 f3 90 8b 03 84 c0 <75> f8 4c 89 e7 c6 07 00 0f 1f 00 66 90 5b 41 5c 5d 31 c0 31 d2 31
7505   │ 4,5166,12954243850,-;RSP: 0018:ffffb7059862bd58 EFLAGS: 00000286
7506   │ 4,5167,12954244639,-;RAX: 00000000000002ff RBX: ffffffffc2239220 RCX: 0000000000000000
7507   │ 4,5168,12954245442,-;RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffffc2239220
7508   │ 4,5169,12954246245,-;RBP: ffffb7059862bd68 R08: 0000000000000000 R09: 0000000000000000
7509   │ 4,5170,12954247056,-;R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffc2239224
7510   │ 4,5171,12954247844,-;R13: ffff8d94a0aaa298 R14: ffff8d94a0aa8000 R15: 0000000000000001
7511   │ 4,5172,12954248652,-;FS:  0000000000000000(0000) GS:ffff8db3ee200000(0000) knlGS:0000000000000000
7512   │ 4,5173,12954249467,-;CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
7513   │ 4,5174,12954250289,-;CR2: 000055abe2000f60 CR3: 000080203eb1a000 CR4: 00000000003506f0
7514   │ 4,5175,12954251101,-;Call Trace:
7515   │ 4,5176,12954251908,-; <IRQ>
7516   │ 4,5177,12954252729,-; ? show_regs+0x6c/0x80
7517   │ 4,5178,12954253576,-; ? dump_cpu_task+0x6f/0x80
7518   │ 4,5179,12954254400,-; ? rcu_dump_cpu_stacks+0xd7/0x120
7519   │ 4,5180,12954255204,-; ? rcu_sched_clock_irq+0x52d/0x11d0
7520   │ 4,5181,12954255996,-; ? tmigr_requires_handle_remote+0x8c/0x160
7521   │ 4,5182,12954256799,-; ? update_process_times+0x70/0xc0
7522   │ 4,5183,12954257595,-; ? tick_nohz_handler+0x97/0x170
7523   │ 4,5184,12954258372,-; ? __pfx_tick_nohz_handler+0x10/0x10
7524   │ 4,5185,12954259053,-; ? __hrtimer_run_queues+0x108/0x280
7525   │ 4,5186,12954259709,-; ? clockevents_program_event+0xba/0x150
7526   │ 4,5187,12954260344,-; ? hrtimer_interrupt+0xf8/0x250
7527   │ 4,5188,12954260966,-; ? __sysvec_apic_timer_interrupt+0x59/0x150
7528   │ 4,5189,12954261581,-; ? sysvec_apic_timer_interrupt+0x9b/0xc0
7529   │ 4,5190,12954262186,-; </IRQ>
7530   │ 4,5191,12954262781,-; <TASK>
7531   │ 4,5192,12954263362,-; ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
7532   │ 4,5193,12954263938,-; ? queued_read_lock_slowpath+0x55/0x150
7533   │ 4,5194,12954264473,-; _raw_read_lock+0x2f/0x40
7534   │ 4,5195,12954265001,-; p_ed_enforce_validation_paranoid+0xba/0x3a0 [lkrg]
7535   │ 4,5196,12954265548,-; p_check_integrity+0x432/0x1a10 [lkrg]
7536   │ 4,5197,12954266086,-; process_one_work+0x184/0x3e0
7537   │ 4,5198,12954266615,-; worker_thread+0x2e4/0x410
7538   │ 4,5199,12954267143,-; ? __pfx_worker_thread+0x10/0x10
7539   │ 4,5200,12954267665,-; kthread+0xe7/0x120
7540   │ 4,5201,12954268171,-; ? __pfx_kthread+0x10/0x10
7541   │ 4,5202,12954268670,-; ret_from_fork+0x47/0x70
7542   │ 4,5203,12954269170,-; ? __pfx_kthread+0x10/0x10
7543   │ 4,5204,12954269665,-; ret_from_fork_asm+0x1a/0x30
7544   │ 4,5205,12954270170,-; </TASK>

@solardiz, if I understand correctly, we call the current macro, which expands to a per-CPU (TLS-like) variable read. Later, on line 92, we repeat this operation. But, as noted in the comments above, the running core could change because of the disabled IRQ state. So the second time, the current macro could return an unexpected value.
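To make the hypothesis above concrete, here is a minimal userspace sketch of the "read once, reuse the local" pattern that the patch applies with its curr variable. It is only an analogy, not kernel code: struct fake_task, task_getter and alternating_getter are hypothetical names invented for illustration, and the getter is deliberately made unstable to model the suspected behavior. Whether the real current macro can actually change between two reads in this probe context is exactly the open question in this thread.

```c
#include <stddef.h>

/* Hypothetical stand-in for a task descriptor; NOT the kernel's task_struct. */
struct fake_task {
    int pid;
};

/* Hypothetical accessor type modelling a macro such as `current`:
 * every call re-reads some underlying state. */
typedef struct fake_task *(*task_getter)(void);

static struct fake_task task_a = { 100 };
static struct fake_task task_b = { 200 };
static int getter_calls;

/* Deliberately unstable getter: alternates between two tasks on each call,
 * simulating the suspected "second read returns something else" case. */
struct fake_task *alternating_getter(void)
{
    return (getter_calls++ % 2 == 0) ? &task_a : &task_b;
}

/* Fragile pattern: reads through the getter twice and assumes both agree.
 * Returns 1 if the two reads matched, 0 otherwise. */
int reads_agree_without_snapshot(task_getter get)
{
    struct fake_task *first = get();
    struct fake_task *second = get();
    return first == second;
}

/* Snapshot pattern: read once, check for NULL, then use only the local copy.
 * Returns the pid of the snapshotted task, or -1 if the getter returned NULL. */
int pid_from_snapshot(task_getter get)
{
    struct fake_task *curr = get();

    if (curr == NULL)
        return -1;

    /* Every later use goes through `curr`, never through `get()` again. */
    return curr->pid;
}
```

With the unstable getter, reads_agree_without_snapshot() reports a mismatch, while pid_from_snapshot() stays internally consistent because it never reads the source a second time. Note that a snapshot only guarantees consistency within the function; it does not make a stale or wrong pointer valid.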

Anyway, at the moment it is running free of any errors on this dirty version (built with -DP_LKRG_TASK_OFF_DEBUG=1):

pve:/usr/src/lkrg/src# cat 00001-respect-irq-dirty.patch 
diff --git a/src/modules/exploit_detection/syscalls/exec/p_security_bprm_committed_creds/p_security_bprm_committed_creds.c b/src/modules/exploit_detection/syscalls/exec/p_security_bprm_committed_creds/p_security_bprm_committed_creds.c
index a3423eb..311ace5 100644
--- a/src/modules/exploit_detection/syscalls/exec/p_security_bprm_committed_creds/p_security_bprm_committed_creds.c
+++ b/src/modules/exploit_detection/syscalls/exec/p_security_bprm_committed_creds/p_security_bprm_committed_creds.c
@@ -42,10 +42,6 @@ notrace struct inode *p_get_inode_from_task(struct task_struct *p_arg) {
    struct mm_struct *p_mm;
    struct inode *p_inode = NULL;

-   if (!p_arg) {
-      return NULL;
-   }
-
    /*
     * This function is called from the context of newly created
     * Process which is intercepted by our *probes. This means
@@ -59,37 +55,40 @@ notrace struct inode *p_get_inode_from_task(struct task_struct *p_arg) {
     * to sleep.
     * Current implementation works well!
     */
-//   down_read(&p_arg->mm->mmap_sem);
+   //mmap_read_lock(p_arg->mm);

    p_mm = p_arg->mm;
    if (p_mm->exe_file) {
       p_inode = p_mm->exe_file->f_inode;
    }

-//   up_read(&p_arg->mm->mmap_sem);
+   //mmap_read_unlock(p_arg->mm);

    return p_inode;
 }

 LKRG_DEBUG_TRACE int p_security_bprm_committed_creds_ret(struct kretprobe_instance *ri, struct pt_regs *p_regs) {

-//   struct inode *p_inode;
+   struct inode *p_inode;
    struct p_ed_process *p_tmp;
    unsigned long p_flags;
+   struct task_struct *curr = current;

-/*
-   p_inode = p_get_inode_from_task(current);
+   if (unlikely(!curr))
+      return 0;
+
+   //enable_irq();
+   p_inode = p_get_inode_from_task(curr);

    p_debug_kprobe_log(
           "p_search_binary_handler_ret: returned value => %ld comm[%s] Pid:%d inode[%ld]",
            p_regs_get_ret(p_regs),current->comm,current->pid,p_inode->i_ino);
-*/

    // Update process
    p_tasks_write_lock(&p_flags);
-   if ( (p_tmp = p_find_ed_by_pid(task_pid_nr(current))) != NULL) {
+   if ( (p_tmp = p_find_ed_by_pid(task_pid_nr(curr))) != NULL) {
       // This process is on the ED list - update information!
-      p_update_ed_process(p_tmp, current, 1);
+      p_update_ed_process(p_tmp, curr, 1);
 #ifdef P_LKRG_TASK_OFF_DEBUG
       p_debug_off_flag_reset(p_tmp, 40);
 #endif
@@ -97,6 +96,7 @@ LKRG_DEBUG_TRACE int p_security_bprm_committed_creds_ret(struct kretprobe_instan
    }
    p_tasks_write_unlock(&p_flags);

+   //disable_irq_nosync();
 //   p_ed_enforce_validation();

    return 0;

But I think the issue lies in the current macro.

Thank you for the great open-source software!

m1lua commented 3 months ago

Hmm, my apologies, I just found that the system Oopsed again. It is a bit strange: before, it triggered the bug quite fast, in ~30 min. This time it took >24 hours to reproduce under load.

Will try to figure out more.


Tried adding spinlock_irqsave + locking curr->mm, then launched while true; do exec sh -c exit & done

[ 4103.670400] LKRG: ALERT: DETECT: Task: 'off' flag corruption for pid 439, name sh
[ 4103.671411] LKRG: ALERT: BLOCK: Task: Killing pid 0, name 

But without any Oops. Hmm.
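For anyone who prefers a bounded, scriptable version of the shell loop above, a rough C equivalent could look like the sketch below. It assumes a POSIX system with /bin/sh; run_exec_burst is a hypothetical helper name, and unlike the endless shell loop it spawns a fixed number of children and reaps them, so it generates short bursts of exec activity rather than sustained load.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork `count` children that each exec `sh -c exit`, then reap them all.
 * Returns how many children exited normally with status 0. */
int run_exec_burst(int count)
{
    int spawned = 0;
    int ok = 0;

    for (int i = 0; i < count; i++) {
        pid_t pid = fork();

        if (pid < 0)
            break;                      /* fork failed: stop spawning */

        if (pid == 0) {
            /* Child: replace the process image, like `exec sh -c exit`. */
            execl("/bin/sh", "sh", "-c", "exit", (char *)NULL);
            _exit(127);                 /* reached only if exec failed */
        }

        spawned++;
    }

    /* Parent: collect every child that was successfully started. */
    for (int i = 0; i < spawned; i++) {
        int status;

        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }

    return ok;
}
```

Calling run_exec_burst() repeatedly from a loop approximates the original stress test while keeping the number of in-flight processes under control.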

m1lua commented 3 months ago

Seems I just figured it out. Will add another PR.