Open blshkv opened 5 years ago
I looked at the source code and feel that parsing syslogs is not the best idea. The better way would be to expose the current status via procfs (/sys/module/tyton?) and wait for the event using inotify or similar.
@blshkv Thanks for the suggestion! I wholeheartedly agree. I actually saw this done recently in another kernel-related project and thought it was an interesting idea.
I should have time to implement this in a few days. If you would like it done faster I'd love to accept a PR :)
@blshkv If you want, you can try out the new branch here: https://github.com/nbulischeck/tyton/tree/procfs-status
This creates a file at /proc/tyton and will report back the current analysis when you read it.
The notification daemon isn't updated yet, however.
Hi, I have tried that version (kernel.osrelease = 4.20.6-pentoo), unfortunately, cat hanged:
cat /proc/tyton
^C^C^C^C^C^C^C^C^C^C
with the following syslog message:
[48483.431930] tyton: INFO: Inserting Module
[48483.459835] tyton: INFO: Analyzing Module List
[48483.464950] tyton: INFO: Analyzing Syscall Hooks
[48483.464958] BUG: unable to handle kernel paging request at 0000000000707261
[48483.464960] PGD 0 P4D 0
[48483.464964] Oops: 0000 [#1] PREEMPT SMP PTI
[48483.464968] CPU: 1 PID: 23664 Comm: kworker/1:0 Tainted: G O T 4.20.6-pentoo #1
[48483.464969] Hardware name: LENOVO 2330A54/2330A54, BIOS G2ETB2WW (2.72 ) 04/11/2018
[48483.464975] Workqueue: events init_module [tyton]
[48483.464981] RIP: 0010:strcmp+0x8/0x20
[48483.464983] Code: c8 c3 c6 44 07 ff 00 48 c7 c0 f9 ff ff ff eb 93 31 c0 eb c3 48 c7 c0 f9 ff ff ff c3 66 0f 1f 44 00 00 48 83 c7 01 48 83 c6 01 <0f> b6 47 ff 3a 46 ff 75 07 84 c0 75 eb 31 c0 c3 19 c0 83 c8 01 c3
[48483.464985] RSP: 0018:ffffb5aec2643df8 EFLAGS: 00010286
[48483.464986] RAX: ffff9b5dd537bbd8 RBX: ffff9b5dd537bbd8 RCX: 0000000000000000
[48483.464988] RDX: 0000000000000000 RSI: ffffffffc05aa4b1 RDI: 0000000000707262
[48483.464989] RBP: ffffffffc05aa4b0 R08: 0000000000000001 R09: 000000000000043e
[48483.464990] R10: ffffffffaafb2c08 R11: 0000000000000000 R12: 0000000000000000
[48483.464991] R13: ffffffffab23a880 R14: ffffffffc05ab0a8 R15: ffffffffc05ab0a0
[48483.464993] FS: 0000000000000000(0000) GS:ffff9b5dd6040000(0000) knlGS:0000000000000000
[48483.464994] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[48483.464995] CR2: 0000000000707261 CR3: 000000020f836006 CR4: 00000000001626e0
[48483.464996] Call Trace:
[48483.465002] find_subdir+0x30/0x40 [tyton]
[48483.465005] analyze_networks+0x6b/0x1a0 [tyton]
[48483.465009] ? __switch_to_asm+0x31/0x60
[48483.465012] init_module+0x2cf027/0x2cf070 [tyton]
[48483.465016] process_one_work+0x1d2/0x430
[48483.465018] worker_thread+0x28/0x3c0
[48483.465020] ? process_one_work+0x430/0x430
[48483.465023] kthread+0x10e/0x130
[48483.465025] ? kthread_park+0x90/0x90
[48483.465027] ret_from_fork+0x35/0x40
[48483.465029] Modules linked in: tyton(O) ctr ccm usbhid ipv6 uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videodev videobuf2_common vboxpci(O) vboxnetadp(O) vboxnetflt(O) iwldvm x86_pkg_temp_thermal mac80211 vboxdrv(O) kvm_intel iwlwifi kvm irqbypass i2c_i801 serio_raw efivars sdhci_pci cqhci cfg80211 sdhci mmc_core xhci_pci ehci_pci e1000e xhci_hcd ehci_hcd thinkpad_acpi rfkill pcc_cpufreq efivarfs
[48483.465052] CR2: 0000000000707261
[48483.465054] ---[ end trace c43046567c867fe4 ]---
[48483.465057] RIP: 0010:strcmp+0x8/0x20
[48483.465059] Code: c8 c3 c6 44 07 ff 00 48 c7 c0 f9 ff ff ff eb 93 31 c0 eb c3 48 c7 c0 f9 ff ff ff c3 66 0f 1f 44 00 00 48 83 c7 01 48 83 c6 01 <0f> b6 47 ff 3a 46 ff 75 07 84 c0 75 eb 31 c0 c3 19 c0 83 c8 01 c3
[48483.465061] RSP: 0018:ffffb5aec2643df8 EFLAGS: 00010286
[48483.465062] RAX: ffff9b5dd537bbd8 RBX: ffff9b5dd537bbd8 RCX: 0000000000000000
[48483.465063] RDX: 0000000000000000 RSI: ffffffffc05aa4b1 RDI: 0000000000707262
[48483.465065] RBP: ffffffffc05aa4b0 R08: 0000000000000001 R09: 000000000000043e
[48483.465066] R10: ffffffffaafb2c08 R11: 0000000000000000 R12: 0000000000000000
[48483.465067] R13: ffffffffab23a880 R14: ffffffffc05ab0a8 R15: ffffffffc05ab0a0
[48483.465069] FS: 0000000000000000(0000) GS:ffff9b5dd6040000(0000) knlGS:0000000000000000
[48483.465070] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[48483.465071] CR2: 0000000000707261 CR3: 000000020f836006 CR4: 00000000001626e0
[48533.993850] tyton: INFO: Exiting Module
Based on the trace, I'm guessing that the strcmp here https://github.com/nbulischeck/tyton/blob/procfs-status/src/network_hooks.c#L29 somehow ends up with an invalid rbnode (which seems unlikely) or the act of you catting the file while it was locked made the kernel attempt to access memory it shouldn't have. Not completely sure as of right now, will give it a look in the morning.
I have rebooted and tried few more times. The crash happens right after I load the module. The "cat" just hands later because the module is dead
Can you check that this command succeeds completely?
ls -la /proc/self/net/{tcp,tcp6,udp,udp6,udplite,udplite6}
yes, no problem with it:
ls -la /proc/self/net/{tcp,tcp6,udp,udp6,udplite,udplite6}
-r--r--r-- 1 root root 0 Feb 11 19:02 /proc/self/net/tcp
-r--r--r-- 1 root root 0 Feb 11 19:02 /proc/self/net/tcp6
-r--r--r-- 1 root root 0 Feb 11 19:02 /proc/self/net/udp
-r--r--r-- 1 root root 0 Feb 11 19:02 /proc/self/net/udp6
-r--r--r-- 1 root root 0 Feb 11 19:02 /proc/self/net/udplite
-r--r--r-- 1 root root 0 Feb 11 19:02 /proc/self/net/udplite6
Alright then I'm assuming it's the rb_entry nodes. This really shouldn't be a problem, but I hope that my recent push has a solution.
Can you git pull the latest procfs-status branch and test to see if the insertion works?
still crashes
Feb 12 20:37:05 [kernel] tyton: INFO: Inserting Module
Feb 12 20:37:05 [kernel] tyton: INFO: Analyzing Module List
Feb 12 20:37:05 [kernel] tyton: INFO: Analyzing Syscall Hooks
Feb 12 20:37:05 [kernel] BUG: unable to handle kernel paging request at 0000000000707261
Feb 12 20:37:05 [kernel] PGD 0 P4D 0
Feb 12 20:37:05 [kernel] Oops: 0000 [#1] PREEMPT SMP PTI
Feb 12 20:37:05 [kernel] CPU: 0 PID: 32 Comm: kworker/0:1 Tainted: G O T 4.20.6-pentoo #1
Feb 12 20:37:05 [kernel] Hardware name: LENOVO 2330A54/2330A54, BIOS G2ETB2WW (2.72 ) 04/11/2018
Feb 12 20:37:05 [kernel] Workqueue: events 0xffffffffc041a000
Feb 12 20:37:05 [kernel] RIP: 0010:strcmp+0x8/0x20
Feb 12 20:37:05 [kernel] Code: c8 c3 c6 44 07 ff 00 48 c7 c0 f9 ff ff ff eb 93 31 c0 eb c3 48 c7 c0 f9 ff ff ff c3 66 0f 1f 44 00 00 48 83 c7 01 48 83 c6 01 <0f> b6 47 ff 3a 46 ff 75 07 84 c0 75 eb 31 c0 c3 19 c0 83 c8 01 c3
Feb 12 20:37:05 [kernel] RSP: 0018:ffff90fcc0173df0 EFLAGS: 00010286
Feb 12 20:37:05 [kernel] RAX: ffff88ea1537b518 RBX: ffff88ea1537b518 RCX: 0000000000000000
Feb 12 20:37:05 [kernel] RDX: 0000000000000000 RSI: ffffffffc041b4b1 RDI: 0000000000707262
Feb 12 20:37:05 [kernel] RBP: ffff88ea1537b498 R08: 0000000000000001 R09: 00000000000002e0
Feb 12 20:37:05 [kernel] R10: ffffffffa7fb2c08 R11: 0000000000000000 R12: ffffffffc041b4b0
Feb 12 20:37:05 [kernel] R13: ffffffffa823a880 R14: ffffffffc041c0a8 R15: ffffffffc041c0a0
Feb 12 20:37:05 [kernel] FS: 0000000000000000(0000) GS:ffff88ea16000000(0000) knlGS:0000000000000000
Feb 12 20:37:05 [kernel] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 12 20:37:05 [kernel] CR2: 0000000000707261 CR3: 000000004320a006 CR4: 00000000001606f0
Feb 12 20:37:05 [kernel] Call Trace:
Feb 12 20:37:05 [kernel] find_subdir+0x27/0x60 [tyton]
Feb 12 20:37:05 [kernel] analyze_networks+0x6b/0x1a0 [tyton]
Feb 12 20:37:05 [kernel] ? __switch_to_asm+0x31/0x60
Feb 12 20:37:05 [kernel] 0xffffffffc041a027
Feb 12 20:37:05 [kernel] process_one_work+0x1d2/0x430
Feb 12 20:37:05 [kernel] worker_thread+0x28/0x3c0
Feb 12 20:37:05 [kernel] ? process_one_work+0x430/0x430
Feb 12 20:37:05 [kernel] kthread+0x10e/0x130
Feb 12 20:37:05 [kernel] ? kthread_park+0x90/0x90
Feb 12 20:37:05 [kernel] ret_from_fork+0x35/0x40
Feb 12 20:37:05 [kernel] Modules linked in: tyton(O) usbhid ipv6 uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videodev videobuf2_common vboxpci(O) vboxnetadp(O) vboxnetflt(O) x86_pkg_temp_thermal iwldvm kvm_intel mac80211 kvm vboxdrv(O) irqbypass iwlwifi serio_raw efivars i2c_i801 sdhci_pci cqhci cfg80211 sdhci mmc_core thinkpad_acpi rfkill xhci_pci e1000e ehci_pci xhci_hcd ehci_hcd pcc_cpufreq efivarfs
Feb 12 20:37:05 [kernel] CR2: 0000000000707261
Feb 12 20:37:05 [kernel] ---[ end trace 2522267f43cef3c1 ]---
Feb 12 20:37:05 [kernel] RIP: 0010:strcmp+0x8/0x20
Feb 12 20:37:05 [kernel] Code: c8 c3 c6 44 07 ff 00 48 c7 c0 f9 ff ff ff eb 93 31 c0 eb c3 48 c7 c0 f9 ff ff ff c3 66 0f 1f 44 00 00 48 83 c7 01 48 83 c6 01 <0f> b6 47 ff 3a 46 ff 75 07 84 c0 75 eb 31 c0 c3 19 c0 83 c8 01 c3
Feb 12 20:37:05 [kernel] RSP: 0018:ffff90fcc0173df0 EFLAGS: 00010286
Feb 12 20:37:05 [kernel] RAX: ffff88ea1537b518 RBX: ffff88ea1537b518 RCX: 0000000000000000
Feb 12 20:37:05 [kernel] RDX: 0000000000000000 RSI: ffffffffc041b4b1 RDI: 0000000000707262
Feb 12 20:37:05 [kernel] RBP: ffff88ea1537b498 R08: 0000000000000001 R09: 00000000000002e0
Feb 12 20:37:05 [kernel] R10: ffffffffa7fb2c08 R11: 0000000000000000 R12: ffffffffc041b4b0
Feb 12 20:37:05 [kernel] R13: ffffffffa823a880 R14: ffffffffc041c0a8 R15: ffffffffc041c0a0
Feb 12 20:37:05 [kernel] FS: 0000000000000000(0000) GS:ffff88ea16000000(0000) knlGS:0000000000000000
Feb 12 20:37:05 [kernel] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 12 20:37:05 [kernel] CR2: 0000000000707261 CR3: 000000004320a006 CR4: 00000000001606f0
I'm using commit bf144b46b89df9ca60d440bf1f8e472efa017fec
Pushed a new commit. I just took a dive through 4.20.6 and nothing indicates this should fail. I'm on 4.20.7 on Arch and it works perfectly fine.
I pushed a new commit to error out if the string is null (shouldn't even be technically possible because it's static) and switched from strcmp to strncmp.
If this STILL doesn't fix whatever is causing this, I'll get a Pentoo VM and start testing for myself.
The problem is still there. I have tried a Pentoo VM, however the entire VM hangs after the module is loaded. Here are the last two lines in the syslog:
Feb 13 06:00:06 [kernel] [ 101.949006] tyton: INFO: Inserting Module
Feb 13 06:00:06 [kernel] [ 101.968965] tyton: INFO: Analyzing Module List
the quick steps to reproduce:
Perfect! Thank you for the steps to reproduce.
My reason for saying I am going to spin up a Pentoo VM was so that I could stop pushing half-baked potential bandaid fixes for this and wouldn't have to bug you to test it every day or so. My apologies up to this point.
@blshkv If you have free time, can you confirm if the recently merged patch fixes your issue?
sorry, what branch to test? I have tested the latest master 5c19411e181111774c7e2aca2b5c3d6fa6b8fe2e, but /proc/tyton is not there Edit: I'm on kernel 5.6.15 today
My apologies! It's been so long that I forgot the proc work was on a different branch. I'll ping you again when I've updated the other branch.
Hi, nice idea, I really like it.
I'm trying to test the tool on my Gentoo (Pentoo) Linux where I chose not to use systemd. It currently uses udev with openrc.
Is it possible to use any other available mechanisms and get notifications working?
Thanks