kpcyrd / sniffglue

Secure multithreaded packet sniffer
https://crates.io/crates/sniffglue
GNU General Public License v3.0
1.08k stars, 94 forks

Exits for no reason by itself when listening on `lo` or `any`. #127

Open xtaran opened 6 months ago

xtaran commented 6 months ago

Hi,

While trying out sniffglue (version 0.15.0 on Debian GNU/Linux Unstable, package version 0.15.0-7), I noticed that when I use either the interface lo (the local loopback device) or the virtual interface any (i.e. sniffing on all interfaces), it outputs a bunch of packets and then exits for no obvious reason. The exit code is 0, so it does not look like a crash. This is reproducible, but it happens after a seemingly random number of packets: so far I've counted 28, 49, 106 and 116 using sniffglue lo | wc -l.


So far, when I used it on any, it also only showed packets from the lo interface before it exited. But that might have been just chance.

kpcyrd commented 6 months ago

I'm having trouble reproducing this; the following works fine for me:

doas podman run -it --rm --net=host --privileged debian:sid sh -c 'apt update && apt dist-upgrade -y && apt install -y sniffglue && sniffglue lo -vv'

If sniffglue exits, this usually means all worker threads have terminated, for example because the handle to the network device got closed, or possibly because they got killed by seccomp (though that should usually give you at least some output).

To verify this, check your system logs, run strace -f sniffglue lo and look for threads terminating due to signals, or run sniffglue --insecure-disable-seccomp lo to check whether this happens independently of seccomp.

If it is related to seccomp, run strace -f sniffglue lo 2> strace.log and search for = ? from the bottom up to identify any syscalls that have been interrupted.
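
The bottom-up search for = ? can also be scripted. The following is a minimal sketch in Python, not part of sniffglue: the function names are made up for illustration, and it assumes the usual strace -f line shapes (an optional "[pid N]" prefix, results at the end of the line, and "+++ killed by SIGNAME +++" for signal deaths). The sample text is only illustrative input.

```python
import re

# Optional "[pid N] " prefix that `strace -f` adds for traced threads.
PID_PREFIX = re.compile(r"^\[pid\s+(\d+)\]\s+")


def interrupted_syscalls(log_text):
    """Return (pid, line) pairs for lines ending in '= ?', last line first.

    pid is None when the line has no "[pid N]" prefix.
    """
    hits = []
    for line in reversed(log_text.splitlines()):
        stripped = line.rstrip()
        if not stripped.endswith("= ?"):
            continue
        match = PID_PREFIX.match(stripped)
        pid = int(match.group(1)) if match else None
        hits.append((pid, stripped))
    return hits


def killed_by_signal(log_text):
    """Return lines where strace reports a process or thread killed by a signal."""
    return [l.strip() for l in log_text.splitlines() if "+++ killed by" in l]


# Illustrative input only; in practice read strace.log instead.
sample = """\
[pid 24102] exit(0 <unfinished ...>
[pid 24102] <... exit resumed>)         = ?
[pid 24102] +++ exited with 0 +++
exit_group(0)                           = ?
+++ exited with 0 +++
"""

for pid, line in interrupted_syscalls(sample):
    print(pid, line)
print(killed_by_signal(sample))
```

Note that exit and exit_group always show = ? because they never return, so hits on those two calls are expected. The interesting hits are other syscalls ending in = ?, especially next to a "+++ killed by SIGSYS +++" line.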

xtaran commented 6 months ago

Thanks for trying to reproduce and for these suggestions!

I think the strace found something:

[…]
[pid 24096] write(1, "\33[33m00:00:00:00:00:00 -> 00:00:"..., 14500:00:00:00:00:00 -> 00:00:00:00:00:00, [udp   ] 127.0.0.1:56349        -> 127.0.0.1:53           [dns] req, (AAAA, "reform.n[…]")
 <unfinished ...>
[pid 24102] <... rt_sigprocmask resumed>NULL, 8) = 0
[pid 24096] <... write resumed>)        = 145
[pid 24102] madvise(0x7f19da3fd000, 2076672, MADV_DONTNEED <unfinished ...>
[pid 24096] write(1, "\33[33m00:00:00:00:00:00 -> 00:00:"..., 12200:00:00:00:00:00 -> 00:00:00:00:00:00, [udp   ] 127.0.0.1:53           -> 127.0.0.1:56349        [dns] resp, []
 <unfinished ...>
[pid 24102] <... madvise resumed>)      = 0
[pid 24096] <... write resumed>)        = 122
[pid 24102] exit(0 <unfinished ...>
[pid 24096] setsockopt(4, SOL_PACKET, PACKET_RX_RING, {tp_block_size=0, tp_block_nr=0, tp_frame_size=0, tp_frame_nr=0}, 16 <unfinished ...>
[pid 24102] <... exit resumed>)         = ?
[pid 24096] <... setsockopt resumed>)   = -1 EBUSY (Device or resource busy)
[pid 24102] +++ exited with 0 +++
munmap(0x7f19db05a000, 4194304)         = 0
munmap(0x7f19db8bf000, 266240)          = 0
close(3)                                = 0
close(4)                                = 0
sigaltstack({ss_sp=NULL, ss_flags=SS_DISABLE, ss_size=8192}, NULL) = 0
munmap(0x7f19db900000, 12288)           = 0
exit_group(0)                           = ?
+++ exited with 0 +++

I suspect that this EBUSY might be what triggered sniffglue to exit.
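
To check whether other syscalls failed earlier in the log, the failing calls can be listed automatically. This is a rough Python sketch, not an official tool: failed_syscalls is a hypothetical helper, and the regular expressions assume strace's usual "= -1 ENAME (description)" result format. The sample consists of lines taken from the trace above.

```python
import re

# Assumed result shape at end of line: "= -1 ENAME (description)".
FAILED = re.compile(r"=\s+-1\s+(E[A-Z0-9]+)\s+\((.*)\)\s*$")
# Optional "[pid N]" prefix written by `strace -f`.
PID_PREFIX = re.compile(r"^\[pid\s+(\d+)\]")
# Syscall name, either from "<... name resumed>" or from "name(" at line start.
NAME = re.compile(r"(?:<\.\.\. (\w+) resumed>|^(?:\[pid\s+\d+\]\s+)?(\w+)\()")


def failed_syscalls(log_text):
    """Return (pid, syscall, errno_name, description) for each failed call."""
    results = []
    for line in log_text.splitlines():
        failure = FAILED.search(line)
        if not failure:
            continue
        pid = PID_PREFIX.match(line)
        name = NAME.search(line)
        results.append((
            int(pid.group(1)) if pid else None,
            (name.group(1) or name.group(2)) if name else None,
            failure.group(1),
            failure.group(2),
        ))
    return results


sample = """\
[pid 24096] setsockopt(4, SOL_PACKET, PACKET_RX_RING, {tp_block_size=0, tp_block_nr=0, tp_frame_size=0, tp_frame_nr=0}, 16 <unfinished ...>
[pid 24102] <... exit resumed>)         = ?
[pid 24096] <... setsockopt resumed>)   = -1 EBUSY (Device or resource busy)
[pid 24102] +++ exited with 0 +++
"""

print(failed_syscalls(sample))
```

A listing like this only shows which calls returned an error. It does not establish that the EBUSY is what made sniffglue exit.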

kpcyrd commented 6 months ago

I haven't had time to debug this in depth yet, but according to man setsockopt, Linux does not document the EBUSY error code for this syscall:

ERRORS
       The setsockopt() function shall fail if:

       EBADF  The socket argument is not a valid file descriptor.

       EDOM   The send and receive timeout values are too big to fit into the timeout fields in the socket structure.

       EINVAL The specified option is invalid at the specified socket level or the socket has been shut down.

       EISCONN
              The socket is already connected, and a specified option cannot be set while the socket is connected.

       ENOPROTOOPT
              The option is not supported by the protocol.

       ENOTSOCK
              The socket argument does not refer to a socket.

       The setsockopt() function may fail if:

       ENOMEM There was insufficient memory available for the operation to complete.

       ENOBUFS
              Insufficient resources are available in the system to complete the call.

       The following sections are informative.

If you feel like debugging this, you'd need to figure out which undocumented error case you're hitting in the Linux kernel.