add EPB extension epb_processid_threadid

RavuAlHemio commented 1 year ago

Add an Enhanced Packet Block extension to store a process ID and a thread ID.

Capture methods like Windows Network Trace include process and thread IDs in the captured data. The data makes most sense for outbound packets (the PID and TID of the originating process), but the TID might also be interesting when obtaining capture data while doing kernel debugging, where incoming packets are processed by different threads within the System process.

It appears that, according to the answers to this Unix StackExchange question, process and thread IDs on Unix systems generally fit into an unsigned 32-bit integer each; this is also true for Windows.

Should an operating system not have a concept of processes or threads, the value 0 can be stored for the respective field. I don't know of any operating system where process or thread ID 0 is valid.
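For illustration, here is a minimal sketch of how the combined 8-byte option described above could be serialized, using the generic pcapng option layout (16-bit code, 16-bit length, value padded to a 32-bit boundary). The option code `EPB_PROCESSID_THREADID`, the function name, and the field order (process ID first, then thread ID) are assumptions made for this example; the thread does not state them.

```python
import struct

# Hypothetical option code: the number assigned to epb_processid_threadid is
# not given in this thread, so this value is only a placeholder.
EPB_PROCESSID_THREADID = 0x7FF0


def pack_pid_tid_option(pid: int, tid: int, byte_order: str = "<") -> bytes:
    """Pack one option: 16-bit code, 16-bit length, then an 8-byte value.

    pid and tid are each stored as an unsigned 32-bit integer (assumed order:
    process ID first). A value of 0 stands for "unknown or not applicable",
    as proposed above. byte_order is "<" or ">" and should match the byte
    order of the section the block belongs to.
    """
    # struct raises struct.error if either ID does not fit in 32 bits.
    value = struct.pack(byte_order + "II", pid, tid)
    header = struct.pack(byte_order + "HH", EPB_PROCESSID_THREADID, len(value))
    # The value is 8 bytes long, already a multiple of 4, so no padding.
    return header + value
```

Under this scheme a reader cannot tell a real process ID 0 apart from "unknown", because both are written as the same four zero bytes.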

guyharris commented 1 year ago

Should an operating system not have a concept of processes or threads, the value 0 can be stored for the respective field. I don't know of any operating system where process or thread ID 0 is valid.

Process 0 on Darwin-based OSes is the "kernel task", which is a multi-threaded task whose code runs entirely in kernel space and whose threads could send and receive packets. Other UN*Xes have a process 0; back in the old old old old old days it was an in-kernel swapper process, but there may be other UN*Xes where it can perform networking.

guyharris commented 1 year ago

Perhaps a better scheme would be to have separate process ID and thread ID options; if the option in question is present, the OS is presumed to have a notion of processes and threads, and the process or thread that would receive or was sending the packet is known. If there is no concept of process or thread, or if the receiving process or thread is unknown, omit the option.
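A minimal sketch of this "omit when unknown" scheme follows, assuming two separate 4-byte options. The option codes `EPB_PROCESSID` and `EPB_THREADID` and both function names are hypothetical placeholders chosen for the example.

```python
import struct
from typing import Optional

# Hypothetical option codes; the thread does not assign real numbers.
EPB_PROCESSID = 0x7FF1
EPB_THREADID = 0x7FF2


def pack_u32_option(code: int, value: int, byte_order: str = "<") -> bytes:
    """Pack one option carrying a single unsigned 32-bit value."""
    return struct.pack(byte_order + "HHI", code, 4, value)


def pack_process_options(pid: Optional[int], tid: Optional[int],
                         byte_order: str = "<") -> bytes:
    """Emit an option only for each ID that is actually known.

    None means "unknown or no such concept", so the option is left out;
    0 is written like any other ID and therefore stays a valid value.
    """
    out = b""
    if pid is not None:
        out += pack_u32_option(EPB_PROCESSID, pid, byte_order)
    if tid is not None:
        out += pack_u32_option(EPB_THREADID, tid, byte_order)
    return out
```

With this layout, `pack_process_options(0, None)` still writes a process ID option, so a packet from process 0 remains distinguishable from a packet whose process is unknown.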

mcr commented 1 year ago

Process 0 on Darwin-based OSes is the "kernel task", which is a multi-threaded task whose code runs entirely in kernel space and whose threads could send and receive packets. Other UN*Xes have a process 0; back in the old old old old old days it was an in-kernel swapper process, but there may be other UN*Xes where it can perform networking.

Do we expect Darwin systems to have their kernel-task write PCAPNG files? (I think not.) I'm pretty sure that the swapper process isn't going to be involved, so it sure seems safe to use 0 as invalid. I am agnostic about separate process/thread IDs. If we need them, that's fine with me.

guyharris commented 1 year ago

Do we expect Darwin systems to have their kernel-task write PCAPNG files?

Do we expect packets sent by or received by the kernel task to show up in pcapng files? If there are any such packets, I'd expect them to show up the same way other processes' packets show up.

Activity Monitor isn't showing kernel_task as having sent or received any packets. iStat Menus, however, shows it as sending and receiving a lot of packets when a VMware virtual machine is doing network I/O; I don't know whether that's going through the new virtualization mechanisms in XNU, or through BPF (at least at one point, VMware did network device emulation using BPF to send and receive link-layer packets), or what, and I don't know why Activity Monitor and iStat Menus disagree on this.

I'll have to look at the XNU source to see if any kernel threads would do network I/O. (The one I created doesn't, it just tries to unmount automounted file systems.)

guyharris commented 1 year ago

Here's a line of output from macOS tcpdump, from a capture on all network interfaces using PKTAP rather than BPF and providing a big pile of metadata, asking it to dump all metadata; the capture took place while I was doing some networking on a Linux VM in VMware:

Process Information Block pid: 0 proc_name: kernel_task
13:53:12.694804 (en0, proc kernel_task:0:, eproc vmnet-natd:65879:, svc BE, in, so, flowid 0x895c471a, ttag 0x0) IP (tos 0x0, ttl 52, id 31338, offset 0, flags [DF], proto TCP (6), length 580)
    172.64.41.4.https > 192.168.1.3.61862: Flags [P.], cksum 0x2505 (correct), seq 4896:5424, ack 582, win 8, options [nop,nop,TS val 353030640 ecr 3943903301], length 528

I'm not sure what the difference between "proc" and "eproc" is. The description I concocted of the LINKTYPE_PKTAP link-layer header, from looking at macOS source, speaks of an "effective" process ID and name. It also says that the PID may be 0 if the process is unknown, but, apparently, it may also be 0 if the process is the kernel task.

RavuAlHemio commented 1 year ago

I'm fine with splitting up the information into separate options. Do we want to handle Darwin's "effective PID" case somehow?

I can think of the following variants:

one 8-byte option for process ID and thread ID (with 0 as unknown; no EffPID handling, no differentiation between PID 0 and unknown PID) -- this is the currently merged variant
one 4-byte option for process ID; one 4-byte option for thread ID (0 means PID/TID zero, absence of the option means unknown PID/TID; no EffPID handling)
one 4-byte or 8-byte option for process ID followed by, if known, effective process ID (0 means PID zero, {0, 0} means PID zero and EffPID zero, absence of the option means unknown PID and unknown EffPID); one 4-byte option for thread ID (0 means TID zero, absence of option means unknown TID)
one 4-byte option for process ID, one 4-byte option for thread ID, one 4-byte option for effective process ID (0 means PID/TID/EffPID zero, absence of the option means unknown PID/TID/EffPID)
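To make the difference between "zero" and "absent" concrete, here is a minimal reader-side sketch for the last variant (three separate 4-byte options). It walks a raw option list using the generic pcapng layout, in which each value is padded to a 32-bit boundary and option code 0 (`opt_endofopt`) ends the list. The three option codes and the function name are hypothetical placeholders.

```python
import struct

# Hypothetical option codes for the three separate 4-byte options.
OPTION_NAMES = {0x7FF1: "pid", 0x7FF2: "tid", 0x7FF3: "effective_pid"}
OPT_ENDOFOPT = 0  # standard pcapng end-of-options marker


def parse_process_options(data: bytes, byte_order: str = "<") -> dict:
    """Return only the IDs that are present in the option list.

    A missing key means "unknown"; a key whose value is 0 means the ID
    really is zero.
    """
    found = {}
    offset = 0
    while offset + 4 <= len(data):
        code, length = struct.unpack_from(byte_order + "HH", data, offset)
        offset += 4
        if code == OPT_ENDOFOPT:
            break
        name = OPTION_NAMES.get(code)
        if name is not None and length == 4 and offset + 4 <= len(data):
            (found[name],) = struct.unpack_from(byte_order + "I", data, offset)
        # Skip the value plus its padding to the next 32-bit boundary.
        offset += (length + 3) & ~3
    return found
```

For example, an option list that carries only a process ID option with value 0 parses to a dictionary with `pid` set to 0 and no `tid` or `effective_pid` keys, which is the distinction the combined 8-byte variant cannot express.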

I don't think it's entirely absurd that a kernel process with PID 0 might originate network packets that may end up in a PCAPNG file. I certainly wouldn't want to make it impossible for PCAPNG to handle such cases.

guyharris commented 1 year ago

So the way Apple handles this is that they added a Process Information Block, with a local block type; a (reverse-engineered?) description of a PIB is in the Wireshark source.

Apple also has some custom options for the EPB, including options for the PIB ID of the process and the PIB ID of the "effective process".

They don't bother with thread information. Threads can have names in Darwin, but aren't guaranteed to have them. (A thread has a name if code running in that thread has called pthread_setname_np() to set the name.) They can also be created and destroyed, so a PIB can't include a thread list; I suppose if Apple wanted to provide thread information they could add a Thread Information Block.

mcr commented 1 year ago

Guy Harris wrote:

Do we expect Darwin systems to have their kernel-task write PCAPNG files?

Do we expect packets sent by or received by the kernel task to show up in pcapng files?

No, because they'd have to have some userspace context in order to open a file or socket.

guyharris commented 1 year ago

No, because they'd have to have some userspace context in order to open a file or socket.

Why would they have to have userspace context for that? Kernel-mode NFS servers don't have that issue.

IETF-OPSAWG-WG / draft-ietf-opsawg-pcap

add EPB extension epb_processid_threadid #132