Closed cpg1111 closed 1 year ago
Apologies, I was on a plane with a faulty connection, it is ready for review.
@cpg1111 thanks for this detailed one. I have a bunch of questions, some of which are just about things I don't understand.
First, is this capturing all of the data and the auxiliary data? I think so, but am unclear. I will comment inline.
Second, this is implemented only for Linux via syscall, not via mmap. For anything of scale, mmap is the preferred way of doing this. Is there an equivalent implementation for mmap?
Third, what is the equivalent for Darwin? It also has syscall and mmap, although I do not know whether the calls are the same.
I will ask the other questions inline. Maybe inline comments might make it easier?
See if this helps you.
The "official" (and pretty good) mmap guide is here.
The important part is when you open the socket, quoting from that page:
socket creation and destruction is straight forward, and is done the same way with or without PACKET_MMAP:
int fd = socket(PF_PACKET, mode, htons(ETH_P_ALL));
where mode is SOCK_RAW for the raw interface were link level information can be captured or SOCK_DGRAM for the cooked interface where link level information capture is not supported and a link level pseudo-header is provided by the kernel.
Looking at the code for go-pcap, we open the socket the same way for both mmap and non-mmap, here:
fd, err := syscall.Socket(syscall.AF_PACKET, syscall.SOCK_RAW, int(htons(syscall.ETH_P_ALL)))
Per the instructions from the kernel.org page quoted above, we use mode = syscall.SOCK_RAW, which should give us:
link level information can be captured
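For anyone following along, here is a minimal sketch of the byte-order conversion behind that protocol argument. The names `ethPAll`, `hostIsLittleEndian` and `htons` below are illustrative and are not copied from go-pcap; the value 0x0003 is ETH_P_ALL from the Linux headers. The socket call itself is left as a comment so the snippet builds on any platform.

```go
package main

import (
	"fmt"
	"unsafe"
)

// ethPAll mirrors ETH_P_ALL from <linux/if_ether.h>: match every protocol.
const ethPAll uint16 = 0x0003

// hostIsLittleEndian reports whether this machine stores the low byte first.
func hostIsLittleEndian() bool {
	var probe uint16 = 1
	return *(*byte)(unsafe.Pointer(&probe)) == 1
}

// htons converts a 16-bit value from host to network (big-endian) byte order.
// On a big-endian host the value is already in network order.
func htons(v uint16) uint16 {
	if hostIsLittleEndian() {
		return v<<8 | v>>8
	}
	return v
}

func main() {
	proto := htons(ethPAll)
	fmt.Printf("protocol argument: 0x%04x\n", proto)
	// On Linux this value would be passed as the third argument, e.g.
	// syscall.Socket(syscall.AF_PACKET, syscall.SOCK_RAW, int(proto))
}
```

The same converted value is used whether or not the ring buffer (PACKET_MMAP) is set up afterwards, which matches the quoted statement that socket creation is identical in both modes.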
If you read the kernel.org page around the structures here, you can see that tpacket_v2 added the ability to get VLAN metadata information. Since this library already uses tpacket_v3 (it was quite the effort to get it right, if I recall correctly), that should make it available to us.
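As a rough illustration of what that VLAN metadata looks like once it is read out of the header, here is a sketch that decodes the tag control information (TCI). The type `vlanInfo` and function `decodeVLAN` are hypothetical names, not part of this library. The status bits are TP_STATUS_VLAN_VALID and TP_STATUS_VLAN_TPID_VALID from `<linux/if_packet.h>`; falling back to 0x8100 when the kernel reports no TPID is an assumption modelled on what other capture libraries do.

```go
package main

import "fmt"

const (
	tpStatusVlanValid     uint32 = 0x10   // TP_STATUS_VLAN_VALID
	tpStatusVlanTpidValid uint32 = 0x40   // TP_STATUS_VLAN_TPID_VALID
	ethP8021Q             uint16 = 0x8100 // default 802.1Q TPID
)

// vlanInfo is a hypothetical holder for the decoded 802.1Q fields.
type vlanInfo struct {
	TPID         uint16 // tag protocol identifier
	Priority     uint8  // PCP, top 3 bits of the TCI
	DropEligible bool   // DEI, bit 12 of the TCI
	ID           uint16 // VLAN ID, low 12 bits of the TCI
}

// decodeVLAN interprets the tp_status, tp_vlan_tci and tp_vlan_tpid values
// of a tpacket v2/v3 header. It returns false when no VLAN tag was reported.
func decodeVLAN(status, tci uint32, tpid uint16) (vlanInfo, bool) {
	if tci == 0 && status&tpStatusVlanValid == 0 {
		return vlanInfo{}, false
	}
	if tpid == 0 && status&tpStatusVlanTpidValid == 0 {
		tpid = ethP8021Q // assumption: older kernels do not report a TPID
	}
	t := uint16(tci)
	return vlanInfo{
		TPID:         tpid,
		Priority:     uint8(t >> 13),
		DropEligible: t&0x1000 != 0,
		ID:           t & 0x0fff,
	}, true
}

func main() {
	info, ok := decodeVLAN(tpStatusVlanValid, 0x0002, 0)
	fmt.Println(info.ID, ok)
}
```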
The processing of the individual mmap packets (which is the only part you should need to worry about) is here.
It reads the link layer info here; I am not sure whether the VLAN data is in there or at the next level. But this should help.
Thanks for the references, I am actually quite familiar with those links heh. I was just under the impression I wasn't going to need to contribute that. You actually already capture this info in the syscall.Tpacket3Hdr struct and don't need any changes to do so. I am a little confused about where you're dropping this info, though. If you need me to contribute this, I'll push something up shortly.
I've added the VLAN tag to the mmap implementation as well.
I sent you more info than you needed, better than the alternative.
I don't know where I'm dropping it either. If you can figure out that bit as part of this PR, would be great.
I'd like to have a decent test harness for this one day.
That simple? Hats off.
Waiting for CI to go green. Will squash and merge for single commit. I would like to see Darwin support as well, but not going to hold it up for that.
Fwiw regarding Darwin support, this may not be an issue, as these changes were needed for Linux because the kernel deliberately strips VLAN tagging out. If I had recent Apple hardware, I'd be able to confirm, but I don't have any hardware that would run anything from the last 5 years for Darwin.
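To make the Linux side of this concrete: because the tag arrives as metadata rather than in the frame bytes, a capture library that wants the on-the-wire layout has to put the 4-byte 802.1Q tag back after the two MAC addresses. A minimal sketch, assuming the TPID and TCI have already been read from the packet header; `insertVLANTag` is a hypothetical helper and not the code in this PR.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// insertVLANTag returns a copy of an untagged Ethernet frame with a 4-byte
// 802.1Q tag (TPID followed by TCI) placed after the destination and source
// MAC addresses, ahead of the original EtherType.
func insertVLANTag(frame []byte, tpid, tci uint16) ([]byte, error) {
	const macAddrsLen = 12 // 6-byte destination + 6-byte source
	if len(frame) < macAddrsLen+2 {
		return nil, errors.New("frame too short for an Ethernet header")
	}
	var tag [4]byte
	binary.BigEndian.PutUint16(tag[0:2], tpid)
	binary.BigEndian.PutUint16(tag[2:4], tci)

	out := make([]byte, 0, len(frame)+len(tag))
	out = append(out, frame[:macAddrsLen]...)
	out = append(out, tag[:]...)
	out = append(out, frame[macAddrsLen:]...)
	return out, nil
}

func main() {
	// Untagged frame: two placeholder MACs, EtherType 0x0800, 2 payload bytes.
	frame := []byte{
		0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
		0x02, 0x00, 0x00, 0x00, 0x00, 0x01,
		0x08, 0x00, 0xde, 0xad,
	}
	tagged, err := insertVLANTag(frame, 0x8100, 0x0002)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("% x\n", tagged)
}
```

Copying into a new slice avoids mutating the ring buffer memory, at the cost of one allocation per tagged packet.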
I do, but I am not running on any VLAN. If you have any good ideas as to how to test it, I can do so.
So in my home network, I have my homelab on a separate VLAN tagged with an ID of 2 (i.e., non-work things are on the default untagged VLAN, and I just specify either eth ports or MAC addresses that should be in the second VLAN). So it would depend on your network hardware and its ability to manage VLANs. With Apple, I'm guessing you'd need to specify by MAC address and not eth port, unless you have an RJ45 adapter for it.
Can you run tcpdump, capture some packets, and paste the .pcap file in here? I should be able to use that to feed in.
Sure thing, though I'm unsure this will actually test the necessary behavior, as it's a matter of what the kernel does when seeing these packets. If you have a means of taking these packets and writing them over the wire, I suppose that'd work. All IPs within the 192.168.10.0/24 subnet should be in the tagged VLAN.
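If it helps when checking the capture, here is a rough sketch of the assertion a test could make per frame: read the VLAN ID (if any) and the IPv4 source address from the raw Ethernet bytes, then confirm that sources in 192.168.10.0/24 carry the tag. All names here (`inspectFrame`, `inTaggedSubnet`, `sampleTaggedFrame`) are hypothetical, the sample frame is hand-built rather than taken from the capture, and only a single 802.1Q tag with IPv4 is handled.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net/netip"
)

const (
	etherTypeVLAN = 0x8100
	etherTypeIPv4 = 0x0800
)

// inspectFrame returns the VLAN ID (0 if untagged) and the IPv4 source
// address of a raw Ethernet frame. ok is false for anything it cannot parse.
func inspectFrame(frame []byte) (vlanID uint16, src netip.Addr, ok bool) {
	if len(frame) < 14 {
		return 0, netip.Addr{}, false
	}
	offset := 12 // skip destination and source MAC addresses
	etherType := binary.BigEndian.Uint16(frame[offset:])
	offset += 2
	if etherType == etherTypeVLAN {
		if len(frame) < offset+4 {
			return 0, netip.Addr{}, false
		}
		vlanID = binary.BigEndian.Uint16(frame[offset:]) & 0x0fff
		etherType = binary.BigEndian.Uint16(frame[offset+2:])
		offset += 4
	}
	if etherType != etherTypeIPv4 || len(frame) < offset+20 {
		return 0, netip.Addr{}, false
	}
	// The source address sits at bytes 12..15 of the IPv4 header.
	src, ok = netip.AddrFromSlice(frame[offset+12 : offset+16])
	return vlanID, src, ok
}

// inTaggedSubnet reports whether addr is in the subnet mentioned above.
func inTaggedSubnet(addr netip.Addr) bool {
	return netip.MustParsePrefix("192.168.10.0/24").Contains(addr)
}

// sampleTaggedFrame builds a made-up frame: VLAN 2, source 192.168.10.5.
func sampleTaggedFrame() []byte {
	frame := make([]byte, 12, 38) // zeroed placeholder MAC addresses
	frame = append(frame, 0x81, 0x00, 0x00, 0x02, 0x08, 0x00)
	ip := make([]byte, 20)
	ip[0] = 0x45 // IPv4, 20-byte header
	copy(ip[12:16], []byte{192, 168, 10, 5})
	copy(ip[16:20], []byte{192, 168, 10, 1})
	return append(frame, ip...)
}

func main() {
	id, src, ok := inspectFrame(sampleTaggedFrame())
	if !ok {
		fmt.Println("could not parse frame")
		return
	}
	fmt.Println("vlan:", id, "src:", src, "in tagged subnet:", inTaggedSubnet(src))
}
```

Fed with frames read from the pcap, a mismatch (source in the subnet but VLAN ID 0) would point at the tag being dropped somewhere between capture and output.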
I see multiple pushes. Please comment here when it is ready for review.