Closed fearful-symmetry closed 2 months ago
using fentry/fexit this is (in theory) possible by reading the
iov_iter
struct. Where this gets complicated is pre-trampoline kernels---kretprobe
doesn't have access to the function args, meaning we can't get the iovec data after it's been populated.
We could probably do the set_state/get_state trick that we have in Helpers.h which maintains state between entry and ret so that we get access to the input argument pointers in the handler for ret.
There's some alternatives here, most notably using TC probes: https://ebpf-docs.dylanreimerink.nl/linux/program-type/BPF_PROG_TYPE_SCHED_CLS/ , which I haven't experimented with yet.
I believe this is where TC runs on the RX path:
https://elixir.bootlin.com/linux/v6.10.7/source/net/core/dev.c#L4049
so this is early, at L2, before deliver_skb()
and before it reaches the IP layer. We probably cannot easily map it to the receiving process without having to maintain our own bookkeeping. We could experiment with bpf_sk_lookup_tcp helper and see if its reliable - if it works as advertised it can get the listening socket just from the 5tuple so that could allow us to extract the PID easier probably. There's also bpf_sk_fullsock()
which does...something.
Another thing is at this early L2 stage we don't even know if the packet will actually reach the application, there is quite a lot of filtering and drop scenarios in between.
So I think I'm leaning towards the hooks as they are done in this PR if we can easily get packet data there and parse it. I assume we want to filter for port 53 in the kernel and avoid creating events for other UDP packets otherwise there could be a lot.
@stanek-michal so, I did some tinkering over the three-day weekend and came to a lot of the conclusions you did, namely that we could use a state getter/setter for kretprobes, and also the problem with finding the userspace pid with a TC/classifier probe.
I'm leaning towards the getter/setter with kretprobe idea for this, since it's a little simpler (we don't need some other mechanism for tracking what userland process owns the pid).
Taking this out of draft since I think this is a fairly decent framework going forward if/when we want to parse DNS packets.
This won't work for unconnected sockets, as the socket doesn't have an address, if you run the following program you will see the packets don't show up. Worth noting that unconnected udp sockets are likely more common than connected.
You can use the following program to verify:
./udpsend -c 8.8.8.8 # connected
./udpsend 8.8.8.8 # unconnected
#define _GNU_SOURCE /* program_invocation_name */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <err.h>
static void
usage(void)
{
fprintf(stderr, "usage: %s [-c] addr4",
program_invocation_short_name);
exit(1);
}
int
main(int argc, char *argv[])
{
struct sockaddr_in sin;
int sock, ch, do_connect;
ssize_t n;
uint64_t buf[] = {
0xdeadbeef, 0xdeadbeef, 0xdeadbeef, 0xdeadbeef,
0xdeadbeef, 0xdeadbeef, 0xdeadbeef, 0xdeadbeef
};
do_connect = 0;
while ((ch = getopt(argc, argv, "ch")) != -1) {
switch (ch) {
case 'c':
do_connect = 1;
break;
default:
usage();
}
}
argc -= optind;
argv += optind;
bzero(&sin, sizeof(sin));
if (inet_aton(argv[0], &sin.sin_addr) == 0)
errx(1, "inet_aton(%s)", argv[0]);
sin.sin_port = htons(53);
sin.sin_family = AF_INET;
if ((sock = socket(AF_INET, SOCK_DGRAM, 0)) == -1)
err(1, "socket");
if (do_connect) {
if (connect(sock, (struct sockaddr *)&sin, sizeof(sin)) == -1)
err(1, "connect");
n = write(sock, buf, sizeof(buf));
if (n == -1)
err(1, "sendto");
else if (n != sizeof(buf))
errx(1, "sendto: shortcount");
return (0);
}
n = sendto(sock, buf, sizeof(buf), 0,
(struct sockaddr *)&sin, sizeof(sin));
if (n == -1)
err(1, "sendto");
else if (n != sizeof(buf))
errx(1, "sendto: shortcount");
return (0);
}
As I mentioned to you yesterday, I think this is still not a good place to tap, I'll do a proper review later tonight and have more comments, but if you use the program above with events trace you can spot the issue.
Alright, so, reverted the commit that swapped over to the *skb*
methods. This is back to using udp_sendmsg
and udp_recvmsg
. Will continue to investigate alternatives.
Alright, tested on 6.10 and 5.15...
I think we have settled on the skb functions, which means the previously excellent description needs some love.
I'll have a look on this tonight
So, did a bit more testing with large DNS queries using this:
dig +noall +answer +comments @166.84.7.99 512.size.dns.netmeister.org a
Haven't noticed any weird paging behavior.
@haesbaert totally forgot about the earlier udp send example you wrote, so I switched to that.
I haven't tested the last iteration but looks good to me, good work!
see: https://datatracker.ietf.org/doc/html/rfc1035#autoid-40
This PR adds basic support for DNS monitoring, using the
ip_send_skb
andskb_consume_udp
functions in the kernel networking stack. The logic here is very basic, and includes no parsing of actual DNS data; after spending the better part of the day reading RFCs and looking at other implementations, I've decided that parsing actual DNS requests/responses is complex enough (and risky enough) that we should probably use an external library for parsing DNS data. It's one of those "getting it 80% working is 20% of the code" protocols.As a result of this, this PR just creates a DNS packet that just provides basic network metadata and and a buffer for the packet data.