evilsocket / opensnitch

OpenSnitch is a GNU/Linux interactive application firewall inspired by Little Snitch.
GNU General Public License v3.0

an alternate implementation #284

Closed nathants closed 4 years ago

nathants commented 4 years ago

big thanks to @evilsocket for brilliant use of libnetfilter_queue. ever since moving off macos i've missed littlesnitch and had no idea where to start until opensnitch dropped. thank you!

opensnitch didn't quite work for my use case, so i forked and rebuilt it as tinysnitch. it monitors both inbound and outbound connections, is more performant because of bcc/bpftrace, and has a smaller code footprint.

a few lessons learned: the ceiling of throughput with libnetfilter_queue is high, the biggest bottleneck is looking up pid/path/args, moving off of /proc is likely a good idea, and the number of go workers has no impact on performance.

hopefully some of these learnings can help people hacking on the epic opensnitch! i've been using tinysnitch daily for six months now and it's feeling pretty solid, so i wanted to give back to the opensnitch community.

evilsocket commented 4 years ago

> is more performant because of bcc/bpftrace

Do you have actual benchmarks to support this statement? If a python based implementation, with the GIL and everything, is performing better than Go software, that'd be impressive.

gustavo-iniguez-goya commented 4 years ago

Processing /proc is indeed costly: 20-60ms (even 80ms) per connection, every time. I narrowed it down to μs and ns by caching some parameters here and there, and the improvement is noticeable, so it wouldn't surprise me if using bcc/bpftrace is more performant, even with python. But in any case, some stats would be nice :)
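A minimal sketch of the caching idea, in Python for brevity. It is not the opensnitch implementation: `ProcCache`, its `ttl` parameter and the injected `resolve` callable are hypothetical names. The cache is keyed by pid and keeps the executable path and arguments for a short time, so repeated connections from the same process skip the expensive /proc read.

```python
import os
import time


def read_proc(pid):
    """Read executable path and argv for a pid from /proc (Linux only)."""
    path = os.readlink(f"/proc/{pid}/exe")
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        # cmdline is NUL-separated; drop empty fields
        args = [a.decode(errors="replace") for a in f.read().split(b"\0") if a]
    return path, args


class ProcCache:
    """Hypothetical pid -> (path, args) cache with a time-to-live."""

    def __init__(self, resolve=read_proc, ttl=2.0, clock=time.monotonic):
        self.resolve = resolve  # slow lookup, only called on a miss
        self.ttl = ttl          # seconds an entry stays valid (assumed value)
        self.clock = clock
        self.entries = {}       # pid -> (expires_at, value)

    def lookup(self, pid):
        now = self.clock()
        hit = self.entries.get(pid)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = self.resolve(pid)
        self.entries[pid] = (now + self.ttl, value)
        return value
```

The short time-to-live is a deliberate trade-off in this sketch: pids get reused, so a long-lived entry could attribute a connection to the wrong binary.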

> a fallback to ss is used when bpftrace misses, which happens occasionally.

what connections does your implementation miss? and what's the cost of falling back to ss? have you measured it?

By the way, does tinysnitch (bpf) intercept udp connections correctly? As far as I can tell, most udp connections are not written to /proc/net/udp or /proc/net/udplite (dhcp, ntp, smb, ...). You can also test it by running transmission, for example.
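One way to check this observation is to compare the source port of a queued UDP packet against the contents of /proc/net/udp. The sketch below parses that table; it assumes IPv4 and a little-endian host (the kernel prints the address as a hex word in host byte order), and the function names are made up for illustration.

```python
import socket
import struct


def parse_addr(field):
    """Decode 'HEXIP:HEXPORT' as printed by the kernel on a little-endian host."""
    ip_hex, port_hex = field.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def parse_udp_table(text):
    """Return a list of (local, remote, uid, inode) from /proc/net/udp content."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 10:
            continue
        rows.append((parse_addr(fields[1]), parse_addr(fields[2]),
                     int(fields[7]), int(fields[9])))
    return rows


def has_local_port(text, port):
    """True if any socket in the table is bound to the given local port."""
    return any(local[1] == port for local, _, _, _ in parse_udp_table(text))
```

Calling `has_local_port(open("/proc/net/udp").read(), sport)` for each intercepted packet would show how often the table actually contains the socket.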

Congrats on taking the step to offer us another option :)

evilsocket commented 4 years ago

@gustavo-iniguez-goya agreed, but then the trick is the cache, which doesn't have anything to do with the language or kernel extension being used.

evilsocket commented 4 years ago

i'm just trying to understand because i'm planning a rewrite/refactor of opensnitch and considering all the options

gustavo-iniguez-goya commented 4 years ago

a cache in this case was just a workaround for parsing /proc. Also, /proc/net is not trustworthy, because many rootkits tend to place a hook before the connection is written there. So I would not rely on /proc. I would also prefer go over python, but just as a personal preference. It would be nice to compare it in both languages.

Thus there are 2 mechanisms left: the bpftrace/bcc family and linux audit.

ebpf/bcc/xdp are supported from kernel 3.15 onwards https://github.com/iovisor/bcc/blob/master/docs/kernel-versions.md (BPF attached to sockets 3.19, XDP 4.8). Red Hat backported it to 3.10 because RHEL 7.x uses it, but I have no idea if all the features have been ported. The PoC @p- posted looked very promising, and it looks like the whole *nix community is turning towards ebpf, but I think that it wouldn't work in many environments. Some kernels are not shipped with all the features enabled, but that will probably change in the near future.

On the other hand, there's the old, well-known linux audit (>= 2.6.x). It only reports events and does not block connections, but I guess that it could be easily linked to nfqueue. The only caveat I see is that an auditd plugin would have to be created, because only 1 program can be connected to the audit subsystem via netlink. Some have implemented their own daemon (https://github.com/slackhq/go-audit), which could ease the implementation. But it would also interfere with some setups and their rules (fedora, redhat, suse maybe). The performance of this option intrigues me, especially if linked to nfqueue, but it will probably be more performant than parsing /proc.

nathants commented 4 years ago

@evilsocket here are some rough benchmarks. it's entirely possible that i'm simply using opensnitch wrong, but with it i get about the same throughput i get if i run tinysnitch always falling back to ss: hundreds of requests per second.

simplified pypy or golang can do 9-11k/sec, and tinysnitch can do 7-8k/sec while getting proc info from bcc/bpftrace.

@gustavo-iniguez-goya using only ss, throughput is awful: hundreds per second. luckily it's not often used. the misses are caused by bcc/bpftrace missing sometimes, and i have no idea how/why it happens. a better fallback than a subprocess call to ss could certainly be implemented, raising that worst-case throughput. even ss can sometimes miss, which results in a prompt without proc information. in practice it hasn't been enough of an issue for me to warrant much effort.
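a rough sketch of that fast-path / slow-path lookup, not the actual tinysnitch code. `primary` stands in for whatever table the bcc/bpftrace probes fill, and `parse_ss`, `run_ss` and `resolve` are hypothetical names. it assumes `ss -H -n -t -u -p` prints the columns Netid, State, Recv-Q, Send-Q, Local, Peer, Process.

```python
import re
import subprocess

# Matches the process column, e.g. users:(("curl",pid=4242,fd=5))
USERS_RE = re.compile(r'users:\(\("(?P<name>[^"]+)",pid=(?P<pid>\d+),fd=\d+')


def parse_ss(output):
    """Map local port -> (process name, pid) from `ss -H -n -t -u -p` output."""
    table = {}
    for line in output.splitlines():
        fields = line.split()
        match = USERS_RE.search(line)
        if len(fields) < 6 or match is None:
            continue  # no process info on this line
        try:
            # Assumed column order: Netid State Recv-Q Send-Q Local Peer Process
            port = int(fields[4].rsplit(":", 1)[1])
        except (IndexError, ValueError):
            continue
        table[port] = (match["name"], int(match["pid"]))
    return table


def run_ss():
    """Slow path: one subprocess call per miss."""
    result = subprocess.run(["ss", "-H", "-n", "-t", "-u", "-p"],
                            capture_output=True, text=True, check=True)
    return result.stdout


def resolve(port, primary, ss_output=run_ss):
    """Try the fast table first, then ss; None means prompt without proc info."""
    hit = primary.get(port)
    if hit is not None:
        return hit
    return parse_ss(ss_output()).get(port)
```

spawning a process per miss is what makes the worst case slow here; replacing `run_ss` with an in-process lookup would raise that floor without touching the callers.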

@evilsocket golang is undoubtedly faster than pypy, but in this case i believe the bottleneck to be proc lookup. if a way were found to do lookups more efficiently, a golang implementation could certainly be faster. for me, anything better than 5k/sec is fine, and anything less than 1k/sec is not going to work.

a few additional differences from opensnitch i forgot to mention:

nathants commented 4 years ago

i updated the benchmark to separate inbound and outbound connections, as inbound doesn't use bpftrace to resolve pid/path/args. turns out they are very similar. this makes me think that it's the netstat use not the /proc use that is the bottleneck in opensnitch. the same pid/path/args may make many connections, so churn on netfilter will be higher than on /proc.

the bcc/bpftrace use is all as subprocesses parsing their stdout, so it might be possible to integrate them upstream in a similar manner.
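a minimal sketch of the "subprocess and parse stdout" pattern. the line format (`pid sport comm`, whitespace separated) is hypothetical and is not what the tinysnitch probes emit, and `DEMO` is a stand-in command so the sketch runs without bpftrace installed.

```python
import subprocess
import sys


def parse_event(line):
    """Parse one hypothetical 'pid sport comm' line; return None for noise."""
    parts = line.split(None, 2)
    if len(parts) != 3 or not parts[0].isdigit() or not parts[1].isdigit():
        return None
    return int(parts[0]), int(parts[1]), parts[2].strip()


def stream_events(argv):
    """Run a tracer as a subprocess and yield parsed events as lines arrive."""
    with subprocess.Popen(argv, stdout=subprocess.PIPE, text=True, bufsize=1) as proc:
        for line in proc.stdout:
            event = parse_event(line)
            if event is not None:
                yield event


# Stand-in for a real tracer: prints one banner line and one event line.
DEMO = [sys.executable, "-c", "print('banner'); print('4242 44522 curl')"]
```

in a real integration `argv` would be a bcc or bpftrace invocation, and each event would update a table keyed by source port that the packet handler consults.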

nathants commented 4 years ago

@gustavo-iniguez-goya udp is handled, both inbound and outbound. almost every outbound connection is preceded by a udp dns request.