Experience with "unknown" connections.

Ph0rk0z commented 4 years ago

Ubuntu/mint 18.04

Recently upgraded to RC6. In default daemon config I changed the rule to deny, of course. Also enabled intercept unknown.

My openvpn manager, qomui, caused a popup from unknown for the VPN connection. There was also a popup from unknown for SSDP (port 1900), which is probably chromium. Both were running while I updated/compiled/restarted the service.

When I configured block + intercept unknown: false these would fail silently. With the default config they go through, and I was unaware of the situation until now.

This is good news that we are catching all connections now but maybe defaults are very permissive. Also created permanent rules now to block SSDP/allow vpn from "unknown"; bad idea? Either way, great job on the continued development!

gustavo-iniguez-goya commented 4 years ago

This is good news that we are catching all connections now but maybe defaults are very permissive.

Yes, I set it permissive to avoid problems, because if the GUI is not installed there may be network problems with some app if we block it. And if there're problems just after installing it I know what people will say -> "opensnitch sucks".

Also created permanent rules now to block SSDP/allow vpn from "unknown"; bad idea?

As long as you block by IP + dst port + user, knowing what you're doing, I guess it'll be ok. For example, when you connect/disconnect a USB device, colord broadcasts packets to the network. Apparently for discovering network scanners. If that feature can't be disabled or you don't use it, you can block it.

Another program that broadcasts packets is minissdpd, to 239.255.255.250:1900. I personally don't use it, and I didn't even know that it was installed or that it broadcast packets.

But people should take into account that it may break some service.

Either way, great job on the continued development!

Thanks to you for reviewing commits and changes, it was very helpful! :)

Ph0rk0z commented 4 years ago

Ha, I just found out about colord and some braille keyboard program that gets installed.

But also some weird stuff: Dnscrypt proxy connects on its own user and just allowing the bin didn't cover it. With chrome open and vpn reconnecting I got unknown connecting to api.twitter.com:443 which I suspect is chromium since I had a twitter tab in the background. But how isn't it caught as that?

For this mode maybe we need a quick allow/deny button? Some of it I don't want to create a rule for (or can't, eg port 443) but also don't want to have the connections silently fail or connect. Kind of an odd problem, I'll keep using it and see how often requests come up.

gustavo-iniguez-goya commented 4 years ago

But also some weird stuff: Dnscrypt proxy connects on its own user and just allowing the bin didn't cover it.

Well, it's working for me on 2 computers. But for example when you connect/disconnect from the network, or when you switch on the computer, many times it doesn't obtain the process name of a connection. I haven't looked into that problem yet, mainly because once you get rid of those connections it starts working fine.

With chrome open and vpn reconnecting I got unknown connecting to api.twitter.com:443 which I suspect is chromium since I had a twitter tab in the background. But how isn't it caught as that?

Chrome and other software create connections from threads, which have a different PID from their parent but are not new processes, and as such the information is stored under /proc//tasks/. Currently we're not getting info from there, so that behaviour could be due to that. I added it, but parsing all that info is quite costly and, as far as I can tell, it's rare to find the sockets (inode) stored there.
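A minimal sketch of that lookup, not the daemon's actual code: it collects the socket inodes of a process from its own fd directory and from the fd directory of each thread. It assumes the per-thread directories live at /proc/<pid>/task/<tid>/fd, as on current Linux kernels; `parse_socket_inode`, `socket_inodes` and the `proc_root` parameter are hypothetical names used only for illustration.

```python
import os
import re

# fd symlinks that point at sockets look like "socket:[44066374]"
SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def parse_socket_inode(link_target):
    """Return the inode from a 'socket:[N]' link target, or None otherwise."""
    match = SOCKET_LINK.match(link_target)
    return int(match.group(1)) if match else None


def socket_inodes(pid, proc_root="/proc"):
    """Collect socket inodes from the process fd dir and every thread fd dir."""
    base = os.path.join(proc_root, str(pid))
    fd_dirs = [os.path.join(base, "fd")]
    task_dir = os.path.join(base, "task")
    try:
        fd_dirs += [os.path.join(task_dir, tid, "fd") for tid in os.listdir(task_dir)]
    except OSError:
        pass  # process already gone, or no permission to read it
    inodes = set()
    for fd_dir in fd_dirs:
        try:
            names = os.listdir(fd_dir)
        except OSError:
            continue
        for name in names:
            try:
                target = os.readlink(os.path.join(fd_dir, name))
            except OSError:
                continue  # fd was closed between listdir and readlink
            inode = parse_socket_inode(target)
            if inode is not None:
                inodes.add(inode)
    return inodes
```

Walking every thread directory multiplies the number of readlink calls, which is one way to see why the extra scan is costly.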

In these situations, what I do is allow or deny connections for 30s to the port/IP which is causing the problem (a generic temporal rule). Once those connections are destroyed/closed all starts working again.

Probably you're using just proc, but using audit or ftrace should be more accurate (-process-monitor-method audit). If you want to try audit and see if works better:

apt install auditd
./opensnitchd -rules-path /etc/opensnitchd/rules -ui-socket unix:///tmp/osui.sock -process-monitor-method audit -debug

The solution, I think, would be to add a BPF program to intercept all connections correctly. I'll end up adding it I guess.

Ph0rk0z commented 4 years ago

I'll give it a go. But yea, it's come up more often. Get unknown to port 67, etc and a lot of popups that I can't make rules for. The only other option is blanket deny/allow of those connections.

Ph0rk0z commented 4 years ago

Today I have tried using ftrace and it's similar in behavior. Child processes/threads aren't being picked up.

I hesitate to use auditD because from what I see it is log based and I don't want to write to my ssd constantly.

gustavo-iniguez-goya commented 4 years ago

You can disable the logs; it only reads events from the unix socket /var/run/audispd_events. In /etc/audit/auditd.conf:

write_logs = no

logs keep appearing in journal though, I need to figure out how to disable it as well.

Ph0rk0z commented 4 years ago

I have the journal logging to ram only so that would work if it doesn't write its own file.

Ph0rk0z commented 4 years ago

Tried auditd... no change, they still come up unknown.

gustavo-iniguez-goya commented 4 years ago

:( it's hard to debug it without traces. Maybe I should add them for DEBUG level at least. It'd provide helpful insights. In any case, verify please that it's working:

relaunch the daemon with -process-monitor-method audit and -debug :

1) two rules with key=opensnitch should have been added:

# auditctl -l
-a always,exit -F arch=b32 -S socketcall -F ppid!=17304 -F pid!=17304 -F a0=0x1 -F key=opensnitch
-a always,exit -F arch=b64 -S socket,connect -F ppid!=17304 -F pid!=17304 -F key=opensnitch

2) see if this trace appears:

"PID found via audit event"

Ph0rk0z commented 4 years ago

Added those rules, enabled audit dispatcher unix socket plugin. Do not get an events file in /var/run and that is what I see in the opensnitch log. It goes back to proc.

gustavo-iniguez-goya commented 4 years ago

ok, I see. Just to be sure that auditd + af_unix is running on your machine correctly:

stop opensnitchd
check that there're no audit rules (auditctl -l)
add audit rules manually:

auditctl -a always,exit -F arch=b64 -S socket,connect -F key=opensnitch
connect to the af_unix socket and see if there're audit events (/var/run/audispd_events should exist):

socat - UNIX-CONNECT:/var/run/audispd_events

you should see events like this one:

type=SYSCALL msg=audit(1584570207.864:2824263): arch=c000003e syscall=42 success=no exit=-115 a0 (...)
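For reading such events, here is a small sketch that splits an audit record into key=value fields and checks whether it is a connect() call. It relies on two facts about x86_64: arch=c000003e identifies that architecture and syscall number 42 is connect. The function names are hypothetical, and this is not how the daemon parses events.

```python
import re

# key=value pairs; values are either quoted strings or runs of non-space chars
AUDIT_FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_audit_event(line):
    """Split one audit record into a dict of its key=value fields."""
    return {key: value.strip('"') for key, value in AUDIT_FIELD.findall(line)}


def is_x86_64_connect(event):
    """True for a connect() syscall record on x86_64 (arch c000003e, syscall 42)."""
    return event.get("arch") == "c000003e" and event.get("syscall") == "42"
```

In the sample above, exit=-115 corresponds to EINPROGRESS on Linux, which is what a non-blocking connect() returns while the handshake is still pending.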

On the other hand, I've found a way to reproduce this behaviour by removing the default gateway and adding a route through a non-existent IP:

ip r del default
ip r add default via 192.168.123

So I'm going to start debugging it to see if I can fix it somehow. A really interesting bug!

Ph0rk0z commented 4 years ago

I couldn't get audit to create the socket so I compiled it from source. Now the dispatcher runs and opensnitch doesn't have the error anymore. It claims to find processes via audit and I see the messages tagged opensnitch in journal.

Unfortunately when I connect with socat I don't see any messages which is opposite to the opensnitch log. I also still see the same "unknown" processes just like before.

Another issue after suspend is that I get a queue of firewall popups, eg user dnscrypt connecting to port 443... but whether I allow or deny, the popup repeats until the queue is exhausted. Not sure how it works with conflicting rules, eg if I deny one then allow another.

gustavo-iniguez-goya commented 4 years ago

Unfortunately when I connect with socat I don't see any messages which is opposite to the opensnitch log. I also still see the same "unknown" processes just like before.

While the daemon is running you won't be able to read messages from the socket with socat. I'll add more debug messages to see if it helps to understand what's going on your system.

Another issue after suspend is that I get a queue of firewall popups, eg user dnscrypt connecting to port 443... but whether I allow or deny, the popup repeats until the queue is exhausted. Not sure how it works with conflicting rules, eg if I deny one then allow another.

Yes, that's quite annoying. That's another thing to debug, but maybe both problems are related.

Ph0rk0z commented 4 years ago

I did leave the socket connected and got some messages there, probably after coming out of suspend.

On resume it's getting pretty crazy. With a default rule of deny, stuff either gets denied if I'm not watching, or it fills up the netfilter queue and causes dropped packets. So almost the only choice is to allow the unknowns to complete.

gustavo-iniguez-goya commented 4 years ago

After debugging this issue, it looks like most of the "unknown" connections are those which are closed or reset due to loss of connectivity (those with the FIN or RST flags set). Probably the connection gets closed and the thread/process exits in order to reestablish it, so when we go looking for it in procfs, the process is already gone.

I'm starting to understand what's going on, but I don't see a solution for now. I'll keep debugging it.

FmT0 commented 4 years ago

I have a lot of these "unknown" connections with the transmission p2p torrent software, as I have to put the allow flag in the default-config if I want to use it (maybe ipv6 related?). Hope it helps you debugging.

gustavo-iniguez-goya commented 4 years ago

yeah, transmission is the kryptonite of this working mode :(

I've got a workaround for these situations. At least for Transmission, it greatly reduces the number of dialogs to allow/deny connections. I need to test it further and see if it also works after coming back from hibernate/suspend or a gateway change.

gustavo-iniguez-goya commented 4 years ago

ok, brief update on these issues.

The problem with transmission is that it works as a server:

UNCONN     768      0      0.0.0.0:51413       0.0.0.0:*    users:(("transmission-gt",pid=15852,fd=15))

15275 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 17
15275 fcntl(17, F_GETFL)                = 0x2 (flags O_RDWR)
15275 fcntl(17, F_SETFL, O_RDWR|O_NONBLOCK) = 0
15275 setsockopt(17, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
15275 setsockopt(17, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
15275 bind(17, {sa_family=AF_INET, sin_port=htons(51413), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
15275 listen(17, 128)

So when a connection is established (well, a UDP reply I guess), netfilter redirects us the connection with the following details:

51413:192.168.0.2 -> xxx.torrent.xx:1234

However, if you ask the kernel for that specific connection (srcport+srcip+dstip+dstport), it returns nothing. It doesn't exist in /proc/net/tcp|udp either.
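To make the failed lookup concrete, here is a sketch of a 4-tuple search over rows of /proc/net/tcp or /proc/net/udp (IPv4 only). It assumes the usual column layout, where the local and remote endpoints are the 2nd and 3rd whitespace-separated fields and the inode is the 10th, and it assumes a little-endian host for decoding the hex address. The function names are made up for the example.

```python
import socket
import struct


def decode_ipv4_endpoint(hex_endpoint):
    """Decode 'AABBCCDD:PPPP' as found in /proc/net/tcp into (ip, port)."""
    hex_ip, hex_port = hex_endpoint.split(":")
    # Assumption: the address is printed in little-endian host byte order (x86).
    ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
    return ip, int(hex_port, 16)


def parse_proc_net_row(row):
    """Return (local, remote, inode) for one data row (not the header line)."""
    fields = row.split()
    local = decode_ipv4_endpoint(fields[1])
    remote = decode_ipv4_endpoint(fields[2])
    return local, remote, int(fields[9])


def find_inode(rows, local, remote):
    """Return the inode of the row matching the full 4-tuple, or None."""
    for row in rows:
        row_local, row_remote, inode = parse_proc_net_row(row)
        if (row_local, row_remote) == (local, remote):
            return inode
    return None
```

A socket that was only bound, like the Transmission one above, shows up with a remote endpoint of 0.0.0.0:0, so a search for the full 4-tuple of the queued packet comes back with None.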

A similar problem exists with ntpd, for example:

new connection => 123:192.168.1.109 -> 116.203.151.74:123

kernel responses querying by port 123:

netlink response 123:0.0.0.0 -> 0.0.0.0:0 inode: 44066374 state: close
netlink response 123:127.0.0.1 -> 0.0.0.0:0 inode: 44066377 state: close
netlink response 123:192.168.1.109 -> 0.0.0.0:0 inode: 44066379 state: close

In these scenarios I can dump the kernel connections by source port, and get the connection + inode with high reliability, because in that precise moment, the source port is unique (it should...). That approach indeed solves the issues with transmission and other services like colord or dleyna-server-service.

However, in some (edge?) cases, querying the kernel for connections by source port, returns several connections with the same source port:

new connection => 41908:127.0.0.1 -> 127.0.0.1:44444

netlink response: 41908:127.0.0.1 -> 127.0.0.1:44444 inode: 44624393 - multicast: false unspecified: false linklocalunicast: false ifaceLocalMulticast: false GlobalUni: false
netlink response: 41908:192.168.1.109 -> 216.58.201.174:443 inode: 44622044 - multicast: false unspecified: false linklocalunicast: false ifaceLocalMulticast: false GlobalUni: true

In some cases this causes incorrect information (the process name) to be displayed to the user.
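One possible way to narrow down several candidates that share a source port is sketched below. This is an illustrative heuristic, not what the daemon does: prefer an exact 4-tuple match, then fall back to an unconnected socket bound to the packet's source IP, then to one bound to 0.0.0.0, and refuse to guess otherwise. `KernelSocket` and `pick_socket` are hypothetical names, and only IPv4 is handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KernelSocket:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    inode: int


def pick_socket(candidates, src_ip, src_port, dst_ip, dst_port):
    """Choose the kernel socket for a queued packet, or None if ambiguous."""
    same_port = [c for c in candidates if c.src_port == src_port]
    # 1. An exact 4-tuple match is unambiguous.
    for c in same_port:
        if (c.src_ip, c.dst_ip, c.dst_port) == (src_ip, dst_ip, dst_port):
            return c
    # 2. Otherwise look for a single unconnected socket (remote port 0),
    #    bound first to the packet's own source IP, then to the wildcard.
    for wanted_ip in (src_ip, "0.0.0.0"):
        matches = [c for c in same_port
                   if c.src_ip == wanted_ip and c.dst_port == 0]
        if len(matches) == 1:
            return matches[0]
    # 3. Several candidates and nothing conclusive: do not guess.
    return None
```

With the two entries on port 41908 listed above, the exact match selects inode 44624393, so the unrelated connection to port 443 is never considered.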

But all in all, I think it's working better than before.

gustavo-iniguez-goya commented 4 years ago

Regarding the problems @Ph0rk0z was having, I think I can tune the behaviour a little bit.

Now that I can get inodes more reliably, if those outgoing connections don't exist in the kernel, we can drop or accept them (by using the config option: default_action). Those connections might be FIN or RST requests to close connections (for example when you lose network connectivity or when you return from hibernation/suspend), or raw socket connections like the ones nmap creates.

The only "unknown" connections that we'd show to the user would be the ones with unknown PIDs.

The downside is that it'll block or accept some valid connections without prompting the user potentially breaking some app (nmap, some colord broadcasts, ...).
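The proposed behaviour can be sketched as a small decision function. This is only an illustration of the logic described above, for packets that matched no user rule; the `Verdict` names and the "allow"/"deny" strings for the default action are assumptions, not the daemon's real types.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"
    DROP = "drop"
    ASK_UNKNOWN = "ask (unknown process)"
    ASK_PROCESS = "ask (known process)"


def decide(found_in_kernel, pid, default_action):
    """Verdict for a queued packet that matched no existing rule."""
    if not found_in_kernel:
        # No matching socket in the kernel tables (FIN/RST leftovers,
        # raw sockets): apply the configured default without prompting.
        return Verdict.ACCEPT if default_action == "allow" else Verdict.DROP
    if pid is None:
        # The socket exists but no process could be matched to it:
        # this is the only case still shown as "unknown".
        return Verdict.ASK_UNKNOWN
    return Verdict.ASK_PROCESS
```

The downside mentioned above corresponds to the first branch: anything that never appears in the kernel tables is decided silently.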

Ph0rk0z commented 4 years ago

Well I dunno if that change will make a difference or not. I already tried the latest commit. Yes I get less popups but mostly I think it's from creating port based rules.

On that topic, my wife keeps hitting deny for session to port 443, killing browsing. I then have to re-start opensnitch and the gui. Wish there was a way to remove/disable rules from the GUI.

Now having tried 3 machines, all of them get "unknown" popups but the desktops don't get it as bad due to stable network. I also had "variety" desktop changer freely download images but maybe it uses wget? I never got notified.

rules.csv.txt

FmT0 commented 4 years ago

First, thank you for your hard work. As you said, some server-style connections are a problem (samba, transmission, dns).

With the latest commit, I've tried "Intercept Unknown" again with the proc method. Compared with regular allow, it needs some new rules:

ff02::1:2 port 547 for getting ipv6 (DHCPv6)
192.168.* port 137 to get samba browsing right. smb by IP address doesn't require this, but the file manager will always ask for it.
transmission still needs a rule based on the tracker port. DHT needs another port too.
Port 1900 to 239.255.255.250 to open a port on the gateway.

If I deny, for example, samba port 137 (no rules), I get more unknown connections to the dns server on port 53 (it doesn't see that it's a dns request), so in the end I have to make a rule for DNS requests.

I'm not using avahi/colord, so I don't have all the other broadcasts, which I feel makes things easier. It would probably be great to add zones like intranet, multicast, broadcast and gateway to make allow/deny rules simpler.
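As a rough idea of what such zones could look like, here is a sketch built on Python's standard ipaddress module. The zone names and the `zone_of` function are hypothetical. A "gateway" zone and per-subnet broadcast addresses (such as 192.168.1.255) are left out because they need the routing table and netmask, which this example does not read.

```python
import ipaddress


def zone_of(address):
    """Map a destination address to a coarse, illustrative zone name."""
    ip = ipaddress.ip_address(address)
    if ip.is_unspecified:
        return "unspecified"
    if ip.is_loopback:
        return "loopback"
    if ip.is_multicast:
        return "multicast"
    # Limited broadcast only; checked before is_private, which also covers it.
    if ip.version == 4 and ip == ipaddress.IPv4Address("255.255.255.255"):
        return "broadcast"
    if ip.is_link_local:
        return "link-local"
    if ip.is_private:
        return "intranet"
    return "internet"
```

With this mapping, both the SSDP address 239.255.255.250 and the DHCPv6 address ff02::1:2 fall into "multicast", so a single zone rule could replace the separate per-address rules listed above.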

gustavo-iniguez-goya commented 4 years ago

Please, try latest changes and let me know if it makes any difference in those scenarios. I've also added more debug messages in order to help debugging these issues.

Run it with -debug and upload the logs detailing the problem. Hopefully it'll provide some insights.

By the way, there's a known bug: from time to time some outgoing connection may be attributed to the wrong process.

I'll take a look at this problem this week.

I have some (more) questions for you guys: what kernel do you use? do you use any containerized mechanism such as firejail? dnscrypt? tor? vpns?

With some VPNs I'm noticing that we interfere and it gets disconnected. Only if the traffic is queued until a connection dialog is closed.

Ph0rk0z commented 4 years ago

- 5.3 Kernel from 18.0.4 HWE edge
- Firejail mainly to block internet so no issues there
- dnscrypt as the resolver
- qomui for VPN which uses openvpn that never gets attributed to any process

I will try to log some more I guess, the problem with the log is it will contain all the IPs I connect to.

gustavo-iniguez-goya commented 4 years ago

ok, yes, you're right about the IPs. It's hard to strip all the IPs and domains. You can email me the logs if you prefer.

Thank you for the info!

Ph0rk0z commented 4 years ago

The other thing is I didn't find anything when trying previously. Just showed what action I took when it found the unknown connection. I was able to deduce it was using audit and that it saw connections but not why sub processes were coming back unknown.

I have to scrape up the other errors I notice in journal and syslog, eg some IPTABLES rule that fails and when netqueue filter is full.

FmT0 commented 4 years ago

5.5.13 manjaro kernel, nothing containerized.

Since the last commit, everything seems to be detected as it should; no need to add new rules for DHCPv6 (now detected as network manager), samba, or Transmission (tracker/DHT/port mapping).

The IMAP connection timeout from thunderbird is not shown anymore (as it should be). And unknown connections to 443 do not seem to happen anymore after suspend.

The only remaining unknown connection is to detectportal.firefox.com when I wake up from suspend, so it's not a big deal.

The only change from yesterday is that transmission is crashing with the latest commit, with an " invalid %N$ use detected Abandon (core dumped) " error a few seconds after a torrent is started. Before, it was crashing with the same error only when I set "InterceptUnknown": true. Now it crashes with both options. It's not a big problem, since stable transmission doesn't crash in either mode, but it should be noted.

Putting "InterceptUnknown": false with "defaultaction" deny should work flawlessly now (needs some testing), so thank you for your great work so far :+1:

gustavo-iniguez-goya commented 4 years ago

That is really great news, FmT0! Let's see how it goes.

Regarding the transmission segfault, maybe you can report it to them. I guess we're causing some unexpected behavior when closing connections.

Ph0rk0z commented 4 years ago

Loaded the latest commit, deleted all rules. I can also say it's gotten better on the "unknown" front.

opensnitchd.log

I have not had one yet and my VPN is now detected as openvpn.

This is the error I got previously and it convinced me to delete my rules... I haven't seen it again. log.txt

One thing of note is that I saw my opensnitchd log was 32MB after a few days of running.. and that was not in debug mode. Can that be reduced?

I also saw that python, for instance, will connect for bin X, Y and Z. Will a rule based on process only apply to python itself, or to python + the thing it's running?

Overall great improvement. I think it caught "variety" using python too.

gustavo-iniguez-goya commented 4 years ago

This is the error I got previously and it convinced me to delete my rules... I haven't seen it again. log.txt

That's not an error. I mean, it fails because it can't delete the rules prior to run (just in case we crashed and left the rules loaded), but it adds the rules correctly. I'll silence that log in the future.

One thing of note is that I saw my opensnitchd log was 32MB after a few days of running.. and that was not in debug mode. Can that be reduced?

That's a lot. By default it should be working on INFO level if no logging parameters are added to the command line (-important, -debug, -warning, etc..); can you check if you have any log option added to the command line in the .service file? (/etc/systemd/system/multi-user.target.wants/opensnitchd.service, the line ExecStart=/usr/... -).

Also, the logrotate file is configured to rotate files weekly or if the log grows to 50MB, so maybe you can tune it.

I also saw that python, for instance, will connect for bin X, Y and Z. Will a rule based on process only apply to python itself, or to python + the thing it's running?

Unfortunately no :( . I ran into this problem a few days ago. Maybe we should have another option to filter by command line, and that way we could filter by Proc Path "python" AND Proc Arguments "/path/to/file.py".
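A sketch of what matching on both fields could look like, assuming the arguments are read from /proc/<pid>/cmdline (which is NUL-separated) and that shell-style wildcards are acceptable in patterns. The function names and the pattern semantics are hypothetical, not the rule format opensnitch uses.

```python
from fnmatch import fnmatchcase


def parse_cmdline(raw):
    """Split the NUL-separated bytes of /proc/<pid>/cmdline into arguments.

    Empty fields (including the trailing one) are dropped in this sketch.
    """
    return [arg.decode("utf-8", "replace") for arg in raw.split(b"\0") if arg]


def rule_matches(path_pattern, args_pattern, exe_path, argv):
    """True when the binary path AND the joined arguments both match."""
    arguments = " ".join(argv[1:])
    return fnmatchcase(exe_path, path_pattern) and fnmatchcase(arguments, args_pattern)
```

A rule such as path "/usr/bin/python*" with arguments "/path/to/file.py" would then cover that one script without allowing every other program started by the interpreter.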

Ph0rk0z commented 4 years ago

Nope, no extra commands in the service. I've been adding -debug there when debugging. Will see what happens this week.

Another thing I just ran into:

Left the computer alone for couple hours, wake it up:

dhclient (yay! got the name) connecting to port 67... allow for session
dhclient connecting to port 67 ... allow for 30s?
dhclient connecting to port 67 .. deny for 30s?

and so on until the queue is exhausted... On a cursory glance I didn't find "netfilter full dropping packets" messages like before, so that's good.

filter by command line

That sounds good, should work for many things especially with wildcards.

I should check if using /proc works as well as audit now too, not a single unknown connection all day.

eta: Log file grew to 1mb over night. mostly from blocking chromium SSDP requests even when I disabled the media router/mdns

gustavo-iniguez-goya commented 4 years ago

If you think that this change is stable enough, I'll release a new version. Or at least, not worse than what we had before.

By the way, I've just realized that we're already able to filter by process command, so I just need to allow it from the GUI. Besides "From this process", there could be an option like "From this command", and if you expand the advanced view, or a combo box with options like "curl -L.", "curl -L.1.1.1.1" ... or a text box to insert the rule.

Right now, if a python app opens a connection, you can only filter by the python path.

Ph0rk0z commented 4 years ago

It seems stable to me. Have not had an unknown in a while. I figured out compiling never made a logrotate but the package does. I now set rotate 0 and limit of 1mb so should be ok log wise.

gustavo-iniguez-goya commented 4 years ago

Wish there was a way to remove/disable rules from the GUI.

I've got something already working. I guess that we should have a GUI to edit rules visually, but I'll start by allowing to remove rules from the GUI, disable/enable interception...

FmT0 commented 4 years ago

Well, I know this isn't really about "unknown" connections, but I wonder if we could have a silent mode: once we have made our rules, every connection that would otherwise pop up an "allow deny" window gets denied through a Default_silent_action instead. When we need it, we could switch to a choose mode that asks what to do for the connections we haven't made a rule for...

Ph0rk0z commented 4 years ago

After these latest patches, the amount of times I get popups is shrinking. I think if you turn off the UI and put the rules to deny in the daemon config it will just silently block connections.

gustavo-iniguez-goya commented 4 years ago

One thing of note is that I saw my opensnitchd log was 32MB after a few days of running.. and that was not in debug mode. Can that be reduced?

Regarding this problem, I think I can reduce the amount of logs we write. Denied connections are logged at Warning level, regardless of whether the rule is one you created on purpose or not. In the first case you may end up filling up the logs if it's frequent, and it doesn't add value at this level.

Warning level should be for unexpected behaviour only. For example, rules created and answered automatically (i.e.: without user interaction).

We could also group logs of same type and write them down only every n seconds.
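The grouping idea could work roughly like this sketch: identical messages are counted and written once per interval with a repeat count. `GroupedLogger` is a hypothetical class, the interval and clock are injectable only to keep the example testable, and a real version would also flush from a timer rather than only when a new message arrives.

```python
import time


class GroupedLogger:
    """Collapse identical messages and emit one line per message per interval."""

    def __init__(self, emit, interval=5.0, clock=time.monotonic):
        self.emit = emit          # callable that writes one log line
        self.interval = interval  # seconds between flushes
        self.clock = clock
        self.counts = {}
        self.last_flush = clock()

    def log(self, message):
        """Record a message; flush if the interval has elapsed."""
        self.counts[message] = self.counts.get(message, 0) + 1
        if self.clock() - self.last_flush >= self.interval:
            self.flush()

    def flush(self):
        """Write every pending message once, with its repeat count."""
        for message, count in self.counts.items():
            suffix = "" if count == 1 else " (repeated %d times)" % count
            self.emit(message + suffix)
        self.counts.clear()
        self.last_flush = self.clock()
```

For a rule that blocks frequent SSDP requests, this would turn a stream of identical deny lines into one line per interval.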

Ph0rk0z commented 4 years ago

Yes, or ignore repeating messages. I fixed chrome to stop connecting so much and I set rotate daily 0. I get an opensnitch.log and opensnitchd.log.1; not sure anymore if it's really cutting at my 2mb limit because I've drastically reduced the banned connections.

gustavo-iniguez-goya commented 4 years ago

I'll close this issue, but I'll come back to it from time to time because some good ideas have been discussed here.

gustavo-iniguez-goya commented 4 years ago

hey @Ph0rk0z , would you mind inserting this rule and checking if you get fewer pop-up dialogs of unknown connections?

iptables -t mangle -I OUTPUT -m conntrack --ctstate RELATED -j NFQUEUE --queue-num 0

gustavo-iniguez-goya / opensnitch

Experience with "unknown" connections. #10