wfg / docker-openvpn-client

OpenVPN client with killswitch and proxy servers; built on Alpine
MIT License

IP & DNS ptr leaked #76

the-hotmann opened this issue 2 years ago

the-hotmann commented 2 years ago

Just ran a bunch of tests with some friends of mine, and it turns out they were able to detect my DNS PTR and report it back to me on the first boot. On later restarts (docker restart [container]) this does not seem to happen anymore.

The PTR was correct and points directly to my server (of course it does, it's a reverse lookup of the IP). This also means they had my server's IP. I can't say whether this is true for IPv6 as well, since I have turned IPv6 off for now.
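
For illustration, this is roughly the kind of check they ran; a plain reverse lookup is enough once the real address is visible (the address and hostname below are only placeholders):

  dig -x 203.0.113.45 +short
  # prints the PTR record, e.g. myserver.example.com.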

That is not good. I also can't say whether it is a Docker core problem or a problem with this image. But what I can share is my censored docker-compose.yml:

version: "3"
services:
  vpnFRA:
    image: ghcr.io/wfg/openvpn-client:latest
    container_name: vpnFRA
    cap_add:
      - NET_ADMIN
    devices:
      - /dev/net/tun
    environment:
      - KILL_SWITCH=on
      - VPN_CONFIG_FILE=Frankfurt.ovpn
      - LANG=de_DE.UTF-8
    ports:
      - 127.0.0.1:5805:5805
    volumes:
      - "/data/vpn:/data/vpn"
    sysctls:
      - net.ipv6.conf.all.disable_ipv6=0
    restart: unless-stopped

  app1:
    image: [CENSORED]
    container_name: app1
    network_mode: "container:vpnFRA"
    environment:
      - PUID=10004
      - PGID=33
      - TZ=Europe/Berlin
      - LANG=de_DE.UTF-8
    volumes:
      - "/config:/config:rw"
      - "/storage:rw:/storage:rw"
    restart: unless-stopped
    depends_on:
      - vpnFRA

KILL_SWITCH=on should have prevented ALL IP leaks, but it did not. What I noticed on the first docker-compose up -d is this:

The container app1 starts first, then the VPN container, and then app1 again. Either way, app1 should never have had any internet access, since the container its network_mode points at (container:vpnFRA) was not fully up yet, right?

The DNS PTR is automatically detected (if detectable), and I saw this multiple times. I don't like that others can track my server behind a dockerized VPN container. I am pretty sure I did not see this with v2, but since some things changed in the meantime, I can't tell for sure where the culprit is. Something happens at first boot that exposes (or at least risks exposing) the server's IP!

Hope to hear back :)

wfg commented 2 years ago

Thanks for bringing this up @MartinHotmann. It is important to me that my image does what it advertises at the very least. :)

To try to narrow down possible causes, can you run a similar test where vpnFRA is already running and then you start app1?

the-hotmann commented 2 years ago

I will try that and ask them whether it still exposes the IP.

But since app1 is set with depends_on, it should only start after vpnFRA is running. Maybe that is a Docker issue? I will do some more tests and come back.

the-hotmann commented 2 years ago

can you run a similar test where vpnFRA is already running and then you start app1?

It was not reproducible when vpnFRA was already running, only when the containers are first created. So if you run docker stop vpnFRA && docker rm vpnFRA first, it becomes reproducible again.
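
For reference, the sequence that reproduces it for me, plus a quick egress check from app1 (assuming the app1 image ships wget; curl works the same way):

  docker stop vpnFRA && docker rm vpnFRA
  docker-compose up -d
  # during the first seconds, before the tunnel is up, this may already return
  # the server's real public IP instead of the VPN exit IP:
  docker exec app1 wget -qO- https://ifconfig.me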

twiebe commented 2 years ago

I also encountered potential leaking during testing and traced it back to the killswitch setup procedure, which resolves the remotes one by one before dropping traffic through the insecure eth0 interface.

In my case the list of remotes was quite long and DNS resolution took a few seconds per remote (for unrelated reasons), so the window during which other containers were communicating insecurely was as long as a few minutes.

I made a fix in a fork here which I'm currently testing. The fix drops all traffic that is not explicitly allowed immediately, before performing the DNS lookups. To keep the lookups working with any container DNS configuration (i.e. external DNS resolvers), iptables allow rules are added temporarily until the lookups are done.
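
Roughly, the setup now looks like this (a simplified sketch of the idea, not the actual patch; the eth0 match and the conntrack rule are just how I happen to express it here):

  # 1. default-deny right away, before any DNS lookup happens
  iptables -P INPUT DROP
  iptables -P OUTPUT DROP
  iptables -P FORWARD DROP
  iptables -A INPUT -i lo -j ACCEPT
  iptables -A OUTPUT -o lo -j ACCEPT
  # let replies to our own lookups come back in
  iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

  # 2. temporarily allow DNS so the remotes from the .ovpn can be resolved
  iptables -A OUTPUT -o eth0 -p udp --dport 53 -j ACCEPT
  iptables -A OUTPUT -o eth0 -p tcp --dport 53 -j ACCEPT

  # ... resolve each remote and add an ACCEPT rule for its IP and port ...

  # 3. drop the temporary DNS rules again once the lookups are done
  iptables -D OUTPUT -o eth0 -p udp --dport 53 -j ACCEPT
  iptables -D OUTPUT -o eth0 -p tcp --dport 53 -j ACCEPT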

Looks promising so far for what I encountered. Not sure if it addresses your problem @MartinHotmann, but if you want to check it out, you can either build your own image from that code or use the build here.

The fix is only implemented for iptables so far. If it proves to solve the issue, I'll open a PR and perhaps look into how to solve it for nftables, too.

wfg commented 2 years ago

...traced it back to the killswitch setup procedure, which resolves the remotes one by one before dropping traffic through the insecure eth0 interface.

In my case the list of remotes was quite long and DNS resolution took a few seconds per remote (for unrelated reasons), so the window during which other containers were communicating insecurely was as long as a few minutes. ...

When I read what @MartinHotmann wrote, I had a feeling this was what was going on here. Have you been using it with this change over the past few days? Does it seem to be working as expected?

the-hotmann commented 2 years ago

@twiebe hi Thomas, and thanks for the quick fix!

I will perform some tests over the next few days and report back whether it entirely fixes the problem for me or not.

Thanks in advance!

twiebe commented 2 years ago

When I read what @MartinHotmann wrote, I had a feeling this was what was going on here. Have you been using it with this change over the past few days? Does it seem to be working as expected?

Yep, it's been running like that for a few days and all manual checks look good so far. However, since I'm traveling at the moment, I have only found a little time for testing. I will do some more on the weekend once I'm back and then open the PR, even if only for iptables for now.

BTW: This morning I made a small adjustment to not bind the temporary resolver rules to eth0, but to all interfaces instead. Custom bridge Docker networks can have 127.0.0.0/8 addresses on the lo interface as resolvers. (Though technically it already worked before, since the lo interface has ACCEPT rules in both directions.)
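
In terms of the sketch above, that just means the temporary resolver rules lose the -o eth0 match so they apply on every interface:

  # allow DNS lookups on any interface while the remotes are being resolved
  iptables -A OUTPUT -p udp --dport 53 -j ACCEPT
  iptables -A OUTPUT -p tcp --dport 53 -j ACCEPT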

twiebe commented 2 years ago

Did some further testing and everything worked as expected. Here's the PR for the iptables fix: https://github.com/wfg/docker-openvpn-client/pull/80

Hope that it fixes your issues, too, @MartinHotmann. ✌️

the-hotmann commented 1 year ago

Seems like it is fixed. At least I was not able to reproduce it anymore.

twiebe commented 1 year ago

Excellent. Glad it helped!

the-hotmann commented 1 year ago

Not sure if it's related to this issue, but at the moment the container does not start and leaks absolutely everything. Running the latest version.

It literally gets stuck here:

2022-09-24 16:53:14 UDP link remote: [AF_INET][######]:1148
2022-09-24 16:53:14 TLS: Initial packet from [AF_INET][######]:1148, sid=93dd4a3a 63503bd0
2022-09-24 16:53:14 VERIFY OK: depth=1, [######]
2022-09-24 16:53:14 VERIFY OK: nsCertType=SERVER
2022-09-24 16:53:14 VERIFY OK: depth=0, [######]

Btw, I also found this: 2022-09-24 16:40:08 library versions: OpenSSL 1.1.1o 3 May 2022, LZO 2.10. Maybe this should be a separate issue, but updating to a newer version (openssl-1.1.1q or openssl-3.0.5) would also be good.

Here is my Compose file:

  vpn:
    image: ghcr.io/wfg/openvpn-client:latest
    container_name: vpn
    cap_add:
      - NET_ADMIN
    devices:
      - /dev/net/tun
    environment:
      - KILL_SWITCH=on
      - VPN_CONFIG_FILE=Frankfurt.ovpn
      - LANG=de_DE.UTF-8
    volumes:
      - "/var/docker/config/vpn/:/data/vpn"
    sysctls:
      - net.ipv6.conf.all.disable_ipv6=0
    restart: unless-stopped

It does not start properly and also does not throw any error in the docker logs. It allows every connection to be reached by everyone and exposes absolutely everything. Something must be wrong.

@wfg

wfg commented 1 year ago

I just tested latest with this Compose file and everything works as expected:

services:
  vpn:
    image: ghcr.io/wfg/openvpn-client:latest
    container_name: ovpn-test
    cap_add:
    - NET_ADMIN
    devices:
    - /dev/net/tun:/dev/net/tun
    environment:
    - SUBNETS=192.168.10.0/24
    volumes:
    - ./.local/vpn:/data/vpn

the-hotmann commented 1 year ago

Thanks for the reply! I will dig into it again and see what I can do.

wfg commented 1 year ago

Unrelated, but given that sysctl, you're using IPv6. Does this image work well with that?

the-hotmann commented 1 year ago

I just found out what it was.

The OVPN files from the provider had been updated and the old ones were no longer valid. At the next restart of the container, no connection was established.

For now the problem is fixed by updating the files and then restarting the container.

But the container really should not allow any connections without an established VPN connection. I guess it fails silently somewhere and allows the connected containers to be reached. I don't know whether @twiebe's PR https://github.com/wfg/docker-openvpn-client/pull/80 could have fixed it or not, as I am on the official version again.

given that sysctl, you're using IPv6. Does this image work well with that?

No, my server does have an IPv6 address, but the sysctl actually disables it at the Docker level, so the container will not see any IPv6. I have not tested the image with IPv6.

wfg commented 1 year ago

@MartinHotmann I have something for you to test if you're willing.

  1. Build the rewrite branch using the build directory as the context.
    docker build -t ovpn-test https://github.com/wfg/docker-openvpn-client.git#rewrite:build
  2. Test with a Compose file similar to the following:
    services:
      ovpn-test:
        image: ovpn-test
        container_name: ovpn-test
        cap_add:
        - NET_ADMIN
        devices:
        - /dev/net/tun:/dev/net/tun
        environment:
        - ALLOWED_SUBNETS=192.168.10.0/24
        volumes:
        - ./local/vpn:/data/vpn

I'd like to see how it behaves with the old config and with the new config to see if either leaks.

the-hotmann commented 1 year ago

Thanks, I will test in the next few days!

wfg commented 1 year ago

@MartinHotmann were you able to test?

the-hotmann commented 1 year ago

Sadly not yet, I am super busy at the moment. I will report back as soon as I have done it.

JenswBE commented 1 year ago

@wfg I just did some tests to validate your fix for https://github.com/wfg/docker-openvpn-client/issues/84 and hit the following issues:

  1. Seems the default config path changed from /data/vpn to /config, but this isn't mentioned in the README.
  2. Issues which seem Mullvad specific (so mainly FYI and just for reference):
    1. Seems Mullvad pushes IPv6 over the VPN, but IPv6 isn't enabled in my Docker setup. Related logs:
      2022-10-18 18:31:21 PUSH: Received control message: 'PUSH_REPLY,dhcp-option DNS 10.9.0.1,redirect-gateway def1 bypass-dhcp,route-ipv6 0000::/2,route-ipv6 4000::/2,route-ipv6 8000::/2,route-ipv6 C000::/2,comp-lzo no,route-gateway 10.9.0.1,topology subnet,socket-flags TCP_NODELAY,ifconfig-ipv6 fdda:d0d0:cafe:1195::1004/64 fdda:d0d0:cafe:1195::,ifconfig 10.9.0.6 255.255.0.0,peer-id 4,cipher AES-256-GCM'
      ...
      2022-10-18 18:31:21 GDG6: remote_host_ipv6=n/a
      2022-10-18 18:31:21 net_route_v6_best_gw query: dst ::
      2022-10-18 18:31:21 sitnl_send: rtnl: generic error (-101): Network unreachable
      2022-10-18 18:31:21 ROUTE6: default_gateway=UNDEF
      2022-10-18 18:31:21 TUN/TAP device tun0 opened
      2022-10-18 18:31:21 /sbin/ip link set dev tun0 up mtu 1500
      2022-10-18 18:31:21 /sbin/ip link set dev tun0 up
      2022-10-18 18:31:21 /sbin/ip addr add dev tun0 10.9.0.6/16
      2022-10-18 18:31:21 /sbin/ip link set dev tun0 up mtu 1500
      2022-10-18 18:31:21 /sbin/ip link set dev tun0 up
      2022-10-18 18:31:21 /sbin/ip -6 addr add fdda:d0d0:cafe:1195::1004/64 dev tun0
      RTNETLINK answers: Permission denied
      2022-10-18 18:31:21 Linux ip -6 addr add failed: external program exited with error status: 2
      2022-10-18 18:31:21 Exiting due to fatal error

      Resolved by reading the Troubleshooting section at https://mullvad.net/en/help/linux-openvpn-installation/

    2. Mullvad has the following up and down options in the generated config:
      up /etc/openvpn/update-resolv-conf
      down /etc/openvpn/update-resolv-conf

      This triggers an error that the script is not found, since it's in /config and not in /etc/openvpn/. This still worked with the stable branch, but I seem to recall seeing that you dropped the up and down options in the stable branch.

Except for the above (mainly Mullvad-specific) errors, it seems to work fine for me. Just be aware that I only validated that the VPN starts and traffic is pushed through it. I didn't validate the fix with regard to the IP and DNS PTR leaking, due to a lack of knowledge on the exact issue.

wfg commented 1 year ago

@JenswBE thanks for the help!

  1. Yes, you're right. I'll update the README.md before merging to main.
  2. A lot of this stuff was handled in the entrypoint script (example 1, 2). I wanted to rewrite it to be less "opinionated" and let users add this to their configuration files only if they wanted/needed.

the-hotmann commented 1 year ago

I seem to have found the problem after a long time. The environment variables were changed, but I did not update them.

I used KILL_SWITCH=on, which was not a valid option at that time. According to the documentation, any wrong value will disable the killswitch, which I think is highly dangerous.

A VPN without a killswitch doesn't make much sense, since a VPN is about security; most people would agree it is less secure without one. Please implement the killswitch in the following way:

ENV:

KILL_SWITCH

VALUES:

truthy (automatically detects the program to use; this should be the default! Which program it uses by default is up to you)
falsy (disables the killswitch)
nftables (a preference; iptables will be used as fallback. Basically the same as truthy, just with a preference)
iptables (a preference; nftables will be used as fallback. Basically the same as truthy, just with a preference)

All values should be checked case-insensitively in case of typos. Any invalid value should fall back to the default (truthy).

This way, all existing containers will behave as expected. This method also ensures that no "wrong" value can be provided, and a killswitch is always enabled until explicitly disabled.

A log entry that clearly states whether the killswitch is enabled would be really appreciated, as for now I can't really test it myself. The log could look something like this:

In case of truthy:

Killswitch = On (auto - nftables)
Killswitch = On (auto - iptables)

In case of falsy:

Killswitch = Off

In case of nftables:

Killswitch = On (nftables)

In case of iptables:

Killswitch = On (iptables)

In case of any invalid value:

Killswitch = On (auto - nftables | invalid_value! )
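
To make the matching and logging concrete, here is a rough entrypoint sketch of what I mean (plain sh; the variable names and the nftables-first default are only my suggestion, not the image's actual code):

  # normalize the value so the check is case-insensitive
  kill_switch=$(printf '%s' "${KILL_SWITCH:-on}" | tr '[:upper:]' '[:lower:]')

  have_nft() { command -v nft >/dev/null 2>&1; }

  case "$kill_switch" in
    off|false|no|0)
      echo "Killswitch = Off"
      ;;
    nftables)
      # preference: nftables, fall back to iptables if nft is missing
      if have_nft; then fw=nftables; else fw=iptables; fi
      echo "Killswitch = On ($fw)"
      ;;
    iptables)
      # preference: iptables, fall back to nftables if iptables is missing
      if command -v iptables >/dev/null 2>&1; then fw=iptables; else fw=nftables; fi
      echo "Killswitch = On ($fw)"
      ;;
    on|true|yes|1)
      if have_nft; then fw=nftables; else fw=iptables; fi
      echo "Killswitch = On (auto - $fw)"
      ;;
    *)
      # never disable the killswitch on an unknown value; fall back to auto
      if have_nft; then fw=nftables; else fw=iptables; fi
      echo "Killswitch = On (auto - $fw | invalid value: $KILL_SWITCH)"
      ;;
  esac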

Thanks for your work. It would be nice to get some feedback on this suggestion.