wfg / docker-openvpn-client

OpenVPN client with killswitch and proxy servers; built on Alpine
MIT License

Since v3 not working #66

Closed the-hotmann closed 2 years ago

the-hotmann commented 2 years ago

Today I updated my VPN image to v3.0.0 (just released), and my previous setup, which worked perfectly fine, is no longer working.

There is nothing I changed. After reverting back to 2.1.1 everything works again.

Is there something I need to do when upgrading to v3.0.0?

The error log just shows this, repeatedly: config/nftables.conf:31:1-93: Error: Could not process rule: Invalid argument

wfg commented 2 years ago

There shouldn't be anything needed. I probably just missed something when converting to nftables.

Can you run this and upload the file here?

docker cp openvpn-client:/data/config/nftables.conf .

the-hotmann commented 2 years ago

Thanks, that is probably the reason why the command nft -f config/nftables.conf utilizes 100% of my CPU and made the NAS unusable :P

8961 root 20 0 5.8m 3.1m 100.0 0.010 0:20.97 R nft -f config/nftables.conf (uses 100% of CPU)

Does this command need to be executed before or after the upgrade?

the-hotmann commented 2 years ago

Here is the output (when executed on v3):

nftables.conf.txt

Had to rename to .txt as .conf was not allowed

wfg commented 2 years ago

Thanks, so I just applied this exact config in my container and it works just fine.

Can you share your docker run command as well?

the-hotmann commented 2 years ago

I start the container with docker compose.

The compose file is this:

version: "3.8"
services:
  vpn:
    image: ghcr.io/wfg/openvpn-client:latest
    container_name: vpn
    cap_add:
      - NET_ADMIN
    devices:
      - /dev/net/tun
    environment:
      - KILL_SWITCH=on
      - VPN_CONFIG_FILE=PerfectPrivacy/Frankfurt.ovpn
    volumes:
      - "/volume1/docker/config/vpn/:/data/vpn"
    restart: unless-stopped

v3 does not work on any of my Synology NASes or servers; it throws the error mentioned above. On one NAS (also a Synology) it even makes the machine unusable.

wfg commented 2 years ago

Strange. That's basically exactly how I run it as well without issue.

Can you run this inside the container AND on the host itself and share the outputs?

lsmod | grep nf_tables

the-hotmann commented 2 years ago

Syno1

Host: -- empty --

Container v2.1.1: -- empty --

Container v3.0.0: (not executed, as otherwise NAS again unusable)

Syno2

Host: -- empty --

Container v2.1.1: -- empty --

Container v3.0.0: (not executed, as otherwise NAS again unusable)

Is it the wrong command, or is empty output to be expected? The command I use for the container is:

docker exec -it vpn lsmod | grep nf_tables

mindset-tk commented 2 years ago

I am getting a similar error:

config/nftables.conf:25:83-87: Error: syntax error, unexpected dport
add rule inet killswitch outgoing oifname eth0 ip daddr [redacted] tcp-client dport [redacted] accept

The container restarts in an endless loop, so I don't seem to be able to run the nf_tables check inside it.
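
The error above suggests that the OpenVPN proto value (tcp-client) was copied straight into the generated rule, where nftables expects a plain protocol keyword such as tcp or udp before dport. A minimal sketch of the kind of normalization that would avoid this, assuming a hypothetical helper named normalize_proto (it is not taken from this project's scripts):

```sh
#!/bin/sh
# Hypothetical helper: map an OpenVPN "proto" value to the keyword nftables expects.
normalize_proto() {
    case "$1" in
        tcp*) echo tcp ;;   # tcp, tcp-client, tcp4, tcp6-client, ...
        udp*) echo udp ;;   # udp, udp4, udp6
        *) echo "unsupported proto: $1" >&2; return 1 ;;
    esac
}

normalize_proto tcp-client   # prints: tcp
normalize_proto udp4         # prints: udp
```

The result could then be substituted into the rule in place of the raw proto value.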

In case it matters, my vpn.conf file looks like this. As far as I understand, this is a standard unified format, but please let me know if I futzed something.


proto tcp-client
remote [ip.v4.add.ress] [port]
dev tun
resolv-retry infinite
nobind
persist-key
persist-tun
remote-cert-tls server
verify-x509-name server_4FaNHU1cFptzTGUl name
auth SHA256
auth-nocache
cipher AES-128-GCM
tls-client
tls-version-min 1.2
tls-cipher TLS-ECDHE-ECDSA-WITH-AES-128-GCM-SHA256
ignore-unknown-option block-outside-dns
setenv opt block-outside-dns # Prevent Windows 10 DNS leak
verb 3
<ca>
-----BEGIN CERTIFICATE-----

the-hotmann commented 2 years ago

After going back to image: ghcr.io/wfg/openvpn-client:2.1.1, nothing works anymore, not even the old v2.1.1.

It now always says wget: bad address or wget: error getting response: Connection reset by peer when I run: docker exec -it vpn wget -qO- URL_HERE

All this while the logs show no errors.

wfg commented 2 years ago

It’s the nftables change. It looks like the underlying hosts also need to be using nftables for the container to use it.

I’ll dig more into it soon. Until then, try a 2.x version.

the-hotmann commented 2 years ago

> I’ll dig more into it soon. Until then, try a 2.x version.

Thanks.

Could we get a "2" branch, so we can use image: ghcr.io/wfg/openvpn-client:2 and always get the latest v2 if there are any updates? In general, setting up the major versions as branches would be a good idea, so that when v4 comes out, the folks who chose "3" will not automatically be updated to v4 and possibly get their setup broken.

BTW, it works again after running this command and then launching the container again:

docker stop vpn && docker rm vpn

Just in case someone runs into the same issue.

wfg commented 2 years ago

> Could we get a "2" branch, so we can use image: ghcr.io/wfg/openvpn-client:2 and always get the latest v2 if there are any updates? In general, setting up the major versions as branches would be a good idea, so that when v4 comes out, the folks who chose "3" will not automatically be updated to v4 and possibly get their setup broken.

Well, I did bump the major version here (2 -> 3), which indicates there are breaking changes. I can't help it if people use the latest tag ;)

You can see the existing tags here: https://github.com/wfg/docker-openvpn-client/pkgs/container/openvpn-client
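
To make the tag advice concrete, here is a minimal sketch of a check that flags image references that would float across major versions. The helper name is_pinned is hypothetical and not part of this project; it only inspects the reference string and does not contact the registry.

```sh
#!/bin/sh
# Hypothetical check: succeed only if an image reference carries an explicit, non-"latest" tag.
is_pinned() {
    name=${1##*/}               # drop registry and path, keeping "image[:tag]"
    case "$name" in
        *:latest) return 1 ;;   # explicit "latest" follows every new release
        *:*)      return 0 ;;   # any other explicit tag (or digest) counts as pinned
        *)        return 1 ;;   # no tag at all means Docker defaults to "latest"
    esac
}

for ref in ghcr.io/wfg/openvpn-client:latest \
           ghcr.io/wfg/openvpn-client \
           ghcr.io/wfg/openvpn-client:2.1.1; do
    if is_pinned "$ref"; then
        echo "pinned:   $ref"
    else
        echo "floating: $ref"
    fi
done
```

Only the last reference in the loop is reported as pinned; the untagged form behaves like latest.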

the-hotmann commented 2 years ago

Aaaah and I always looked at the wrong place :P

mindset-tk commented 2 years ago

> It’s the nftables change. It looks like the underlying hosts also need to be using nftables for the container to use it.
>
> I’ll dig more into it soon. Until then, try a 2.x version.

Understandable, 2.1.1 is working great here. I'll just update my compose file to that effect for now.

wfg commented 2 years ago

@MartinHotmann @mindset-tk if either one of you could try 3.0.0 once more and send me the output of lsmod from inside the container, that would be helpful. And tell me what host OS and version you're running.

dngray commented 2 years ago

So I seem to be having some trouble with SUBNETS with the new 3.0.0 builds. My host is also Alpine Linux.

The container logs reveal this error:

--- Running with the following variables ---
VPN configuration file: vpn_config.ovpn
Use default resolv.conf: on
Allowing subnets: 172.18.0.0/16
Kill switch: on
Using OpenVPN log level: 3
---

info: original configuration file: vpn/vpn_config.ovpn
info: modified configuration file: vpn/openvpn.NHJQuqYR.conf
info: kill switch is on
RTNETLINK answers: File exists

Finally it ends with:

Creating vpn_bittorrent ... done
Creating qbittorrent    ... error

ERROR: for qbittorrent  Cannot start service qbittorrent: Container c86109750892e8ba892a179ede48666d6fb36d79dff6589a0045cc6ce7132644 is restarting, wait until the container is running

ERROR: for qbittorrent  Cannot start service qbittorrent: Container c86109750892e8ba892a179ede48666d6fb36d79dff6589a0045cc6ce7132644 is restarting, wait until the container is running
ERROR: Encountered errors while bringing up the project.

My docker-compose.yml looks like this:

services:
  vpn_bittorrent:
    extends:
      file: ../vpn/container-compose.yml
      service: openvpn-client
    container_name: vpn_bittorrent
    volumes:
      - /mnt/data/container_data/vpn:/data/vpn
    ports:
      - xxxxxx:xxxxxx
      - yyyyyy:yyyyyy/udp
      - 8081:8081/tcp
    environment:
      - SUBNETS=172.18.0.0/16
      - KILL_SWITCH=on
      - VPN_CONFIG_FILE=vpn_config
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.qbittorrent.tls=true"
      - "traefik.http.routers.qbittorrent.entrypoints=websecure"
      - "traefik.http.routers.qbittorrent.rule=Host(`qbittorrent.$MY_DOMAIN`)"

  qbittorrent:
    image: ghcr.io/linuxserver/qbittorrent
    container_name: qbittorrent
    network_mode: service:vpn_bittorrent
    environment:
      - PUID=1003
      - PGID=1004
      - WEBUI_PORT=8081
    volumes:
      - /mnt/data/container_data/qbittorrent:/config
      - /mnt/data/shared/incoming:/mnt/shared/incoming
    restart: unless-stopped

networks:
  default:
    external:
      name: $DEFAULT_NETWORK

The common part of the container is just:

services:
  openvpn-client:
    image: ghcr.io/wfg/openvpn-client
    #build:
    #  context: ./docker-openvpn-client
    #  dockerfile: Dockerfile
    cap_add:
      - NET_ADMIN
    devices:
      - /dev/net/tun
    restart: unless-stopped

ghcr.io/wfg/openvpn-client:2.1.1 works for me.

adr3nal1n commented 2 years ago

I am using docker version 20.10.17 and the host OS is Debian v11 Bullseye.

I had the same issue with v3: the container just kept restarting, and I found lots of copies of .conf files (owned by root) in the config folder on my host, timestamped from the various attempts. Normally, v2.1.1 would copy my openvpn.conf file and create a file called openvpn.conf.modified.

I then did a docker stop openvpn-client, followed by a docker rm openvpn-client, and finally I deleted the image using docker rmi ghcr.io/wfg/openvpn-client:latest.

I then modified my compose file to point to v2.1.1 and rebuilt the container, and v2.1.1 worked as before.

I then stopped the v2.1.1 container, deleted it and the image, modified the compose file to point to v3 and then rebuilt the container again. This time the v3 container came up first time and it copied my openvpn.conf file to a file called openvpn.3uqHskTH.conf (owned by root)

Output from lsmod inside the v3 container is: lsmod_output_from_v3_container.txt

Logfile for the v3_container is: v3_container_log.txt

Hope the above information helps in some way. I'll keep running the v3 container for now and let you know if I notice anything odd.

lkracon commented 2 years ago

Similar problem here. After docker start I'm getting the following errors:

proxy_3_1  | config/nftables.conf:21:57-69: Error: Hostname resolves to multiple addresses
proxy_3_1  | add rule inet killswitch outgoing oifname eth0 ip daddr nl.prcdn.net. udp dport 553 accept
proxy_3_1  |                                                         ^^^^^^^^^^^^^
proxy_3_1  | config/nftables.conf:23:57-73: Error: Hostname resolves to multiple addresses
proxy_3_1  | add rule inet killswitch outgoing oifname eth0 ip daddr ams-nl.prcdn.net. udp dport 553 accept
proxy_3_1  |                                                         ^^^^^^^^^^^^^^^^^

Going back to v2.1.1 solves the problem.
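
One way to sidestep this class of error is to hand nftables literal addresses rather than hostnames. The sketch below assumes the addresses have already been resolved elsewhere; the helper name emit_remote_rules and the two addresses are illustrative only, and this is not necessarily how the project itself handles it. The rule text mirrors the rules shown in the log above.

```sh
#!/bin/sh
# Hypothetical helper: print one kill-switch rule per address instead of one rule per hostname.
# Usage: emit_remote_rules <iface> <proto> <port> <addr>...
emit_remote_rules() {
    iface=$1 proto=$2 port=$3
    shift 3
    for addr in "$@"; do
        echo "add rule inet killswitch outgoing oifname $iface ip daddr $addr $proto dport $port accept"
    done
}

# 192.0.2.10 and 192.0.2.11 are documentation placeholders, not real VPN endpoints.
emit_remote_rules eth0 udp 553 192.0.2.10 192.0.2.11
```

Each resolved address gets its own rule, so a hostname with several A records no longer produces an ambiguous daddr.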

adr3nal1n commented 2 years ago

V3 is still working and has been running healthy for the last 5 days.

If you are having issues, maybe try the following as it worked for me:

  1. docker stop openvpn-client
  2. docker rm openvpn-client
  3. docker rmi ghcr.io/wfg/openvpn-client:latest
  4. Recreate the openvpn-client container

mindset-tk commented 2 years ago

> @MartinHotmann @mindset-tk if either one of you could try 3.0.0 once more and send me the output of lsmod from inside the container, that would be helpful. And tell me what host OS and version you're running.

Sorry for the delay. I had to do something a little hacky to make this work, since the container wanted to reboot infinitely and prevented me from running lsmod.

I set it to run the following script at startup instead of the standard entrypoint. All it does is force the container to stay running after the standard entry.sh that it normally calls has completed.

#!/bin/sh
scripts/entry.sh
while true; do sleep 1; done

Then I was able to get in and run lsmod: lsmod.txt

mindset-tk commented 2 years ago

> V3 is still working and has been running healthy for the last 5 days.
>
> If you are having issues, maybe try the following as it worked for me:
>
>   1. docker stop openvpn-client
>   2. docker rm openvpn-client
>   3. docker rmi ghcr.io/wfg/openvpn-client:latest
>   4. Recreate the openvpn-client container

I tried this, no success for me.

dngray commented 2 years ago

There definitely seems to be an issue.

In this case I did a docker system prune -a and pruned back to nothing.

I deleted all the VPN configs, and tried again:

doas docker-compose up

...

Status: Downloaded newer image for ghcr.io/linuxserver/qbittorrent:latest
Creating vpn_bittorrent ... done
Creating qbittorrent    ... error

ERROR: for qbittorrent  Cannot start service qbittorrent: Container 7dd9ebe1e952dfe5bb46c9907fd29b496985c80b02b1534b76540dc2283b27f8 is restarting, wait until the container is running

ERROR: for qbittorrent  Cannot start service qbittorrent: Container 7dd9ebe1e952dfe5bb46c9907fd29b496985c80b02b1534b76540dc2283b27f8 is restarting, wait until the container is running
ERROR: Encountered errors while bringing up the project.

I tried this with a simpler setup:

services:
  vpn_test:
    extends:
      file: ../vpn/container-compose.yml
      service: openvpn-client
    container_name: vpn_test
    volumes:
      - /mnt/data/container_data/vpn:/data/vpn
    ports:
      - 8123:8123/tcp
    environment:
      - KILL_SWITCH=on
      - SUBNETS=172.18.0.0/16
      - VPN_CONFIG_FILE=vpn_config_name
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.test.tls=true"
      - "traefik.http.routers.test.entrypoints=websecure"
      - "traefik.http.routers.test.rule=Host(`test.$MY_DOMAIN`)"

  test:
    image: busybox
    container_name: test
    network_mode: service:vpn_test
    command: httpd -f -p 8123 -h /etc/

networks:
  default:
    external:
      name: $DEFAULT_NETWORK

Things seemed to work with that setup.

vpn_test    | --- Running with the following variables ---
vpn_test    | VPN configuration file: vpn_config_name
vpn_test    | Use default resolv.conf: on
vpn_test    | Allowing subnets: 172.18.0.0/16
vpn_test    | Kill switch: on
vpn_test    | Using OpenVPN log level: 3
vpn_test    | ---
vpn_test    | 
vpn_test    | info: original configuration file: vpn/vpn_config_name
vpn_test    | info: modified configuration file: vpn/openvpn.JkFtDPdc.conf
vpn_test    | info: kill switch is on
...

dngray commented 2 years ago

So I removed the qbittorrent section, did a system prune and just had:

services:
  vpn_bittorrent:
    extends:
      file: ../vpn/container-compose.yml
      service: openvpn-client
    container_name: vpn_bittorrent
    volumes:
      - /mnt/data/container_data/vpn:/data/vpn
    ports:
      - 26129:26129
      - 26129:26129/udp
      - 8081:8081/tcp
    environment:
      - KILL_SWITCH=on
      - SUBNETS=172.18.0.0/16
      - VPN_CONFIG_FILE=vpn_config_name
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.qbittorrent.tls=true"
      - "traefik.http.routers.qbittorrent.entrypoints=websecure"
      - "traefik.http.routers.qbittorrent.rule=Host(`qbittorrent.$MY_DOMAIN`)"

When starting I get:

Creating vpn_bittorrent ... done
Attaching to vpn_bittorrent
vpn_bittorrent    | 
vpn_bittorrent    | --- Running with the following variables ---
vpn_bittorrent    | VPN configuration file: vpn_config_name
vpn_bittorrent    | Use default resolv.conf: on
vpn_bittorrent    | Allowing subnets: 172.18.0.0/16
vpn_bittorrent    | Kill switch: on
vpn_bittorrent    | Using OpenVPN log level: 3
vpn_bittorrent    | ---
vpn_bittorrent    | 
vpn_bittorrent    | info: original configuration file: vpn/vpn_config_name
vpn_bittorrent    | info: modified configuration file: vpn/openvpn.a7i9rIWP.conf
vpn_bittorrent    | info: kill switch is on
vpn_bittorrent    | RTNETLINK answers: File exists

In between these tests I did a system prune -a. Using the same config, I made sure to clear out the openvpn.*.conf files.

Obviously, after wiping everything and going back to 2.1.1, everything works as it did.

wfg commented 2 years ago

Hi everyone, v3 seems to only work if your underlying host also uses nftables instead of iptables (but somehow iptables works even if your underlying host is using nftables?). I've been busy, so I haven't been able to work on a fix.

I will try to spend time on it at the end of this week. In the meantime, stick with v2. And stop using latest ;)

wfg commented 2 years ago

Can someone that has issues with v3.0.0 please build the latest commit and test?

The build script can be used like this:

./build.py 3.1.0-test

which will build the image ghcr.io/wfg/openvpn-client:3.1.0-test.

dngray commented 2 years ago

> Hi everyone, v3 seems to only work if your underlying host also uses nftables instead of iptables (but somehow iptables works even if your underlying host is using nftables?). I've been busy, so I haven't been able to work on a fix.

Yes, my underlying system does use iptables. The reason is https://github.com/moby/moby/issues/26824: Docker still doesn't have native nftables support.

I also haven't figured out a way to make Alpine Linux use iptables-nft by default. Nobody seemed to know the answer.

Maybe now would be a good time to switch to podman as that has it.

> Can someone that has issues with v3.0.0 please build the latest commit and test?

Can do.

wfg commented 2 years ago

> > Hi everyone, v3 seems to only work if your underlying host also uses nftables instead of iptables (but somehow iptables works even if your underlying host is using nftables?). I've been busy, so I haven't been able to work on a fix.
>
> Yes, my underlying system does use iptables. The reason is moby/moby#26824: Docker still doesn't have native nftables support.
>
> I also haven't figured out a way to make Alpine Linux use iptables-nft by default. Nobody seemed to know the answer.
>
> Maybe now would be a good time to switch to podman as that has it.

My Fedora server uses iptables-nft, which provides the iptables command on my host as a wrapper around nftables. My server doesn't actually use iptables. See my output below:

$ iptables -V
iptables v1.8.7 (nf_tables)

(iptables is using the nf_tables module)

This also explains how iptables works for me even though my underlying host is running nftables (which I called out in an above comment).

You can read more about iptables-nft here: https://developers.redhat.com/blog/2020/08/18/iptables-the-two-variants-and-their-relationship-with-nftables
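
For anyone wanting to check which variant their own host uses, here is a minimal sketch that classifies the version string. The helper name iptables_backend is hypothetical; it relies on iptables 1.8 and later printing either (nf_tables) or (legacy) after the version number, and reports anything else as unknown.

```sh
#!/bin/sh
# Hypothetical helper: classify the backend from an "iptables -V" version string.
iptables_backend() {
    case "$1" in
        *"(nf_tables)"*) echo nf_tables ;;
        *"(legacy)"*)    echo legacy ;;
        *)               echo unknown ;;   # e.g. older releases that print no variant
    esac
}

iptables_backend "iptables v1.8.7 (nf_tables)"   # prints: nf_tables
# On a real host: iptables_backend "$(iptables -V)"
```

A result of legacy on the host would match the situation reported in this thread, where the v3.0.0 container failed to load its nftables rules.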

> > Can someone that has issues with v3.0.0 please build the latest commit and test?
>
> Can do.

v3.1.0 has been pushed, so you can just pull it. You don't have to build now.

afladmark commented 2 years ago

I can confirm 3.1.0 is now working for me on Synology.

dngray commented 2 years ago

> My Fedora server uses iptables-nft, which provides the iptables command on my host as a wrapper around nftables. My server doesn't actually use iptables. See my output below:
>
> $ iptables -V
> iptables v1.8.7 (nf_tables)
>
> (iptables is using the nf_tables module)

Yeah this is what Debian does, and you can change it with the update-alternatives command. Alpine Linux doesn't seem to have a way to update the wrapper. I got no replies on the mailing list and I've asked in the IRC channel a few times.

I'm actually thinking of retiring Alpine Linux and using Proxmox instead as my host OS. Can then run a VM with any OS I like.

I would have chosen XCP-ng, but they don't yet support encrypted ZFS, nor does it natively support the clone, destroy, snapshot and replicate features of ZFS.

wfg commented 2 years ago

I'm confident this has been resolved in v3.1.0, so I'm going to close.