qdm12 / gluetun

VPN client in a thin Docker container for multiple VPN providers, written in Go, and using OpenVPN or Wireguard, DNS over TLS, with a few proxy servers built-in.
https://hub.docker.com/r/qmcgaw/gluetun
MIT License
8.07k stars, 373 forks

Bug: Wireguard userspace high CPU usage #1795

Open OrpheeGT opened 1 year ago

OrpheeGT commented 1 year ago

Is this urgent?

No

Host OS

Synology docker

CPU arch

x86_64

VPN service provider

ProtonVPN

What are you using to run the container

docker-compose

What is the version of Gluetun

ghcr.io/qdm12/gluetun:latest

What's the problem 🤔

Hello, the Gluetun container's CPU usage rises when qBittorrent is seeding (500 torrents seeding, about 20 actually active, 10 MB/s upload).

I'm currently using this Docker service configuration: https://github.com/soxfor/qbittorrent-natmap

[3 images]

Share your logs

========================================
========================================
=============== gluetun ================
========================================
=========== Made with ❤️ by ============
======= https://github.com/qdm12 =======
========================================
========================================

Running version latest built on 2023-08-04T11:14:39.159Z (commit 082a38b)

🔧 Need help? https://github.com/qdm12/gluetun/discussions/new
🐛 Bug? https://github.com/qdm12/gluetun/issues/new
✨ New feature? https://github.com/qdm12/gluetun/issues/new
☕ Discussion? https://github.com/qdm12/gluetun/discussions/new
💻 Email? quentin.mcgaw@gmail.com
💰 Help me? https://www.paypal.me/qmcgaw https://github.com/sponsors/qdm12
2023-08-11T22:03:58+02:00 INFO [routing] default route found: interface eth0, gateway 172.27.0.1, assigned IP 172.27.0.2 and family v4
2023-08-11T22:03:58+02:00 INFO [routing] local ethernet link found: eth0
2023-08-11T22:03:58+02:00 INFO [routing] local ipnet found: 172.27.0.0/16
2023-08-11T22:03:58+02:00 INFO [firewall] enabling...
2023-08-11T22:03:58+02:00 INFO [firewall] enabled successfully
2023-08-11T22:03:59+02:00 INFO [storage] merging by most recent 17692 hardcoded servers and 17692 servers read from /gluetun/servers.json
2023-08-11T22:03:59+02:00 INFO Alpine version: 3.18.2
2023-08-11T22:03:59+02:00 INFO OpenVPN 2.5 version: 2.5.8
2023-08-11T22:03:59+02:00 INFO OpenVPN 2.6 version: 2.6.5
2023-08-11T22:03:59+02:00 INFO Unbound version: 1.17.1
2023-08-11T22:03:59+02:00 INFO IPtables version: v1.8.9
2023-08-11T22:03:59+02:00 INFO Settings summary:
├── VPN settings:
|   ├── VPN provider settings:
|   |   ├── Name: custom
|   |   └── Server selection settings:
|   |       ├── VPN type: wireguard
|   |       ├── Target IP address: [Retracted]
|   |       └── Wireguard selection settings:
|   |           ├── Endpoint IP address: [Retracted]
|   |           ├── Endpoint port: 51820
|   |           └── Server public key: [Retracted]
|   └── Wireguard settings:
|       ├── Private key: ING...Fo=
|       ├── Interface addresses:
|       |   └── 10.2.0.2/32
|       ├── Allowed IPs:
|       |   ├── 0.0.0.0/0
|       |   └── ::/0
|       └── Network interface: tun0
|           └── MTU: 1400
├── DNS settings:
|   ├── DNS server address to use: 127.0.0.1
|   ├── Keep existing nameserver(s): no
|   └── DNS over TLS settings:
|       └── Enabled: no
├── Firewall settings:
|   └── Enabled: yes
├── Log settings:
|   └── Log level: INFO
├── Health settings:
|   ├── Server listening address: 127.0.0.1:9999
|   ├── Target address: cloudflare.com:443
|   ├── Duration to wait after success: 5s
|   ├── Read header timeout: 100ms
|   ├── Read timeout: 500ms
|   └── VPN wait durations:
|       ├── Initial duration: 6s
|       └── Additional duration: 5s
├── Shadowsocks server settings:
|   └── Enabled: no
├── HTTP proxy settings:
|   └── Enabled: no
├── Control server settings:
|   ├── Listening address: :8000
|   └── Logging: yes
├── OS Alpine settings:
|   ├── Process UID: 1026
|   ├── Process GID: 100
|   └── Timezone: europe/paris
├── Public IP settings:
|   ├── Fetching: every 12h0m0s
|   └── IP file path: /tmp/gluetun/ip
└── Version settings:
    └── Enabled: yes
2023-08-11T22:03:59+02:00 INFO [routing] default route found: interface eth0, gateway 172.27.0.1, assigned IP 172.27.0.2 and family v4
2023-08-11T22:03:59+02:00 INFO [routing] adding route for 0.0.0.0/0
2023-08-11T22:03:59+02:00 INFO [firewall] setting allowed subnets...
2023-08-11T22:03:59+02:00 INFO [routing] default route found: interface eth0, gateway 172.27.0.1, assigned IP 172.27.0.2 and family v4
2023-08-11T22:03:59+02:00 INFO [dns over tls] using plaintext DNS at address 1.1.1.1
2023-08-11T22:03:59+02:00 INFO [http server] http server listening on [::]:8000
2023-08-11T22:03:59+02:00 INFO [firewall] allowing VPN connection...
2023-08-11T22:03:59+02:00 INFO [healthcheck] listening on 127.0.0.1:9999
2023-08-11T22:03:59+02:00 INFO [wireguard] Using userspace implementation since Kernel support does not exist
2023-08-11T22:03:59+02:00 INFO [wireguard] Connecting to [Retracted]:51820
2023-08-11T22:03:59+02:00 INFO [wireguard] Wireguard setup is complete. Note Wireguard is a silent protocol and it may or may not work, without giving any error message. Typically i/o timeout errors indicate the Wireguard connection is not working.
2023-08-11T22:04:00+02:00 INFO [vpn] You are running 1 commit behind the most recent latest
2023-08-11T22:04:00+02:00 INFO [ip getter] Public IP address is [Retracted]
2023-08-11T22:04:00+02:00 INFO [healthcheck] healthy!

Share your configuration

---
services:
  gluetun:
    # https://github.com/qdm12/gluetun
    image: ghcr.io/qdm12/gluetun:latest
    container_name: gluetun
    # line above must be uncommented to allow external containers to connect. See https://github.com/qdm12/gluetun/wiki/Connect-a-container-to-gluetun#external-container-to-gluetun
    restart: unless-stopped
    cap_add:
      - NET_ADMIN
    devices:
      - /dev/net/tun:/dev/net/tun
    volumes:
      - /volume1/docker/gluetun:/gluetun
    environment:
      # See https://github.com/qdm12/gluetun/wiki
      ## ProtonVPN Wireguard
      - VPN_SERVICE_PROVIDER=custom
      - VPN_TYPE=wireguard
      - VPN_ENDPOINT_IP=${VPNIP}
      - VPN_ENDPOINT_PORT=51820
      - WIREGUARD_PUBLIC_KEY=${PUBKEY}
      - WIREGUARD_PRIVATE_KEY=${PRIVKEY}
      - WIREGUARD_ADDRESSES=10.2.0.2/32
      # Timezone for accurate log times
      - TZ=Europe/Paris
      - PUID=1026
      - PGID=100
      # Server list updater. See https://github.com/qdm12/gluetun/wiki/Updating-Servers#periodic-update
      - UPDATER_PERIOD=
      - UPDATER_VPN_SERVICE_PROVIDERS=
      # If QBITTORRENT_SERVER address is not related to VPN_IF_NAME (default: tun0) you'll need to set the variable below
      # - FIREWALL_OUTBOUND_SUBNETS=172.16.0.0/24
    ports:
      # - 8888:8888/tcp # HTTP proxy
      # - 8388:8388/tcp # Shadowsocks
      # - 8388:8388/udp # Shadowsocks
      - 8080:8080/tcp # qBittorrent
    # networks:
    #   gluetun-network:
    #     ipv4_address: 172.16.0.10

  qbittorrent:
    # https://docs.linuxserver.io/images/docker-qbittorrent
    image: lscr.io/linuxserver/qbittorrent:latest
    container_name: qbittorrent
    restart: unless-stopped
    volumes:
      - /volume1/docker/config:/config
      - /volume1/torrents:/downloads
    environment:
      - TZ=Europe/Paris
      - PUID=1026
      - PGID=100
    network_mode: "service:gluetun"
    depends_on:
      gluetun:
        condition: service_healthy

  qbittorrent-natmap:
    # https://github.com/soxfor/qbittorrent-natmap
    image: ghcr.io/soxfor/qbittorrent-natmap:latest
    container_name: qbittorrent-natmap
    restart: unless-stopped
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    environment:
      - TZ=Europe/Paris
      - QBITTORRENT_SERVER=10.2.0.2
      - QBITTORRENT_PORT=8080
      - QBITTORRENT_USER=[Retracted]
      - QBITTORRENT_PASS=[Retracted]
      - VPN_GATEWAY=10.2.0.1
      # - VPN_CT_NAME=gluetun
      # - VPN_IF_NAME=tun0
      #- CHECK_INTERVAL=120
      # - NAT_LEASE_LIFETIME=240
    network_mode: "service:gluetun"
    depends_on:
      qbittorrent:
        condition: service_started
      gluetun:
        condition: service_healthy

#networks:
#  gluetun-network:
#    driver: bridge
#    ipam:
#      config:
#        - subnet: 172.16.0.0/24
#          gateway: 172.16.0.254

Cyph3r commented 1 year ago

Same issue for me. Gluetun is making my NAS go from 12% CPU to 90%, continuously.

qdm12 commented 1 year ago

If you're seeding using the userspace implementation, that's just the Wireguard code running (within the gluetun-entrypoint process).

If you want to dig further into what is using all this CPU, try https://github.com/qdm12/gluetun-wiki/blob/main/contributing/profiling.md. It's relatively easy to set up and fun to visualize, although note it would only show VPN profiling for Wireguard in userspace (the case for @OrpheeGT at least).

I'll keep the issue open for a few days, in case one of you wants to post a screenshot of CPU usage. I might even 'steal' it and put it in the wiki FAQ for other users 😉

Assuming this is Wireguard just going as fast as possible and you want to lower its cpu usage at the cost of reduced bandwidth, you can use cpulimit on the gluetun-entrypoint process from your host. Maybe you can do it with docker/docker-compose but as far as I know, you could only do it with kubernetes back then.
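For the Docker side of that suggestion, here is a minimal sketch, not something confirmed in this thread. It assumes the container is named gluetun (as in the compose file above) and uses `docker update --cpus`, which caps every process in that container and not only the Wireguard code. The helper name `limit_cmd` and the value 1.5 are hypothetical. The helper only prints the command so it can be reviewed before being run.

```shell
#!/bin/sh
# Sketch: build a command that caps the CPU available to a container.
# Assumption: the container is named "gluetun", as in the compose file above.
# The cap applies to the whole container, so VPN bandwidth will drop with it.

limit_cmd() {
    container="$1"   # container name, e.g. gluetun
    cpus="$2"        # number of CPUs the container may use, e.g. 1.5
    printf 'docker update --cpus %s %s\n' "$cpus" "$container"
}

# Print the command for review; run it by hand once it looks right.
limit_cmd gluetun 1.5

# Host-side alternative mentioned above (requires cpulimit on the host):
#   cpulimit -p <pid of gluetun-entrypoint> -l 50
```

Lowering the cap trades throughput for CPU, so it is worth trying a few values while watching the upload rate.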

Cyph3r commented 1 year ago

@qdm12 Let me first say thank you for your great work on this project. Since you asked for more info, here it is.

Per your request for a CPU screenshot: image

Here is Grafana showing the compose stack running containers for gluetun and qbittorrent, and then stopped for a comparison: image

In this example, I'm using ipvanish VPN and torrents are set to unlimited download speed (was going about 12MB/s) and 30KB/s capped upload. OpenVPN was selected, so I'm not sure Wireguard was even in use??

Here is the compose I used, note openvpn selected: image

Switching from gluetun/qbit to image: binhex/arch-qbittorrentvpn reduced CPU from 70-90% to 15-25%.

Please let me know if I did anything incorrectly.

OrpheeGT commented 1 year ago

Hello!

Thank you for your help and your answer!

Your message helped me understand the notion of a "userspace implementation" of Wireguard.

I'm running it on Synology, and I understood that the wireguard kernel module was missing. So I found the following Docker project: https://hub.docker.com/r/blackvoidclub/synobuild72?ref=blackvoid.club

I built this package and extracted wireguard.ko from it.

I loaded it (with insmod) and then (re)started the gluetun Docker container.

========================================
========================================
=============== gluetun ================
========================================
=========== Made with ❤️ by ============
======= https://github.com/qdm12 =======
========================================
========================================

Running version latest built on 2023-08-11T11:08:54.752Z (commit e556871)

🔧 Need help? https://github.com/qdm12/gluetun/discussions/new
🐛 Bug? https://github.com/qdm12/gluetun/issues/new
✨ New feature? https://github.com/qdm12/gluetun/issues/new
☕ Discussion? https://github.com/qdm12/gluetun/discussions/new
💻 Email? quentin.mcgaw@gmail.com
💰 Help me? https://www.paypal.me/qmcgaw https://github.com/sponsors/qdm12
2023-08-19T15:25:00+02:00 INFO [routing] default route found: interface eth0, gateway 172.18.0.1, assigned IP 172.18.0.2 and family v4
2023-08-19T15:25:00+02:00 INFO [routing] local ethernet link found: eth0
2023-08-19T15:25:00+02:00 INFO [routing] local ipnet found: 172.18.0.0/16
2023-08-19T15:25:01+02:00 INFO [firewall] enabling...
2023-08-19T15:25:01+02:00 INFO [firewall] enabled successfully
2023-08-19T15:25:02+02:00 INFO [storage] merging by most recent 17692 hardcoded servers and 17692 servers read from /gluetun/servers.json
2023-08-19T15:25:02+02:00 INFO Alpine version: 3.18.3
2023-08-19T15:25:02+02:00 INFO OpenVPN 2.5 version: 2.5.8
2023-08-19T15:25:03+02:00 INFO OpenVPN 2.6 version: 2.6.5
2023-08-19T15:25:03+02:00 INFO Unbound version: 1.17.1
2023-08-19T15:25:03+02:00 INFO IPtables version: v1.8.9
2023-08-19T15:25:03+02:00 INFO Settings summary:
├── VPN settings:
|   ├── VPN provider settings:
|   |   ├── Name: custom
|   |   └── Server selection settings:
|   |       ├── VPN type: wireguard
|   |       ├── Target IP address: [Retracted]
|   |       └── Wireguard selection settings:
|   |           ├── Endpoint IP address: [Retracted]
|   |           ├── Endpoint port: [Retracted]
|   |           └── Server public key: [Retracted]
|   └── Wireguard settings:
|       ├── Private key: ING...Fo=
|       ├── Interface addresses:
|       |   └── 10.2.0.2/32
|       ├── Allowed IPs:
|           └── MTU: 1400
├── DNS settings:
|   ├── Keep existing nameserver(s): no
|   ├── DNS server address to use: 127.0.0.1
|   └── DNS over TLS settings:
|       └── Enabled: no
├── Firewall settings:
|   └── Enabled: yes
├── Log settings:
|   └── Log level: INFO
├── Health settings:
|   ├── Server listening address: 127.0.0.1:9999
|   ├── Target address: quad9.net:443
|   ├── Duration to wait after success: 10m0s
|   ├── Read header timeout: 100ms
|   ├── Read timeout: 500ms
|   └── VPN wait durations:
|       ├── Initial duration: 2m0s
|       └── Additional duration: 1m0s
├── Shadowsocks server settings:
|   └── Enabled: no
├── HTTP proxy settings:
|   └── Enabled: no
├── Control server settings:
|   ├── Listening address: :8000
|   └── Logging: yes
├── OS Alpine settings:
|   ├── Process UID: 1000
|   ├── Process GID: 1000
|   └── Timezone: europe/paris
├── Public IP settings:
|   ├── Fetching: every 12h0m0s
|   └── IP file path: /tmp/gluetun/ip
└── Version settings:
    └── Enabled: yes
2023-08-19T15:25:03+02:00 INFO [routing] default route found: interface eth0, gateway 172.18.0.1, assigned IP 172.18.0.2 and family v4    
2023-08-19T15:25:03+02:00 INFO [routing] adding route for 0.0.0.0/0
2023-08-19T15:25:03+02:00 INFO [firewall] setting allowed subnets...
2023-08-19T15:25:03+02:00 INFO [routing] default route found: interface eth0, gateway 172.18.0.1, assigned IP 172.18.0.2 and family v4    
2023-08-19T15:25:03+02:00 INFO [dns] using plaintext DNS at address 1.1.1.1
2023-08-19T15:25:03+02:00 INFO [http server] http server listening on [::]:8000
2023-08-19T15:25:03+02:00 INFO [firewall] allowing VPN connection...
2023-08-19T15:25:03+02:00 INFO [healthcheck] listening on 127.0.0.1:9999
2023-08-19T15:25:03+02:00 INFO [wireguard] Using available kernelspace implementation
2023-08-19T15:25:03+02:00 INFO [wireguard] Connecting to [Retracted]:[Retracted]
2023-08-19T15:25:03+02:00 INFO [wireguard] Wireguard setup is complete. Note Wireguard is a silent protocol and it may or may not work, without giving any error message. Typically i/o timeout errors indicate the Wireguard connection is not working.
2023-08-19T15:25:08+02:00 INFO [healthcheck] healthy!
2023-08-19T15:25:08+02:00 INFO [vpn] You are running on the bleeding edge of latest!
2023-08-19T15:25:08+02:00 INFO [ip getter] Public IP address is [Retracted] (Switzerland, Zurich, Zürich)

Now I have "[wireguard] Using available kernelspace implementation".

Gluetun no longer shows high CPU usage; only qBittorrent does. It also fixed my biggest issue: https://github.com/soxfor/qbittorrent-natmap/issues/16

Now that I'm using the wireguard kernel module, I no longer have any network issues inside gluetun while using qBittorrent.
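To check which implementation a running container picked without reading the whole log, a small filter like the sketch below can help. It relies only on the two log messages quoted in this thread, so it may need adjusting if other Gluetun versions word them differently. The function name `wg_impl` is hypothetical.

```shell
#!/bin/sh
# Sketch: report which Wireguard implementation gluetun selected.
# Reads log text on stdin and matches the two messages quoted in this thread.
# If both appear (e.g. after a restart), the last one seen wins.
wg_impl() {
    awk '
        /\[wireguard\] Using available kernelspace implementation/ { impl = "kernelspace" }
        /\[wireguard\] Using userspace implementation/             { impl = "userspace" }
        END { if (impl == "") impl = "unknown"; print impl }
    '
}

# Demo with a line copied from the log above.
printf '%s\n' '2023-08-19T15:25:03+02:00 INFO [wireguard] Using available kernelspace implementation' | wg_impl
```

Against a live container (assuming it is named gluetun), the same function can be fed with `docker logs gluetun 2>&1 | wg_impl`.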

Cyph3r commented 1 year ago

@OrpheeGT wow, this sounds interesting. Good work on this! I'm also on Synology, so your fix would be most appreciated. I have no idea how to do what you did. Do you mind providing your wireguard.ko/insmod files and instructions, please? Tyvm

OrpheeGT commented 1 year ago

Hello @Cyph3r. As said above, I built the Wireguard package for Synology using the docker command from blackvoidclub.

Since my NAS uses the broadwellnk CPU architecture, I followed the instructions from the official Docker Hub page:

docker run --rm --privileged --env PACKAGE_ARCH=broadwellnk --env DSM_VER=7.2 -v /root/synowirespk72:/result_spk blackvoidclub/synobuild72

It created an SPK package for Synology: WireGuard-broadwellnk-1.0.20220627.spk

For my own use, I just opened it with 7-Zip, found the wireguard.ko module file inside, and copied it to my NAS.

Then I created a scheduled task in the Synology GUI to run the following script as root:

#!/bin/sh

# Create the /dev/net/tun character device if it does not exist yet
if [ ! -c /dev/net/tun ]; then
    if [ ! -d /dev/net ]; then
        mkdir -m 755 /dev/net
    fi
    mknod /dev/net/tun c 10 200
fi

# Load the tun module if not already loaded
if ! lsmod | grep -q '^tun[[:space:]]'; then
    insmod /lib/modules/tun.ko
fi

# Load the wireguard module if not already loaded
# (this is the wireguard.ko extracted from the SPK; adjust the path to where you copied it)
if ! lsmod | grep -q '^wireguard[[:space:]]'; then
    insmod /var/services/homes/user/wireguard.ko
fi

But you may not need to build the SPK yourself. Just take it from: https://www.blackvoid.club/wireguard-spk-for-your-synology-nas/

Take the one matching your CPU and DSM version.

Cyph3r commented 1 year ago

@OrpheeGT Thanks for the writeup!

rexpark commented 1 year ago

Update: Everything is working great after applying the fix and rebooting.

miiraheart commented 1 year ago

Hello, I am facing the same issue on Ubuntu 22.04 with Gluetun, qBittorrent and ProtonVPN.

Cyph3r commented 1 year ago

@OrpheeGT ty so much!

qdm12 commented 1 year ago

@Cyph3r

Per your request for a CPU screenshot

Thanks for all the screenshots, but that wasn't the request 😉 The request was to run the profiling while using Wireguard in userspace (you can also set WIREGUARD_IMPLEMENTATION=userspace to force userspace), to see where the CPU usage goes internally within the Gluetun program, as described here.

Switching from gluetun/qbit to image: binhex/arch-qbittorrentvpn reduced CPU from 70-90% to 15-25%.

Maybe because gluetun was using OpenVPN and binhex/arch-qbittorrentvpn was using Wireguard? 🤔

rakuri255 commented 1 year ago

Had the same high CPU usage issue on my Synology NAS. It starts automatically in userspace mode.

Thanks to @OrpheeGT for the hint with the kernel implementation! Now it works how it should.

But why is the Userspace mode so demanding?

pduchnovsky commented 9 months ago

@OrpheeGT Thanks for your suggestion of extracting just the .ko file from the SPK. I was hesitant to install the entire package since I love simplicity, and this helped me finally get from the userspace to the kernelspace wireguard implementation :)

OrpheeGT commented 6 months ago

Hello!

@qdm12 is this what you needed? profile_cpu_load_userspace.pb.gz profile_cpu_userspace.pb.gz profile_heap_memory_userspace.pb.gz

pprof_heap_memory_userspace

pprof.gluetun-entrypoint.alloc_objects.alloc_space.inuse_objects.inuse_space.001.pb.gz pprof.gluetun-entrypoint.samples.cpu.001.pb.gz pprof.gluetun-entrypoint.samples.cpu.002.pb.gz
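For anyone wanting to read profiles like these locally, a minimal sketch follows. It assumes the Go toolchain is installed on the machine where the .pb.gz file was downloaded, and it uses `go tool pprof -top`, which prints the functions with the highest sample counts as text. The wrapper name `pprof_top` is hypothetical and only adds two sanity checks.

```shell
#!/bin/sh
# Sketch: print the hottest functions from a downloaded pprof profile.
# Assumption: the Go toolchain is installed locally.
pprof_top() {
    profile="$1"
    if [ ! -f "$profile" ]; then
        echo "profile not found: $profile" >&2
        return 1
    fi
    if ! command -v go >/dev/null 2>&1; then
        echo "go toolchain not found in PATH" >&2
        return 2
    fi
    # -top prints a text table sorted by sample count
    go tool pprof -top "$profile"
}

# Example (file name taken from the attachments above):
#   pprof_top profile_cpu_userspace.pb.gz
```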