DyonR / docker-Jackettvpn

Docker container which runs a headless Jackett Server with an optional WireGuard or OpenVPN connection
https://hub.docker.com/r/dyonr/jackettvpn/
GNU General Public License v3.0
70 stars 31 forks source link

VPN idle disconnect #25

Closed evakq8r closed 4 years ago

evakq8r commented 4 years ago

Similar issue to what I raised in https://github.com/DyonR/docker-Jackettvpn/issues/9, but different environment (Unraid). Started a new issue only because the architecture I'm now using is different, but the issues are the same.

Environment: Server: Unraid 6.8.3 2020-03-05 JackettVPN version: 0.16.912.0

Identical symptoms:

Wed Jul 29 12:55:22 2020 [] Inactivity timeout (--ping-restart), restarting Wed Jul 29 12:55:22 2020 SIGUSR1[soft,ping-restart] received, process restarting Wed Jul 29 12:55:27 2020 TCP/UDP: Preserving recently used remote address: [AF_INET] Wed Jul 29 12:55:27 2020 UDP link local: (not bound) Wed Jul 29 12:55:27 2020 UDP link remote: [AF_INET] Wed Jul 29 12:55:28 2020 [8ccb8d30368dc2172c55cd565d641d80] Peer Connection Initiated with [AF_INET] Wed Jul 29 12:55:29 2020 AUTH: Received control message: AUTH_FAILED Wed Jul 29 12:55:29 2020 /sbin/ip addr del dev tun0 local 10.1.11.10 peer 10.1.11.9 Wed Jul 29 12:55:29 2020 SIGTERM[soft,auth-failure] received, process exiting

The Unraid container already has --restart=unless-stopped set when it was pulled from the Unraid app store, however it has not restarted the container (still running, but dead as the VPN tunnel dropped).

I had noticed that my time logs were defaulting to UTC, so I've added the TZ variable and that has since corrected the times, so will keep an eye on it.

@DyonR, has this issue cropped up again in the newest version of Jackett?

DyonR commented 4 years ago

It seems like there might be a problem with the healthcheck script. In the container I took eth0 offline, to kill the internet, and it doesn't restart automatically. Will investigate futher.

DyonR commented 4 years ago

It seems like Docker does not react to the exit 1 in the healthcheck.sh-script.
I will move it into the start.sh script with a while true loop with a default 'sleep' of 5 minutes, which could be changed with an environment variable also.
HEALTH_CHECK_INTERVAL would default to a sleep of 300 seconds if not defined
I'll probably push this within 30 minutes

evakq8r commented 4 years ago

@DyonR, Jackett has updated (0.16.916.0), but has introduced a new bug I think. Previous versions never failed TLS handshaking, only this one has.

Since updating, TLS handshakes now fail at least once every start, but will renegotiate within 1-2 seconds of the failure:

Excerpt of run log:

Thu Jul 30 09:48:44 2020 OpenVPN 2.4.4 x86_64-pc-linux-gnu [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on May 14 2019
Thu Jul 30 09:48:44 2020 library versions: OpenSSL 1.1.1 11 Sep 2018, LZO 2.08
Thu Jul 30 09:48:44 2020 TCP/UDP: Preserving recently used remote address: [AF_INET]<endpoint IP>
Thu Jul 30 09:48:44 2020 UDP link local: (not bound)
Thu Jul 30 09:48:44 2020 UDP link remote: [AF_INET]<endpoint IP>
Thu Jul 30 09:49:44 2020 TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)

Thu Jul 30 09:49:44 2020 TLS Error: TLS handshake failed

Thu Jul 30 09:49:44 2020 SIGUSR1[soft,tls-error] received, process restarting

Thu Jul 30 09:49:44 2020 TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)

Thu Jul 30 09:49:44 2020 TLS Error: TLS handshake failed

Thu Jul 30 09:49:44 2020 SIGUSR1[soft,tls-error] received, process restarting

Thu Jul 30 09:49:49 2020 TCP/UDP: Preserving recently used remote address: [AF_INET]<endpoint IP>
Thu Jul 30 09:49:49 2020 UDP link local: (not bound)
Thu Jul 30 09:49:49 2020 UDP link remote: [AF_INET]<endpoint IP>
Thu Jul 30 09:49:49 2020 WARNING: this configuration may cache passwords in memory -- use the auth-nocache option to prevent this

Thu Jul 30 09:49:50 2020 [077bd04b63ab5ae41d0793e6b877a56e] Peer Connection Initiated with [AF_INET]<endpoint IP>
Thu Jul 30 09:49:50 2020 [077bd04b63ab5ae41d0793e6b877a56e] Peer Connection Initiated with [AF_INET]<endpoint IP>
Thu Jul 30 09:49:51 2020 TUN/TAP device tun0 opened
Thu Jul 30 09:49:51 2020 do_ifconfig, tt->did_ifconfig_ipv6_setup=0
Thu Jul 30 09:49:51 2020 /sbin/ip link set dev tun0 up mtu 1500
Thu Jul 30 09:49:51 2020 /sbin/ip addr add dev tun0 local 10.1.11.10 peer 10.1.11.9
Thu Jul 30 09:49:51 2020 Initialization Sequence Completed
2020-07-30 09:49:52.217478 [info] WebUI port defined as
2020-07-30 09:49:52.244260 [info] Docker network defined as 172.18.0.0/16
2020-07-30 09:49:52.264333 [info] Adding 192.168.1.0/24 as route via docker eth0
2020-07-30 09:49:52.282892 [info] ip route defined as follows...

I've restarted the server entirely and made sure DNS isn't the issue, no other containers complain about negotiation failures (I use a few with OpenVPN built in). I've also restarted this container several times and can reproduce the failure every time.

Once it renegotiates for a second time however, Jackett seems to function fine.

DyonR commented 4 years ago

Do you use a commercial VPN provider like Windscribe, NordVPN, PIA, etc? Or do you use a private VPN provider that you host yourself?
If it is an commercial provider, try exporting a new OpenVPN profile and put that in your openvpn config folder of jackettvpn If the handshake fails that could also mean that the VPN Provider doesn't respond to your request, or that your firewall is blocking it.
See also this FAQ on the OpenVPN Website
Edit: Can you connect to the exact same provider fine from another device? Like your own PC?

evakq8r commented 4 years ago

I use PIA for all containers that run OpenVPN (6 in total).

Yes, I can connect using my PC as well with existing config files (Windows PC). The config files are not all the same (6 different endpoints) and they all seem to connect just fine using Windows. It's unlikely to be firewall otherwise other containers would be experiencing the same problems.

evakq8r commented 4 years ago

Selected a completely different endpoint that I haven't used before, identical symptoms:

Thu Jul 30 10:10:33 2020 library versions: OpenSSL 1.1.1 11 Sep 2018, LZO 2.08
Thu Jul 30 10:10:33 2020 TCP/UDP: Preserving recently used remote address: [AF_INET]<endpoint IP>
Thu Jul 30 10:10:33 2020 UDP link local: (not bound)
Thu Jul 30 10:10:33 2020 UDP link remote: [AF_INET]<endpoint IP>
Thu Jul 30 10:11:34 2020 TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)

Thu Jul 30 10:11:34 2020 TLS Error: TLS handshake failed

Thu Jul 30 10:11:34 2020 SIGUSR1[soft,tls-error] received, process restarting

Thu Jul 30 10:11:34 2020 TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)

Thu Jul 30 10:11:34 2020 TLS Error: TLS handshake failed

Thu Jul 30 10:11:34 2020 SIGUSR1[soft,tls-error] received, process restarting

Thu Jul 30 10:11:39 2020 TCP/UDP: Preserving recently used remote address: [AF_INET]<endpoint IP>
Thu Jul 30 10:11:39 2020 UDP link local: (not bound)
Thu Jul 30 10:11:39 2020 UDP link remote: [AF_INET]<endpoint IP>
Thu Jul 30 10:11:39 2020 WARNING: this configuration may cache passwords in memory -- use the auth-nocache option to prevent this

Thu Jul 30 10:11:40 2020 [0d8caca97c25dc3637cd422cf1b047bb] Peer Connection Initiated with [AF_INET]<endpoint IP>
Thu Jul 30 10:11:40 2020 [0d8caca97c25dc3637cd422cf1b047bb] Peer Connection Initiated with [AF_INET]<endpoint IP>
Thu Jul 30 10:11:41 2020 TUN/TAP device tun0 opened
Thu Jul 30 10:11:41 2020 do_ifconfig, tt->did_ifconfig_ipv6_setup=0
Thu Jul 30 10:11:41 2020 /sbin/ip link set dev tun0 up mtu 1500
Thu Jul 30 10:11:41 2020 /sbin/ip addr add dev tun0 local 10.59.10.6 peer 10.59.10.5
Thu Jul 30 10:11:41 2020 Initialization Sequence Completed
2020-07-30 10:11:42.521208 [info] WebUI port defined as
2020-07-30 10:11:42.549441 [info] Docker network defined as 172.18.0.0/16
2020-07-30 10:11:42.573634 [info] Adding 192.168.1.0/24 as route via docker eth0
2020-07-30 10:11:42.596446 [info] ip route defined as follows...

As a test, I used the same newly generated config file in another container (binhex-rtorrentvpn) and it authenticated immediately.

Thu Jul 30 10:15:31 2020 OpenVPN 2.4.9 [git:makepkg/9b0dafca6c50b8bb+] x86_64-pc-linux-gnu [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Apr 20 2020
Thu Jul 30 10:15:31 2020 library versions: OpenSSL 1.1.1g 21 Apr 2020, LZO 2.10

2020-07-30 10:15:31,562 DEBG 'start-script' stdout output:
[info] OpenVPN started

2020-07-30 10:15:31,562 DEBG 'start-script' stdout output:
Thu Jul 30 10:15:31 2020 NOTE: the current --script-security setting may allow this configuration to call user-defined scripts

2020-07-30 10:15:31,563 DEBG 'start-script' stdout output:
Thu Jul 30 10:15:31 2020 CRL: loaded 1 CRLs from file [[INLINE]]

2020-07-30 10:15:31,563 DEBG 'start-script' stdout output:
Thu Jul 30 10:15:31 2020 TCP/UDP: Preserving recently used remote address: [AF_INET]<endpoint IP>
Thu Jul 30 10:15:31 2020 UDP link local: (not bound)
Thu Jul 30 10:15:31 2020 UDP link remote: [AF_INET]<endpoint IP>

2020-07-30 10:15:32,897 DEBG 'start-script' stdout output:
Thu Jul 30 10:15:32 2020 [2994d337175e81bf2d5cef92af435a3a] Peer Connection Initiated with [AF_INET]<endpoint IP>

2020-07-30 10:15:34,381 DEBG 'start-script' stdout output:
Thu Jul 30 10:15:34 2020 TUN/TAP device tun0 opened
Thu Jul 30 10:15:34 2020 /usr/bin/ip link set dev tun0 up mtu 1500

2020-07-30 10:15:34,382 DEBG 'start-script' stdout output:
Thu Jul 30 10:15:34 2020 /usr/bin/ip addr add dev tun0 local 10.9.10.6 peer 10.9.10.5

2020-07-30 10:15:34,382 DEBG 'start-script' stdout output:
Thu Jul 30 10:15:34 2020 /root/openvpnup.sh tun0 1500 1570 10.9.10.6 10.9.10.5 init

2020-07-30 10:15:34,387 DEBG 'start-script' stdout output:
Thu Jul 30 10:15:34 2020 Initialization Sequence Completed
DyonR commented 4 years ago

Honestly, this is quite hard for me to troubleshoot.
I changed nothing related to OpenVPN, Jackett or the iptables that could change the function of connecting to an OpenVPN server since the last days.
I would love to help you out, but I hope you also understand it's hard for me to troubleshoot something that I cannot reproduce. The error log of OpenVPN is also not that extensive to see what part exactly fails. If I can think of something I'll let you know asap though.
I also pushed a new update of Jackettvpn, you could try updating to see if maybe one of the updated packages changed something perhaps.

evakq8r commented 4 years ago

No problems, I understand. I wasn't complaining per se, more making an observation.

Ultimately it won't really affect how JackettVPN works, as long as the renegotiation attempts are successful on the second attempt. Real world means I wait another 60 seconds before I can get into JackettVPN... I think I can handle that. :)

I've forced pull an update in the docker container (version seems to still end in .916 but there were additional downloads completed), and wouldn't you know it, the issue is no longer present:

Thu Jul 30 11:03:03 2020 OpenVPN 2.4.4 x86_64-pc-linux-gnu [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on May 14 2019
Thu Jul 30 11:03:03 2020 library versions: OpenSSL 1.1.1 11 Sep 2018, LZO 2.08
Thu Jul 30 11:03:03 2020 TCP/UDP: Preserving recently used remote address: [AF_INET]<endpoint IP>
Thu Jul 30 11:03:03 2020 UDP link local: (not bound)
Thu Jul 30 11:03:03 2020 UDP link remote: [AF_INET]<endpoint IP>
Thu Jul 30 11:03:03 2020 WARNING: this configuration may cache passwords in memory -- use the auth-nocache option to prevent this

Thu Jul 30 11:03:04 2020 [ebcbba25d98bc619f55d22e5d6edeb54] Peer Connection Initiated with [AF_INET]<endpoint IP>
Thu Jul 30 11:03:06 2020 TUN/TAP device tun0 opened
Thu Jul 30 11:03:06 2020 do_ifconfig, tt->did_ifconfig_ipv6_setup=0
Thu Jul 30 11:03:06 2020 /sbin/ip link set dev tun0 up mtu 1500
Thu Jul 30 11:03:06 2020 /sbin/ip addr add dev tun0 local 10.1.11.6 peer 10.1.11.5
Thu Jul 30 11:03:06 2020 Initialization Sequence Completed

This is surprising as I restarted the other container at least 20 times and it reproduced the error every single time. I'm confused as hell, but it is what it is.

I appreciate the help provided @DyonR. If the original issue re-presents itself, I'll raise a new job and relate back to this one.

DyonR commented 4 years ago

You didn't sound like complaining, no worries! Was just frustrated that I couldn't help you well 😞
The update wasn't because Jackett updated, but because I added something to the container 😉
The update I pushed is to make it possible to set a Web UI password via an environment variable (WEBUI_PASSWORD) 😄
This was possible via Jackett's Web UI already, but now also in a different way. Nevertheless, really weird that it works on the first try now! Great to hear 👍