Open rakbladsvalsen opened 3 years ago
Hey there! Thanks for the detailed issue!
It is a well-known Docker problem I need to work around. Let's keep this open for now, although there is at least one duplicate issue about this problem somewhere in the issues.
Note this only happens if gluetun is updated and uses a different image (afaik).
For now, you might want to have all your gluetun and connected containers in a single docker-compose.yml and docker-compose down && docker-compose up -d them (what I do).
I'm developing https://github.com/qdm12/deunhealth and should add a feature tailored for this problem soon (give it 1-5 days), feel free to subscribe to releases on that side repo. That way it would watch your containers and restart your connected containers if gluetun gets updated & restarted.
Thank you for the answer @qdm12.
It does indeed seem to be a Docker problem, just as you said, and unfortunately they seem a bit reluctant to discuss possible solutions for the issue. :(
For the time being, there's a temporary ugly, brutal, but 100% working fix. Maybe it would be worth mentioning it in the wiki/docker-compose.yml example? Although there are some gotchas, since it completely replaces the original healthcheck command, and some images don't include either curl or wget. Currently I'm probing example.com every minute on child containers attached to gluetun's network stack and so far so good.
I just subscribed to deunhealth, seems promising and probably even better than things like autoheal due to the network fix thing. I'll make sure to check it out in a week (or earlier, as you deem appropriate) and provide feedback/do some testing.
Similar conversation in #504 to be concluded.
I have the same thing: when I restart Gluetun, the containers within the same network_mode won't start. The only difference is that I configured it with: network_mode: 'container:VPN'.
I think when I restart or recreate the Gluetun container it gets a different ID.
What would be the solution to this problem?
Stumbled across this issue while researching ways to restart dependent containers once gluetun is recreated with a new image (via Watchtower). https://github.com/qdm12/deunhealth seems like it might work, but I wanted to make sure I understand the use case.
If I have a number of services with: network_mode: container:gluetun
However, when the gluetun container restarts, the dependent containers don't actually end up getting marked unhealthy, they just lose connectivity.
I'm wondering if you've updated deunhealth yet to include this function.
No sorry, but I'll get to it soon.
Ideally, there is a way to re-attach the disconnected containers to gluetun without restarting them (I guess with Docker's Go API, since I doubt the docker cli supports such a thing). That would work by marking each connected container with a label to indicate this network re-attachment.
If there isn't, I'll set up something to cascade the restart from gluetun to connected containers, probably using labels to avoid any surprise (mark gluetun as a parent container with a unique id, and mark all connected containers as child containers with that same id).
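The label-based cascade described above could look roughly like the sketch below. The label key gluetun.parent and the id passed to the function are hypothetical names picked for illustration; neither gluetun nor deunhealth defines them. It assumes the docker CLI is on the PATH and can reach the Docker socket, and it only covers a restarted gluetun: if gluetun was recreated with a new ID, a plain restart of the children may not be enough.

```shell
#!/bin/sh
# Sketch: restart every running container labelled as a child of a given parent id.
# "gluetun.parent" is a hypothetical label key, not something gluetun or deunhealth defines.
restart_children() {
  parent_id="$1"
  # List IDs of running containers carrying the label gluetun.parent=<parent_id>.
  children=$(docker ps -q --filter "label=gluetun.parent=${parent_id}")
  if [ -z "$children" ]; then
    echo "no child containers found for ${parent_id}"
    return 0
  fi
  # Unquoted on purpose so each ID becomes its own argument.
  docker restart $children
}
```

Calling restart_children vpn1 once gluetun is back up would restart the containers that were created with the label gluetun.parent=vpn1.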
For the time being, if anyone wants a dirty, cheap solution, here's my current setup:
autoheal:
  ... snip ...
literallyanything:
  image: blahblah
  container_name: blahblah
  network_mode: service:gluetun
  restart: unless-stopped
  healthcheck:
    test: "curl -sf https://example.com || exit 1"
    interval: 1m
    timeout: 10s
    retries: 1
This will only work with containers where curl is already preinstalled. There are docker images that include wget but not curl, in which case you can replace test command with wget --no-verbose --tries=1 --spider https://example.com/ || exit 1. You can also use qdm12's deunhealth instead of autoheal.
Any progress or resolution to this, either in gluetun or deunhealth?
I have bits and pieces for it, but I am moving country + visiting family + starting a new job right now, so it might take at least 2 weeks for me to finish it up, sorry about that. But it's at the top of my OSS things-to-do list, so it won't be forgotten :wink:
I'd also like to thank you for creating gluetun and to say this is a very good project. Any progress on this?
Any update on this by any chance?
I'm not really sure. I turned off the Watchtower container and since then my setup worked flawlessly. It's a workaround, but it's all I know so far.
Any news or progress on this issue?
following
Since I also have this problem, I would like to report it here and find out if and how it continues. Thank you!
Having the same behavior. When gluetun is recreated, every other container in the same network_mode needs to also restart
Stupid idea to solve this:
Gluetun is given the docker socket, and a list of containers to restart once it comes back up?
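A variant of this idea that keeps the Docker socket out of gluetun is a small watcher running next to it. The sketch below is an assumption-laden illustration, not an existing gluetun feature: the function name and the container names passed to it are made up, and it reacts to the container "start" event, which can fire before the VPN tunnel is actually up.

```shell
#!/bin/sh
# Sketch: restart a fixed list of containers each time the VPN container starts.
# Container names are supplied by the caller; nothing here is part of gluetun itself.
watch_and_restart() {
  vpn_name="$1"
  shift
  # "docker events" blocks and prints one line per matching event.
  docker events --filter "container=${vpn_name}" --filter "event=start" |
    while read -r _event; do
      echo "${vpn_name} started, restarting: $*"
      # stdin is redirected so the restart cannot consume the event stream.
      docker restart "$@" < /dev/null
    done
}
```

For example, watch_and_restart gluetun app1 app2 would restart app1 and app2 whenever the container named gluetun emits a start event.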
I do not think Docker is going to solve this bug. This bug has existed effectively forever.
Welp, had to prove myself wrong: https://github.com/docker/compose/pull/10284
This solution has worked well for me: https://github.com/qdm12/gluetun/issues/641#issuecomment-933856220. I've not had any issues with container connectivity after adding that. Seems like an all right fix to me.
Following, I have this issue. Thank you for the awesome service.
@ismay The workaround wasn't bad, but doesn't work for distroless containers which don't have curl or wget or really anything to do a healthcheck from the inside
Ah yeah that's true
Is there a chance this issue will be ever resolved? Thanks for providing gluetun, happy user for a while.
The "network_mode: service:gluetun" statement is incredibly different from a normal "networks" statement.
People's hopes and some's expectations here are a bit overblown (and the "bug" moniker likely doesn't help) ... if you use a container as a network stack component (router with layer features) ... which you are with that configuration ... and you restart that container its state, and more importantly infrastructure, is gone. Your application isn't going to recover.
The author may be able to provide runtime resilience in the container or even some high availability (see other PR where people want to have multiple gluetuns concurrently), but if wireguard or openvpn were to fall down all he can do is try and reestablish. All the states, port forwards, etc are all gone and will need to be reconstructed.
For example, complaining about connection refused while a router is offline (in this case it's ... off) ... yeah, not only is that vpn not running anymore, the whole stack (eg, network service) is not running. Let's be realistic.
With "network_mode" ... your container's networking is using another container for its networking ... that also provides additional network services (eg, vpn, proxies, dns, etc).
This gimmick of using the docker feature for easily aggregating containers into that network stack ... great, but this is the downside ... they are all aggregated into that stack.
If you filed "bugs" with docker such as "network_mode services should retain connections and state while service container offline and ..." it's not feasible ... in a single container. And docker plugins (an entirely different thing) have gone into different roadmaps.
Some events, like a vpn falling down, not causing the container to fully restart (and then all the aggregated containers) ... that is likely a resolvable thing.
This is a bug with docker and how docker handles network_mode:container. Until they fix that bug, we're basically stuck.
If anybody wants to give it a try, I have written cascandaliato/docker-restarter. Right now it covers only one scenario: if A depends on B and B restarts then restart A.
Gluetun updated twice in a few minutes, so I had to restart the whole stack depending on it, twice.
@cascandaliato I will try for sure.
That works a treat. Thank you.
I did find I had to put it outside of the stack. Inside the stack, your container would attempt to restart itself but actually stop, and not restart the other containers. Possibly an issue on my side, though.
@Blavkentropy1, I've opened cascandaliato/docker-restarter#2 to keep track of that problem but I'll need your help to understand what's happening. Whenever you have time, we can continue the discussion in the other issue because I don't want to add noise to this one.
Also having this issue sadly!
Drove myself mad thinking it was a problem with defining FIREWALL_OUTBOUND_SUBNETS.
Ran into the same issue. I have a container monitoring gluetun for IP leaks or DNS failures, as sometimes it locks up, leaks IP, etc.
If something is amiss, it restarts gluetun, then all its child containers with:
docker restart $(docker ps -q --filter "label=com.docker.compose.depends_on=gluetun:service_started:true")
You can get the docker compose related labels from docker inspect <container-name>
This can easily be solved using a native healthcheck in docker. There's no need for a third-party application to monitor the health of your containers. For example, if gluetun is restarted, dependent containers will lose network connectivity. To get around this, your healthcheck can periodically monitor external connectivity, then kill the main pid (1) if there is no connectivity, thus killing the container. If you use a restart: always configuration, docker will then restart that container, which reconnects it to gluetun. This will happen in an infinite loop until connectivity is re-established to google.com in the below example. Here's a sample docker-compose configuration:
version: "3"
services:
  mycontainer:
    image: namespace/myimage
    container_name: mycontainer
    restart: always
    healthcheck:
      test: "curl -sfI -o /dev/null --connect-timeout 10 --retry 3 --retry-delay 10 --retry-all-errors https://www.google.com/robots.txt || kill 1"
      interval: 1m
      timeout: 1m
    network_mode: "service:gluetun"
    depends_on:
      - gluetun
  gluetun:
    image: qmcgaw/gluetun
    container_name: gluetun
    cap_add:
      - NET_ADMIN
    devices:
      - /dev/net/tun:/dev/net/tun
    ports:
      - 8888:8888/tcp # HTTP proxy
      - 8388:8388/tcp # Shadowsocks
      - 8388:8388/udp # Shadowsocks
      - 8080:8080/tcp # gluetun
    volumes:
      - ${INSTALL_DIRECTORY}/config/gluetun:/config
    environment:
      - VPN_ENDPOINT_IP=${VPN_ENDPOINT_IP}
      - VPN_ENDPOINT_PORT=${VPN_ENDPOINT_PORT}
      - VPN_SERVICE_PROVIDER=${VPN_SERVICE}
      - VPN_TYPE=wireguard
      - WIREGUARD_PUBLIC_KEY=${WIREGUARD_PUBLIC_KEY}
      - WIREGUARD_PRIVATE_KEY=${WIREGUARD_PRIVATE_KEY}
      - WIREGUARD_ADDRESSES=${WIREGUARD_ADDRESSES}
      - DNS_ADDRESS=${DNS_ADDRESS}
      - UPDATER_PERIOD=12h
    restart: always
:warning: Note: This assumes you have curl available in the running container dependent on gluetun
Not all images support curl. xTeVe for example. You can use wget as a healthcheck also.
healthcheck: # https://github.com/qdm12/gluetun/issues/641#issuecomment-933856220
  # must use wget for the healthcheck as this image does not have curl
  test: "wget --no-verbose --tries=1 --spider http://ipinfo.io/ip || exit 1"
  interval: 1m
  timeout: 10s
  retries: 1
Welp, had to prove myself wrong: docker/compose#10284
If I understand this feature correctly, Docker won't restart the dependent container when gluetun becomes unhealthy, only when gluetun is restarted by a compose operation.
https://docs.docker.com/compose/compose-file/05-services/#long-syntax-1
This seems to still be a problem with no 100% satisfying solution. I also have the problem that when I restart the server, I have to manually docker compose up the gluetun stack because otherwise the other services never launch, with the error cannot join network of a non running container.
@qdm12 is this something you are planning to have a satisfying solution for from your side, or should we be looking for solutions elsewhere? Docker Compose have pretty much said it's not their problem, and directed us towards Moby, but I will admit I don't really have enough of an understanding of all the moving parts here to be able to tell exactly what Moby would actually need to add/fix to make this work.
Let me know if there's something I can do to help or an issue I can upvote, but I don't really want to spend time reading about all the moving parts around docker to understand exactly who needs to do what if I don't have to.
Thanks for gluetun, other than this one inconvenience it has been excellent for the last couple months.
I'm using this guy https://github.com/cascandaliato/docker-restarter and it has been great for me!
The health check workaround works flawlessly for me.
I built myself a systemd service which runs 30 seconds after the docker service has started and starts all containers with the cannot join network of a non running container error message: https://github.com/ioqy/docker-start-failed-gluetun-containers
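The general idea can be sketched in a few lines of shell. This is not the code from the linked repository, only an illustration under assumptions: the function name is made up, and it assumes the failed containers sit in the exited or created state with the error text recorded in .State.Error.

```shell
#!/bin/sh
# Sketch: start stopped containers whose last error says the network container was not running.
start_failed_children() {
  needle="cannot join network of a non running container"
  # Multiple status filters are ORed, so this lists exited and created containers.
  for id in $(docker ps -aq --filter "status=exited" --filter "status=created"); do
    err=$(docker inspect --format '{{.State.Error}}' "$id")
    case "$err" in
      *"$needle"*)
        echo "starting $id"
        docker start "$id" >/dev/null
        ;;
    esac
  done
}
```

Run once after boot (for instance from a systemd unit or after a sleep 30, matching the delay mentioned above), it would only touch containers that failed for this specific reason.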
@ioqy This seems like a fundamentally better approach to me, thanks for the link.
I tried the native healthcheck workaround, but the containers didn't restart. Am I missing something? Do all the different services dependent on gluetun have to be in the same docker compose file for this to work?
I can confirm that curl is working in this container.
This is an example of one of my containers compose files using this "work around"
version: "2.1"
services:
flaresolverr:
# DockerHub mirror flaresolverr/flaresolverr:latest
image: ghcr.io/flaresolverr/flaresolverr:latest
container_name: flaresolverr
environment:
- LOG_LEVEL=${LOG_LEVEL:-info}
- LOG_HTML=${LOG_HTML:-false}
- CAPTCHA_SOLVER=${CAPTCHA_SOLVER:-none}
- TZ=America/Los_Angeles
network_mode: "container:gluetun"
healthcheck:
test: "curl -sfI -o /dev/null --connect-timeout 10 --retry 3 --retry-delay 10 --retry-all-errors https://www.google.com/robots.txt || kill 1"
interval: 1m
timeout: 1m
restart: always
You're missing some spaces in your healthcheck. Indenting must be exact. I had the same issue until I fixed those indents on the healthcheck.
Thanks for the help/feedback @vdrover. I ran all my containers through vs code as docker compose files and ran "format document". Hopefully that should fix it. I'll restart gluetun and see if that makes a difference.
@qdm12. I would recommend adding the results of this thread to the wiki. It seems like a pretty important issue that should be flagged to users when setting up this project.
I do not think it is a good idea to curl a public website every minute or so. This causes unnecessary traffic for you and for the website (even if they have the bandwidth, as Google surely does) - especially considering that there may be multiple containers connected to gluetun, all doing the same checks every minute. I solved this by simply checking the healthcheck address of gluetun itself.
0.0.0.0:9999 (Wiki)
nc -z localhost 9999 || kill 1 (it's a bit faster than curl)
Could you provide an example healthcheck?
To use the example from above:
version: "2.1"
services:
  flaresolverr:
    # DockerHub mirror flaresolverr/flaresolverr:latest
    image: ghcr.io/flaresolverr/flaresolverr:latest
    container_name: flaresolverr
    environment:
      - LOG_LEVEL=${LOG_LEVEL:-info}
      - LOG_HTML=${LOG_HTML:-false}
      - CAPTCHA_SOLVER=${CAPTCHA_SOLVER:-none}
      - TZ=America/Los_Angeles
    network_mode: "container:gluetun"
    healthcheck:
      test: "nc -z localhost 9999 || kill 1"
      interval: 1m
      timeout: 1m
    restart: always
This requires netcat (nc) to be present in the container. Since flaresolverr is connected with container:gluetun, the target server can be specified as localhost, and port 9999 is the one you have to configure with the environment variable of the gluetun container itself (HEALTH_SERVER_ADDRESS="0.0.0.0:9999"). But the healthcheck can be copied "as is" without modification to any container you connect to gluetun.
It can be tested from the Docker host by running the following command inside the container you have connected to gluetun:
sudo docker exec mycontainer nc -z -v localhost 9999
test: "nc -z localhost 9999 || kill 1"
@Babadabupi: excellent idea, thank you
@Babadabupi or someone else here. Can you please provide some guidance on how to add linux commands/tools so they are accessible via commands in Docker? curl works, but netcat isn't found and I can't seem to see how to get it accessible. I'm running Docker (and Portainer) on a Synology server running the latest DSM (7.2 I think). Thanks!
@begunfx: If you can't use the command inside the container, then nc isn't installed inside. You can add it (e.g. for a Debian-based container; if you have another Linux image, change bash and apt as necessary):
docker exec -it [containername] bash
apt-get update && apt-get install netcat
Just remember that if you update the container, the installed netcat will be gone. So you can write a bash script to do this, or build your own image with netcat installed by default.
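The bash script suggested here might look like the sketch below. It is only an illustration: the function name is made up, it assumes a Debian/Ubuntu-based image with apt-get and root privileges, and the package name is an assumption (recent Debian releases typically ship netcat as netcat-openbsd, older ones as netcat).

```shell
#!/bin/sh
# Sketch: install a tool at container start only if it is missing (Debian/Ubuntu-based images).
ensure_tool() {
  tool="$1"
  pkg="$2"
  # command -v succeeds when the tool is already on the PATH.
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool already present"
    return 0
  fi
  apt-get update && apt-get install -y "$pkg"
}
```

A call such as ensure_tool nc netcat-openbsd at the top of a startup script would then be a no-op once nc exists, and would reinstall it after the container has been recreated from a fresh image.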
Thanks for the response, insight and suggestions @cybermcm. I need to have netcat available to all my containers that connect to gluetun permanently. So what would you recommend for this? Is there a way to have a docker compose file execute a bash script? I found this link that talks about it a bit, but I'm not too clear if this is the best way to go: https://stackoverflow.com/questions/57840820/run-a-shell-script-from-docker-compose-command-inside-the-container
Or is there a way I can just install a container that has netcat in it and have other docker containers use it to run netcat? I am able to install Ubuntu as a docker container, but not clear on how to share resources. I did set the Ubuntu container to use the Gluetun container as its network so all my containers that need access to it are in the same network - from what I understand (but this didn't work).
Update: I did find the following docker compose command that seems to do the trick to execute a shell script:
command: /bin/bash -c "init_project.sh"
I found it at this post: https://forums.docker.com/t/cant-use-command-in-compose-yaml/127427
Something like that?
Is this urgent?: No (kinda it is, since this causes complete connection loss if this "bug" happens)
Host OS: Tested on both Fedora 34 and (up-to-date) Arch Linux ARM (32bit/RPi 4B)
CPU arch or device name: amd64 & armv7
What VPN provider are you using: NordVPN
What are you using to run your container?: Docker Compose
What is the version of the program
Steps to reproduce issue:
-exec it into the container and run curl/wget/ping/etc:
Expected behavior: xyz should have internet connectivity through gluetun's network stack and be accessible through gluetun's published/exposed ports, even if gluetun is restarted. This is unfortunately not the case: xyz's network stack just dies, no data in, no data out.
Additional notes:
FIREWALL_OUTBOUND_SUBNETS - didn't make a difference.
network_mode: service:gluetun completely disappear. b) Restarting gluetun doesn't bring back original routing tables. c) NetworkMode seems to be okay.
Terminal example
Brief docker inspect output from affected container
f77[...] is gluetun's container ID.
Full gluetun logs:
docker-compose.yml:
Nonetheless I'd like to thank you for creating gluetun. I'd be more than happy to help you fix this issue if this is a gluetun bug. Hopefully it's a misconfiguration on my side.