EzekialSA opened 3 years ago
I second that... That's exactly my problem...
I have disabled updates for gluetun to stop my containers from dangling without network.
If that is fixable, I would be very glad!!!
That's really strange. So the container can no longer be found with its container ID?! I'll do some more testing.
Meanwhile I'm almost done with a cascaded restart feature which should restart containers labeled for it when a certain container (like gluetun) starts.
Ah got it. It's because the container ID it was relying on (gluetun) disappeared. Ugh, that's also going to be problematic for my cascaded restart feature... I think the (connected) container config needs to be patched somehow, before being restarted 🤔
Ok so after some research... There is no way to know what the 'vpn' container was, since we just have its ID and it no longer exists (the name is not accessible). I guess deunhealth could stop the connected containers, but it wouldn't be able to start them again, so that's a bit pointless sadly.
Now on my cascaded restart feature, the idea is that you would put a label on the 'connected' containers indicating the container name of the 'vpn' container. That way, this is feasible. Writing out how it should do it (also for myself):
I have bits and pieces of it ready, I just need to wire everything up and try it out, but it should work fine.
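A minimal sketch of that label-to-name matching, assuming a hypothetical label key (`deunhealth.restart.on.vpn.name` is my own invention, not deunhealth's actual API) and plain dicts standing in for container records:

```python
# Sketch: connected containers carry a label naming their VPN container;
# when that VPN container (re)starts, these are the ones to restart.
# The label key "deunhealth.restart.on.vpn.name" is hypothetical.

def containers_to_restart(containers, vpn_name):
    """Return names of containers labeled as connected to vpn_name."""
    label_key = "deunhealth.restart.on.vpn.name"
    return [
        c["name"]
        for c in containers
        if c.get("labels", {}).get(label_key) == vpn_name
    ]

containers = [
    {"name": "qbittorrent",
     "labels": {"deunhealth.restart.on.vpn.name": "gluetun"}},
    {"name": "transmission",
     "labels": {"deunhealth.restart.on.vpn.name": "gluetun"}},
    {"name": "unrelated", "labels": {}},
]
print(containers_to_restart(containers, "gluetun"))
# ['qbittorrent', 'transmission']
```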
So... this previous suggestion, let's call it A, won't work if deunhealth starts after a VPN container has already been shut down/restarted and the existing containers were disconnected before deunhealth started. The only solution I can think of, call it B, is to use labels for both the VPN container and the connected containers, and not rely on container names. For example, have a unique label ID on the 'vpn' container and use it on all the connected containers.
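Solution B could look roughly like this sketch; both label keys (`deunhealth.vpn.id` and `deunhealth.vpn.connected.id`) are assumptions for illustration, not deunhealth's real labels:

```python
# Sketch of solution B: the VPN container and its connected containers
# share a user-chosen ID via labels, so no container names are needed.
# Both label keys below are hypothetical.

VPN_LABEL = "deunhealth.vpn.id"                  # set on the VPN container
CONNECTED_LABEL = "deunhealth.vpn.connected.id"  # set on connected containers

def group_by_vpn_id(containers):
    """Map each VPN label ID to its VPN container name and connected names."""
    groups = {}
    for c in containers:
        labels = c.get("labels", {})
        if VPN_LABEL in labels:
            group = groups.setdefault(labels[VPN_LABEL], {"vpn": None, "connected": []})
            group["vpn"] = c["name"]
        if CONNECTED_LABEL in labels:
            group = groups.setdefault(labels[CONNECTED_LABEL], {"vpn": None, "connected": []})
            group["connected"].append(c["name"])
    return groups

containers = [
    {"name": "gluetun", "labels": {"deunhealth.vpn.id": "myvpn"}},
    {"name": "qbittorrent", "labels": {"deunhealth.vpn.connected.id": "myvpn"}},
    {"name": "transmission", "labels": {"deunhealth.vpn.connected.id": "myvpn"}},
]
print(group_by_vpn_id(containers))
```

Because the grouping key is a user-chosen ID rather than a name or container ID, it survives the VPN container being recreated with a new ID.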
I also came up with another solution, let's call it C, which is more complex to implement and relies only on container names (no labels), although it has the same problem mentioned above. Here's how it would work (notes to myself as well):
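My reading of solution C, as a sketch: while the VPN container is still alive, resolve each connected container's `container:<id>` network mode to the VPN's *name* and keep that as state, so the name survives after the VPN is recreated with a new ID. All field names and data shapes here are assumptions:

```python
# Sketch of solution C: no labels; resolve "container:<id>" network modes
# to VPN container *names* while the VPN still exists, and keep that state.
# Data shapes and field names are assumptions for illustration.

def record_vpn_names(containers, id_to_name, state):
    """Update state: connected container name -> VPN container name."""
    for c in containers:
        mode = c.get("network_mode", "")
        if mode.startswith("container:"):
            vpn_id = mode[len("container:"):]
            if vpn_id in id_to_name:  # only resolvable while the VPN exists
                state[c["name"]] = id_to_name[vpn_id]
    return state

state = {}
# While gluetun (ID "abc123") is running, the ID can still be resolved:
record_vpn_names(
    [{"name": "qbittorrent", "network_mode": "container:abc123"}],
    {"abc123": "gluetun"},
    state,
)
# Later, gluetun was recreated with a new ID; the old ID no longer
# resolves, but the stored state still remembers the name:
record_vpn_names(
    [{"name": "qbittorrent", "network_mode": "container:abc123"}],
    {"def456": "gluetun"},
    state,
)
print(state)  # {'qbittorrent': 'gluetun'}
```

The need to persist this mapping is why C is marked as needing state in the comparison table: if deunhealth starts after the VPN is already gone, the ID can never be resolved.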
Solutions comparison
| Solution | Works on previously disconnected containers at start | Works without label for VPN container | Works without labels for VPN connected containers | Does not need state |
|---|---|---|---|---|
| A | ❎ | ✔️ | ❎ | ✔️ |
| B | ✔️ | ❎ | ❎ | ✔️ |
| C | ❎ | ✔️ | ✔️ | ❎ |
Now, what solution do you prefer? 😄
I'm leaning towards B to have something that works, although it requires more user fiddling.
Personally I lean towards B as well. It involves more up-front config with labels, but it makes the link between containers explicit, forcing the user to declare what is connected.
Solution A: auto-monitoring and logging container information isn't a terrific solution to me.
Solution C: dropping container context seems like too much effort, and it could cause issues if someone has multiple stacks with overlapping configured names across a cluster... bad practice, but it could cause a headache for someone down the line.
I pick B. I was elected to lead, not to read! (SCNR)
Labels would be perfectly fine for me.
Also, it sounds like a little less work on your side with the labels implementation.
Another vote for option B.
+1 for option B. Do you know when it will be released?
+1 for option B
I'm working on it right now! Hopefully we will have something today :wink:
EDIT (2021-12-06): still working on it, it's a bit more convoluted than I expected code-spaghetti wise, but it's getting there!
Note: if the 'network container' (aka the VPN) goes down and doesn't restart, there is no way to properly restart the connected containers, since the label won't be anywhere unfortunately. I will make the program log a warning if this happens.
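That warning path could look roughly like this; the label key and the message wording are illustrative, not deunhealth's actual output:

```python
# Sketch: look up the VPN container by a (hypothetical) label and warn
# when it no longer exists, since its connected containers can't be
# restarted without it.
import logging

VPN_LABEL = "deunhealth.vpn.id"  # hypothetical label key

def find_vpn_container(containers, vpn_id):
    """Return the labeled VPN container, or None with a warning if gone."""
    for c in containers:
        if c.get("labels", {}).get(VPN_LABEL) == vpn_id:
            return c
    logging.warning(
        "VPN container with label %s=%s not found; "
        "cannot restart its connected containers", VPN_LABEL, vpn_id)
    return None
```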
I'm not sure if I got this right.
You are not able to restart the "child" containers if the VPN container killed itself and did not restart, right?
But if the container is updated and restarts without errors, that is still possible to fix with the intended patch?
In my case I just need to recreate the containers attached to the network container when it is recreated by Watchtower. The network container is always up and running, but the other containers are orphaned and cannot be restarted.
Any ETA?
Has this been implemented yet?
A little late to the party here, but definitely also prefer option B and I'm very excited about this feature.
(yes, my gluetun container got updated by watchtower last night and now the whole stack is down 😄 )
Hello all, good news: I'm working on this again. Sorry for the immense delay in getting back to it. I have some 'new uncommitted' code (from like 6 months ago lol) that looks promising, I'm hoping for a solution B implementation soon! :+1:
Should this already be working in a current version combined with using deunhealth? I'm still using an older image of gluetun (v3.28.2) so it doesn't get automatically updated by watchtower. When it does get updated, connectivity to apps using gluetun is lost (https://github.com/qdm12/deunhealth/discussions/34). Or should I still manually update gluetun for now?
Any update? :)
I guess Quentin hasn't had time to implement the `deunhealth.restart.on.unhealthy=true` label yet, or else it's a more difficult task than initially thought? It doesn't work for me yet.
The deunhealth log states 0 containers monitored, despite tagging several containers with `deunhealth.restart.on.unhealthy=true`:

```
2023/08/04 10:44:19 INFO Monitoring 0 containers to restart when becoming unhealthy
```
I turn my mini-PC media server off every evening, so I've been able to use a shell script that does a `docker compose down && docker compose up -d` 2 minutes after the server first boots up (Quentin recommends running something similar as a workaround). This fixes my stack... at least for some hours. Sometimes something breaks, and if that happens I just power it off and on again! Looking forward to a more robust solution :-)
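For reference, that workaround could be wired up as a crontab entry; the stack path and the 120-second delay are assumptions about the setup, not something from this thread:

```
# crontab entry (sketch): recreate the stack ~2 minutes after boot
@reboot sleep 120 && cd /path/to/stack && docker compose down && docker compose up -d
```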
@STRAYKR Is your deun container in the same yml as gluetun? That was my issue. Logs showed "Monitoring 0 containers" when I added the label to gluetun but deun was in its own yml. When I moved deun to the same yml compose as gluetun and qbittorrent, deun registered the labels and started monitoring the containers. I'm thinking, for my case, that the issue might've been that deun couldn't reach gluetun because it wasn't on the same network.
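For anyone hitting the same thing, a single-file layout like the one described might look roughly like this; the service names and gluetun settings are illustrative, and only the `deunhealth.restart.on.unhealthy=true` label comes from this thread:

```yaml
services:
  gluetun:
    image: qmcgaw/gluetun
    cap_add:
      - NET_ADMIN
    # VPN provider settings omitted

  qbittorrent:
    image: lscr.io/linuxserver/qbittorrent:latest
    network_mode: "service:gluetun"
    labels:
      - deunhealth.restart.on.unhealthy=true

  deunhealth:
    image: qmcgaw/deunhealth
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
```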
Hello guys. It still doesn't work.
```
2023/12/30 19:07:39 INFO container qbittorrent (image lscr.io/linuxserver/qbittorrent:latest) is unhealthy, restarting it...
2023/12/30 19:07:43 ERROR failed restarting container: Error response from daemon: Cannot restart container qbittorrent: No such container: 66cfe13371d1b10781c4a0649f96c8a82044f3852a2bbd77524c6f92b1902e35
2023/12/30 19:18:51 INFO container transmission (image lscr.io/linuxserver/transmission:latest) is unhealthy, restarting it...
2023/12/30 19:18:55 ERROR failed restarting container: Error response from daemon: Cannot restart container transmission: No such container: 72a8f02b433e0b443812be3a44171ece10b9cc6191b7d9bcba8fc6cdb012d125
```
> @STRAYKR Is your deun container in the same yml as gluetun? That was my issue. Logs showed "Monitoring 0 containers" when I added the label to gluetun but deun was in its own yml. When I moved deun to the same yml compose as gluetun and qbittorrent, deun registered the labels and started monitoring the containers. I'm thinking, for my case, that the issue might've been that deun couldn't reach gluetun because it wasn't on the same network.
Hi @NaturallyAsh, sorry for the delayed response, yes, all config for deun and gluetun is in the same yml docker compose file, I only have the one docker compose file.
Hi guys, any update on this?
Just chiming in to keep this issue at least somewhat active. 😄
I'm trying to configure everything to be automated for updates and availability using Watchtower and deunhealth. I was doing testing to see what would happen if gluetun got an update (as you know, it breaks things connected to it when it restarts). I get the following errors when stopping/restarting gluetun:
I believe that the gluetun container is the one that's referenced by that hash, so it disappears and deunhealth doesn't know how to handle it.
Not sure if it's worth noting, but I am using Portainer for stack management. Here are my config files of what I'm trying to do: