Rush opened this issue 5 years ago
I've noticed the same thing and if the above could be implemented, that would be awesome :-)
Thanks for your issue! This is definitely something we should take a look at. If you feel up for it, feel free to submit a pull request and I'll have a look. 👍
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Why would a real issue be closed due to inactivity?
Still tuning the stale bot; some false positives remain to be ironed out.
And just to elaborate a bit on why we do this: I think the stale bot's repo explains it very well:
In an ideal world with infinite resources, there would be no need for this app.
But in any successful software project, there's always more work to do than people to do it. As more and more work piles up, it becomes paralyzing. Just making decisions about what work should and shouldn't get done can exhaust all available resources. In the experience of the maintainers of this app—and the hundreds of other projects and organizations that use it—focusing on issues that are actively affecting humans is an effective method for prioritizing work.
To some, a robot trying to close stale issues may seem inhospitable or offensive to contributors. But the alternative is to disrespect them by setting false expectations and implicitly ignoring their work. This app makes it explicit: if work is not progressing, then it's stale. A comment is all it takes to keep the conversation alive.
With that said, your issue has been added to a milestone, as this might become an actual problem, and as such it won't be marked as stale.
Thanks for understanding. 🙏
I stopped using watchtower because of this issue.
I am looking for a way to instruct watchtower not to stop all my containers at the same time. This is really a problem! Let's say you have 3 instances behind a load balancer: watchtower will stop them all at once.
As a work-around, you might run multiple watchtower instances, one instance for each container you want to monitor.
Is this still an issue? I'm thinking about adopting watchtower, but with this kind of behavior it won't work for my scenario. I have more than one hundred containers using the same image on the same server. I really need something closer to what the OP described.
Yes, this is still how it works. However, I'd be more than open to changing this behavior, although it would require some help from the community as I, to be fair, lack time at this point.
Greetings @simskij !
Is this issue open to be worked on? I'd love to have a go at it if available.
Thank you!
For sure, go for it! 🙏🏼
Thank you!
@simskij I ran into some trouble while trying out the application. Should I mention it here or on Gitter?
Here is better if someone else wants to assist, but Gitter works just as well! 👌
Awesome!
Here is the issue I ran into:
```
DEBU[0100] Got image name: altariax0x01/mybuntu:latest
INFO[0100] Found new altariax0x01/mybuntu:latest image (sha256:77e1d6c5b9c0f022928f1732791ccd12fcb6029baf686b4cfcebafe7dbce6ec7)
INFO[0100] Stopping /t1 (bbd9ce79fad7737c0fa0c9512d526d286ad38565004dcbfd123adfbed11ff0d6) with SIGTERM
DEBU[0101] Removing container bbd9ce79fad7737c0fa0c9512d526d286ad38565004dcbfd123adfbed11ff0d6
2020/08/15 15:46:46 cron: panic running job: runtime error: invalid memory address or nil pointer dereference
goroutine 13 [running]:
github.com/robfig/cron.(*Cron).runWithRecovery.func1(0xc0002c8500)
/home/ubuntu/go/pkg/mod/github.com/robfig/cron@v0.0.0-20180505203441-b41be1df6967/cron.go:161 +0x9e
panic(0xae3ba0, 0x1021190)
/home/ubuntu/go/src/runtime/panic.go:969 +0x175
github.com/containrrr/watchtower/pkg/container.Container.runtimeConfig(0x100, 0xc000485d40, 0x0, 0xc000392480)
/home/ubuntu/watchtower/pkg/container/container.go:169 +0x4e
github.com/containrrr/watchtower/pkg/container.dockerClient.StartContainer(0xc89b40, 0xc00030c700, 0x1, 0x920100, 0xc000485d40, 0x0, 0x1, 0xc000020100, 0xc000485d40, 0x0)
/home/ubuntu/watchtower/pkg/container/client.go:163 +0x86
github.com/containrrr/watchtower/internal/actions.restartStaleContainer(0x7faf5b8d0100, 0xc000485d40, 0x0, 0xc836e0, 0xc00000ee40, 0xc00002f960, 0x0, 0x2540be400, 0x0)
/home/ubuntu/watchtower/internal/actions/update.go:121 +0xdd
github.com/containrrr/watchtower/internal/actions.restartContainersInSortedOrder(0xc0003e2420, 0x1, 0x1, 0xc836e0, 0xc00000ee40, 0xc00002f960, 0x0, 0x2540be400, 0x0)
/home/ubuntu/watchtower/internal/actions/update.go:96 +0x255
github.com/containrrr/watchtower/internal/actions.Update(0xc836e0, 0xc00000ee40, 0xc00002f960, 0x0, 0x2540be400, 0x0, 0x1abab3a6, 0x2000000030001)
/home/ubuntu/watchtower/internal/actions/update.go:53 +0x369
github.com/containrrr/watchtower/cmd.runUpdatesWithNotifications(0xc00002f960)
/home/ubuntu/watchtower/cmd/root.go:211 +0xb3
github.com/containrrr/watchtower/cmd.runUpgradesOnSchedule.func1()
/home/ubuntu/watchtower/cmd/root.go:168 +0xb6
github.com/robfig/cron.FuncJob.Run(0xc000448100)
/home/ubuntu/go/pkg/mod/github.com/robfig/cron@v0.0.0-20180505203441-b41be1df6967/cron.go:92 +0x25
github.com/robfig/cron.(*Cron).runWithRecovery(0xc0002c8500, 0xc6dde0, 0xc000448100)
/home/ubuntu/go/pkg/mod/github.com/robfig/cron@v0.0.0-20180505203441-b41be1df6967/cron.go:165 +0x59
created by github.com/robfig/cron.(*Cron).run
/home/ubuntu/go/pkg/mod/github.com/robfig/cron@v0.0.0-20180505203441-b41be1df6967/cron.go:199 +0x76a
```
Expected behavior: the container is stopped and restarted with the new version of the base image.
Actual behavior: the container is stopped, but the program panics while trying to restart it, so the restart fails.
Ubuntu 20.04.1 LTS running on an AWS EC2 instance. Docker server version: 19.03.12. Go version: go1.15 linux/amd64.
Any advice?
Thank you!
Yeah, this is because of this: https://github.com/containrrr/watchtower/pull/612
You can base it on that branch to get started, or I will get it merged to master tomorrow!
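For anyone reading the trace above: the panic originates in Container.runtimeConfig, presumably because some inspected container data is already nil by the time the restart runs. Below is a minimal, self-contained sketch of that failure mode and the kind of guard that avoids it; the types and field names are hypothetical stand-ins, not the actual watchtower code or the fix in #612.

```go
// Illustration only: hypothetical, simplified types loosely modelled on the
// stack trace above, not the actual watchtower implementation.
package main

import (
	"errors"
	"fmt"
)

// inspectData stands in for the details the Docker API returns for a container.
type inspectData struct {
	Image string
}

// Container wraps a pointer that can end up nil once the old container is gone.
type Container struct {
	info *inspectData
}

// runtimeConfig dereferences c.info; without the guard, this is exactly the
// kind of place a nil-pointer panic like the one above would originate.
func (c Container) runtimeConfig() (string, error) {
	if c.info == nil {
		return "", errors.New("no inspect data for container; was it already removed?")
	}
	return c.info.Image, nil
}

func main() {
	stale := Container{info: nil} // simulate the state after the old container was removed
	if _, err := stale.runtimeConfig(); err != nil {
		fmt.Println("refusing to restart:", err) // a recoverable error instead of a panic
	}
}
```

The point of the sketch is just the guard: returning an error lets the update loop skip or retry that container instead of taking down the whole cron job. For the real panic, basing your work on the #612 branch as suggested above is the way to go.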
@piksel Thank you for the information! I'll get started with that branch to test my changes. I can make a PR for the changes once that branch is merged into master.
I know it's been a while. :) Likely there has been no progress but it doesn't hurt to ask.
This can really be a tough issue. We have cloud hosts where a service container may have 50+ instances, so downtime can be very long while waiting for all of them to shut down first. We're not a Go shop or we'd jump in, but hopefully someone has the skills. We would absolutely help test.
We started using "ouroboros", another container update tool, to avoid this same problem. It is working as intended for us. I haven't tried watchtower in a couple of years, so I don't know whether this has been fixed or changed.
Let's say we have 10 containers based on the same image. Upon update, watchtower will stop all 10 containers and then start all 10 of them again from the new image.
This causes downtime of N * (time to stop and start a container) - where N is the number of containers.
It would be nice if watchtower had an algorithm to update the containers one at a time: stop one container, start its replacement from the new image, and only move on to the next once the replacement is running, so the remaining instances keep serving traffic.
Is it possible? Is it a planned feature? Is it a known issue?
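To make the request concrete, here is a minimal sketch of such a rolling update loop in Go. This is not watchtower's API: the Runtime interface, its methods, and every name below are hypothetical stand-ins for whatever container client would actually be used; the idea is simply that at most one instance is down at any moment.

```go
// A minimal sketch of a rolling update, assuming a hypothetical Runtime
// abstraction; none of these names come from watchtower itself.
package main

import (
	"context"
	"fmt"
	"time"
)

// Runtime abstracts the handful of operations a rolling update needs.
type Runtime interface {
	StopContainer(ctx context.Context, id string, timeout time.Duration) error
	StartReplacement(ctx context.Context, id, image string) (newID string, err error)
	IsHealthy(ctx context.Context, id string) (bool, error)
}

// RollingUpdate replaces containers one at a time: the next container is only
// touched once the previous replacement reports healthy, so at most one
// instance is ever down.
func RollingUpdate(ctx context.Context, rt Runtime, ids []string, image string, stopTimeout time.Duration) error {
	for _, id := range ids {
		if err := rt.StopContainer(ctx, id, stopTimeout); err != nil {
			return fmt.Errorf("stop %s: %w", id, err)
		}
		newID, err := rt.StartReplacement(ctx, id, image)
		if err != nil {
			return fmt.Errorf("start replacement for %s: %w", id, err)
		}
		// Poll until the replacement is healthy before moving on.
		for {
			ok, err := rt.IsHealthy(ctx, newID)
			if err != nil {
				return fmt.Errorf("health check %s: %w", newID, err)
			}
			if ok {
				break
			}
			select {
			case <-ctx.Done():
				return ctx.Err()
			case <-time.After(2 * time.Second):
			}
		}
	}
	return nil
}

// fakeRuntime is a toy implementation used only to demonstrate the loop.
type fakeRuntime struct{}

func (fakeRuntime) StopContainer(_ context.Context, id string, _ time.Duration) error {
	fmt.Println("stopping", id)
	return nil
}

func (fakeRuntime) StartReplacement(_ context.Context, id, image string) (string, error) {
	fmt.Println("replacing", id, "with a container from", image)
	return id + "-new", nil
}

func (fakeRuntime) IsHealthy(context.Context, string) (bool, error) { return true, nil }

func main() {
	ids := []string{"web-1", "web-2", "web-3"}
	if err := RollingUpdate(context.Background(), fakeRuntime{}, ids, "example/app:latest", 10*time.Second); err != nil {
		fmt.Println("rolling update failed:", err)
	}
}
```

With something like this, a fleet behind a load balancer loses roughly one container's stop-and-start time per update instead of N times that.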