Open buffcode opened 1 year ago
After upgrading to 5.0.1 and creating all of the missing servers:
runner-ovfjcph1-runner-1700632818-61a5a758 - hetzner Error Unknown could not execute drivers.MustBeRunning: could not get server by ID: limit of 5000 requests per hour for XXXX:XXXX:c0c:b1cc::1 reached (rate_limit_exceeded)
runner-ovfjcph1-runner-1700639271-6e1e6488 - hetzner Error Unknown could not execute drivers.MustBeRunning: could not get server by ID: limit of 3600 requests per hour reached (rate_limit_exceeded)
Maybe this also affects which machines/states are known on both sides?
Now that the API is accessible again, I can confirm that docker-machine and Hetzner Cloud are out of sync.
docker-machine reports 19 servers, while Hetzner currently has 42.
Hi,
sorry for only getting back to you now; I was dealing with some medical issues.
It is indeed possible for Hetzner and the driver to get out of sync. docker-machine implements a rather basic RPC protocol, and the server creation logic boils down to three steps: a pre-create check (which tries, on a best-effort basis, to ensure that machine creation should succeed), the actual creation, and then waiting for the machine to come up.
Depending on which step fails, docker-machine may conclude that the machine has not been created and decide to remove its local files; the driver, on the other hand, only performs a tear-down during the creation steps.
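To make the failure mode concrete, here is a toy model of the three steps. It is not the driver's actual code: `create_machine`, the `fail_at` parameter, and the two sets standing in for provider state and local files are all hypothetical, and it assumes the client records the machine locally before the driver runs.

```python
def create_machine(name, cloud, local, fail_at=None):
    """Toy model: pre-create check, creation, then waiting for the machine."""
    local.add(name)  # assumption: local files exist before the driver is called
    try:
        if fail_at == "precreate":
            raise RuntimeError("pre-create check failed")
        cloud.add(name)  # the server now exists on the provider side
        if fail_at == "create":
            cloud.discard(name)  # tear-down only happens inside the creation step
            raise RuntimeError("creation failed")
        if fail_at == "wait":
            # e.g. a rate limit hit while polling; no tear-down in this model
            raise RuntimeError("rate_limit_exceeded while waiting")
    except RuntimeError:
        local.discard(name)  # the client concludes the machine was never created
        return False
    return True


cloud, local = set(), set()
create_machine("runner-a", cloud, local)
create_machine("runner-b", cloud, local, fail_at="create")
create_machine("runner-c", cloud, local, fail_at="wait")
print(sorted(cloud - local))  # ['runner-c']
```

In this simplified model only the failure after a successful creation leaves a server behind that the client no longer knows about.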
Unfortunately, the setup process is wonky and inherently racy. There are some options to configure retry behavior, intended specifically for dealing with rate-limiting issues, but there is still no guarantee. The best thing I can recommend is to check the servers manually after an abnormal creation failure, perhaps tagging them beforehand so they are easier to identify.
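For readers unfamiliar with what retrying around a rate limit looks like, here is a generic sketch of retry with exponential backoff. It does not show the driver's options or implementation; `RateLimitExceeded`, `call_with_backoff`, and all parameters are hypothetical names chosen for illustration.

```python
import time


class RateLimitExceeded(Exception):
    """Hypothetical stand-in for a rate_limit_exceeded API error."""


def call_with_backoff(fn, retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fn, doubling the wait after each rate-limit error."""
    delay = base_delay
    for attempt in range(retries + 1):
        try:
            return fn()
        except RateLimitExceeded:
            if attempt == retries:
                raise  # out of retries: surface the error to the caller
            sleep(delay)
            delay *= 2
```

Even with retries, a budget of requests per hour can stay exhausted longer than any reasonable backoff, which is why retries reduce the problem without eliminating it.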
I run into this problem myself when terminating docker-machine prematurely during development, which sometimes leaves resources behind (including running servers). It's annoying, but so far the manual approach described above is the best thing I could come up with.
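The manual check can be partly scripted once both name lists are at hand. The sketch below assumes the provider-side names and the names known to docker-machine have already been collected (for example, the latter from `docker-machine ls -q`); `find_orphans` and the example names are hypothetical.

```python
def find_orphans(provider_names, known_names, prefix):
    """Return provider-side servers with the given prefix that the client does not know."""
    known = set(known_names)
    return sorted(
        name
        for name in provider_names
        if name.startswith(prefix) and name not in known
    )


# Illustrative data only; real names would come from the provider and the client.
provider = ["runner-example-1", "runner-example-2", "db-server", "runner-example-3"]
known = ["runner-example-2"]
print(find_orphans(provider, known, prefix="runner-"))
```

Filtering by prefix keeps unrelated servers in the same project out of the result. Deleting anything from that list should still be a deliberate manual step, since a machine that is currently being created may not be visible to the client yet.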
We are currently running 4.1.0 (I will upgrade later today) and have had the problem, across multiple versions, that docker-machine creates servers but somehow fails to remember them.
I recently manually deleted about 30 servers in Hetzner Cloud that weren't known to docker-machine ls (anymore?) but were definitely created this way. We are using docker-machine to spin up cloud runners for GitLab CI, so every runner has a fixed prefix and is easily recognizable.
Is there a way to sync docker-machine with Hetzner Cloud so that these servers get picked up again? Or so that docker-machine recognizes those unmanaged machines and removes them? This is eating into our resource limits and our bill as well :)
Can I provide logs (which?) to debug this? This usually stacks up over multiple weeks and does not happen on a daily basis.