almostintuitive closed 1 year ago
One level up we see this message:
Unfortunately it looks like the problem isn't that jobs aren't being picked up, but that Hetzner is killing our workflows during runtime, so it has nothing to do with this library! Sorry.
Ok, let me know if anything comes up. I would be happy to help.
thanks!:) actually it resolved automagically...
Hi!
We're using the library in production now, and it has been extremely useful for us! (Our config is very simple: --max-runners 40, recycling on.) The only problem we're facing is that when we start, let's say, 5 or 10 jobs in parallel, 30-50% of them are marked as "failed job".
I was trying to look at the logs but the only error message I'm finding is this one:
06:24:20 scale_down ERROR ❌ APIException: cannot perform operation because server is locked
I'm now looking in depth at whether it's some kind of instance creation timeout, keeping the page open to see what happens before GitHub declares it a "failed job". Hopefully I'll have an update soon!
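For anyone who hits the same "server is locked" line in their own tooling: one common way to deal with a transient lock is to retry the operation with a growing delay. The snippet below is only a rough sketch of that idea, not how this library handles scale-down. `ServerLockedError` and `retry_while_locked` are made-up names for illustration, and mapping the real API exception onto the stand-in class is left to the caller.

```python
import time


class ServerLockedError(Exception):
    """Hypothetical stand-in for the 'server is locked' API error in the log."""


def retry_while_locked(operation, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation(), retrying with exponential backoff while it is locked.

    operation:  zero-argument callable that raises ServerLockedError
                when the server is temporarily locked (assumption).
    attempts:   total number of tries before giving up.
    base_delay: delay in seconds before the first retry; doubles each time.
    sleep:      injectable sleep function, handy for testing.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except ServerLockedError:
            # Out of retries: let the caller see the original error.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Usage would look something like `retry_while_locked(lambda: delete_server(server_id))`, where `delete_server` is whatever call was failing (also hypothetical here). Whether retrying is appropriate depends on why the server is locked in the first place.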