Open offlinehacker opened 5 years ago
It now seems that if I re-trigger a build with a new commit, failed and stalled jobs are shared with the previous build, and CI is now stuck. Take a look here: https://hercules-ci.com/github/xtruder/kubenix/jobs/8 and here: https://hercules-ci.com/github/xtruder/kubenix/jobs/9
We've had to delay the features that would recover from this. I've manually reset your tasks.
This is an annoying one. We don't yet have "agent pings" that would allow us to see agent liveliness.
We're going to add a "Cancel" button so that you can recover manually, but the automatic fix based on agent liveliness is scheduled for the next sprint.
It would also be nice to see logs after a job is canceled.
There won't be logs: if a job hasn't reported a build-finished event and it's cancelled, there are no logs to show. This will change once streaming of logs (#17) is implemented.
Note that the "cancel" workaround is planned for sprint #4, so expect a fix soon. We had to postpone agent liveliness for another sprint or two.
We do reschedule if an agent shuts down cleanly, but we don't yet handle the case of a forceful shutdown. Those cases will be covered by some kind of timeout.
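To make the timeout idea concrete, here is a minimal sketch of how tasks could be requeued when an agent stops pinging. This is only an illustration, not how Hercules CI works internally: the `Scheduler` class, its method names, and the 30-second timeout in the usage note are all hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class Scheduler:
    """Toy scheduler (hypothetical): tracks agent pings and requeues tasks of silent agents."""

    timeout: float  # seconds without a ping before an agent is presumed dead (assumed value)
    last_ping: dict = field(default_factory=dict)  # agent id -> time of last ping
    running: dict = field(default_factory=dict)    # task id -> agent id
    queue: list = field(default_factory=list)      # task ids waiting for an agent

    def ping(self, agent, now):
        # Record that the agent was alive at time `now`.
        self.last_ping[agent] = now

    def assign(self, task, agent, now):
        # Hand a task to an agent; the assignment also counts as a sign of life.
        self.running[task] = agent
        self.ping(agent, now)

    def reap(self, now):
        """Requeue tasks whose agent has not pinged within the timeout."""
        dead = {a for a, t in self.last_ping.items() if now - t > self.timeout}
        requeued = sorted(t for t, a in self.running.items() if a in dead)
        for task in requeued:
            del self.running[task]
            self.queue.append(task)
        for agent in dead:
            del self.last_ping[agent]
        return requeued
```

With `Scheduler(timeout=30.0)`, an agent that was assigned a task at `now=0.0` and never pings again has that task moved back to `queue` by `reap(now=40.0)`, while an agent that pinged at `now=25.0` keeps its task. A forcefully killed agent never gets to report anything, so the absence of pings is the only signal available.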
I had job parallelism set too high and my server became unresponsive. After the server was reset and redeployed with lowered parallelism, jobs do not seem to restart. If I click "retry", nothing happens and the response is 404.