firecracker-microvm / firecracker-containerd

firecracker-containerd enables containerd to manage containers as Firecracker microVMs
Apache License 2.0

Investigate Buildkite deadlock #570

Open Kern-- opened 2 years ago

Kern-- commented 2 years ago

When we have Buildkite jobs running for two PRs at the same time, we occasionally see both getting stuck waiting for the concurrency group before cleaning up the thin-pools. Neither moves forward until one is cancelled.

e.g. https://buildkite.com/firecracker-microvm/firecracker-containerd/builds/2494 and https://buildkite.com/firecracker-microvm/firecracker-containerd/builds/2496

ginglis13 commented 2 years ago

https://buildkite.com/firecracker-microvm/firecracker-containerd/builds/2495 was also stuck for 3 hours. I have cancelled it so that the build for #562 can run.

kzys commented 2 years ago

I removed the "loop-device test" concurrency group in #625, but another concurrency group, "stress", is being introduced by #642.