Open andreeaflorescu opened 1 year ago
@andreeaflorescu can we increase the CI timeout to, let's say, 10-15 minutes so the tests don't time out, at least until the issue is fixed?
@vireshk yes, we can do that. I have a PR (https://github.com/rust-vmm/rust-vmm-ci/pull/116) that also fixes some other things that might be causing the timeout; I will try my best to get it ready by the end of the day. In any case, I will also increase the timeout in the same PR until we figure out what is going on.
The problem might be related to the host on which the CI is running. It seems oddly similar to the one reported here: https://lore.kernel.org/lkml/Y38h9oe4ZEGNd7Zx@quatroqueijos.cascardo.eti.br/T/#m3efc3916c892c9cca270d3ef6bfea780a2033c8e
What is strange, though, is that we are running Linux 5.15 on Ubuntu, not Linux 5.4. This needs further investigation.
We temporarily increased the per-test CI timeout to 15 minutes; this will hopefully be enough to stop the failures. For the new timeout to take effect, the rust-vmm-ci submodule needs to be updated. Dependabot updates the submodule weekly/monthly. If you want to do it manually, you can follow the runbook here: https://github.com/rust-vmm/community/blob/main/CONTRIBUTING.md#updating-the-rust-vmm-ci
Recently, tests have been taking a really long time to finish (i.e. more than 5 minutes). Because of this, the CI for various components is blocked and tests appear as failed due to timeouts.
We have not yet identified a root cause, but so far this seems to affect all the CI hosts (including the ones running MSHV).
Some issues that we are aware of are: