rust-vmm / community

rust-vmm community content

CI - timeouts #137

Open andreeaflorescu opened 1 year ago

andreeaflorescu commented 1 year ago

Recently, tests have been taking a very long time to finish (more than 5 minutes). As a result, the CI for various components is blocked, and tests are reported as failed because they time out.

We have not yet identified a root cause, but so far this seems to affect all the CI hosts (including the ones running MSHV).

Some issues that we are aware of are:

vireshk commented 1 year ago

@andreeaflorescu can we increase the timeout in CI to, let's say, 10-15 minutes, so that the tests don't time out? At least until the issue is fixed?

andreeaflorescu commented 1 year ago

@vireshk yes, we can do that. I have a PR (https://github.com/rust-vmm/rust-vmm-ci/pull/116) that also fixes some other things that might be causing the timeout; I will try my best to get it ready by the end of the day. In any case, the same PR will also increase the timeout until we figure out what is going on.

andreeaflorescu commented 1 year ago

The problem might be related to the host on which the CI is running. The problem seems oddly similar to the one reported here: https://lore.kernel.org/lkml/Y38h9oe4ZEGNd7Zx@quatroqueijos.cascardo.eti.br/T/#m3efc3916c892c9cca270d3ef6bfea780a2033c8e

What is strange, though, is that we are running Linux 5.15 on Ubuntu, not Linux 5.4. This needs further investigation.

andreeaflorescu commented 1 year ago

We temporarily increased the per-test CI timeout to 15 minutes, which is hopefully enough to stop the failures. To pick up the new timeout, rust-vmm-ci needs to be updated. Dependabot updates the rust-vmm-ci submodule weekly/monthly. If you want to do it manually, you can follow the runbook here: https://github.com/rust-vmm/community/blob/main/CONTRIBUTING.md#updating-the-rust-vmm-ci
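
For readers unfamiliar with why a slow test shows up as a failed one, here is a minimal sketch of a per-test time budget. It is not how rust-vmm-ci implements its timeout (that is set in the CI pipeline configuration, which is not shown in this thread); `run_with_timeout` and `Outcome` are hypothetical names used only for illustration, and the durations are shortened from minutes to milliseconds.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Hypothetical result type: what a runner might record for one test step.
#[derive(Debug, PartialEq)]
enum Outcome<T> {
    Finished(T),
    TimedOut,
    Crashed,
}

/// Runs `job` on a worker thread and waits at most `limit` for its result.
/// Illustration only: the worker is not killed on timeout, it is just
/// no longer waited for.
fn run_with_timeout<T, F>(limit: Duration, job: F) -> Outcome<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Sending fails only if the receiver has already given up; ignore it.
        let _ = tx.send(job());
    });
    match rx.recv_timeout(limit) {
        Ok(value) => Outcome::Finished(value),
        // The job is still running when the budget expires: reported as a
        // failure even though the test itself may be correct.
        Err(RecvTimeoutError::Timeout) => Outcome::TimedOut,
        // The sender was dropped without a result, i.e. the job panicked.
        Err(RecvTimeoutError::Disconnected) => Outcome::Crashed,
    }
}

fn main() {
    // A quick job fits comfortably in its budget.
    let fast = run_with_timeout(Duration::from_secs(2), || 42);
    // A slow job exceeds a tight budget and is reported as timed out.
    let slow = run_with_timeout(Duration::from_millis(50), || {
        thread::sleep(Duration::from_millis(500));
        42
    });
    println!("fast: {:?}", fast);
    println!("slow: {:?}", slow);
}
```

Raising the limit, as done here from 5 to 15 minutes, only changes the budget passed to the runner. A slow host still makes the job slow, so the root cause still needs to be investigated.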