Closed austinvazquez closed 2 days ago
Hopefully this works better than the naive shutdown command 👍
From the microsoft/WSL issue, some users reported shutdown taking 2-3 minutes. We can also consider being more aggressive than this and killing the WSL service faster.
Issue #, if available: There is a known issue, https://github.com/microsoft/WSL/issues/8529, where WSL commands can hang. This can cause Windows e2e tests to block until they hit the 2-hour timeout.
Description of changes: This change adds a workaround that detects the bad state and attempts to mitigate it by killing the WSL service. If the issue cannot be resolved, the test hangs for only 300 seconds before failing.
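For reviewers who want a concrete picture of the detect-and-mitigate loop described above, here is a minimal sketch. It is not the code in this PR. The per-attempt timeout, the `taskkill` invocation, the `wslservice.exe` process name, and the names `shutdown_wsl` and `_run` are all assumptions made for illustration; only the 300-second overall budget comes from the description.

```python
import subprocess
import time
from typing import Callable, Sequence

TOTAL_BUDGET_SECONDS = 300.0    # overall limit stated in the PR description
ATTEMPT_TIMEOUT_SECONDS = 60.0  # assumption: how long one attempt may run

Runner = Callable[[Sequence[str], float], None]


def _run(cmd: Sequence[str], timeout: float) -> None:
    """Run a command, raising subprocess.TimeoutExpired if it hangs."""
    subprocess.run(cmd, check=True, timeout=timeout)


def shutdown_wsl(
    run: Runner = _run,
    budget: float = TOTAL_BUDGET_SECONDS,
    attempt_timeout: float = ATTEMPT_TIMEOUT_SECONDS,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Shut down WSL, killing the WSL service whenever the shutdown hangs.

    Returns the number of times the service had to be killed.
    Raises TimeoutError once the overall budget is exhausted.
    """
    deadline = clock() + budget
    kills = 0
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError("WSL shutdown did not recover within the budget")
        try:
            # A hang here is the "bad state" we are trying to detect.
            run(["wsl", "--shutdown"], min(attempt_timeout, remaining))
            return kills
        except subprocess.TimeoutExpired:
            # Assumed mitigation: force-kill the WSL service process, then retry.
            run(["taskkill", "/F", "/IM", "wslservice.exe"], attempt_timeout)
            kills += 1
```

The runner and clock are injectable so the retry logic can be exercised without a Windows host. A shorter `attempt_timeout` would correspond to the "more aggressive" option raised in the comments above.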
Testing done: The CI run succeeded even though it hit 8 WSL shutdown failures. https://github.com/runfinch/finch/actions/runs/9682445232/job/26715743040
Trade-off analysis: The trade-off of this approach is that the test suite can take longer when multiple reset VM calls are made. Sample runs that previously took ~15 minutes now take up to ~37 minutes with the hang mitigation; however, this is still well below the 2-hour timeout failure that would occur without the mitigation.
License Acceptance
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.