danielhollas commented 1 year ago

I was trying to hunt down the issue behing the GHA CI timeout failures, reported in #163.

The hypothesis was to see if issue #154 was behind this. See my last comment on #154 for details about my investigation. One think I noticed is that the asyncio echo_logs Task would not get cancelled if the container failed to start. I don't know if this is the reason for the test flakiness, but it is worth a shot I think.

EDIT: So this obviously did not help, but it still seems like a change worth doing. I've also decreased the timeout on GHA so that at least we're not wasting CI minutes.

codecov[bot] commented 1 year ago

Codecov Report

Base: 86.72% // Head: 86.72% // No change to project coverage :thumbsup:

Coverage data is based on head (123339e) compared to base (b8d4da5). Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files

```diff @@ Coverage Diff @@ ## main #164 +/- ## ======================================= Coverage 86.72% 86.72% ======================================= Files 9 9 Lines 904 904 ======================================= Hits 784 784 Misses 120 120 ``` | Flag | Coverage Δ | | |---|---|---| | py-3.10 | `86.61% <100.00%> (ø)` | | | py-3.8 | `86.57% <100.00%> (ø)` | | | py-3.9 | `86.68% <100.00%> (ø)` | | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiidalab#carryforward-flags-in-the-pull-request-comment) to find out more. | [Impacted Files](https://codecov.io/gh/aiidalab/aiidalab-launch/pull/164?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiidalab) | Coverage Δ | | |---|---|---| | [aiidalab\_launch/\_\_main\_\_.py](https://codecov.io/gh/aiidalab/aiidalab-launch/pull/164?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiidalab#diff-YWlpZGFsYWJfbGF1bmNoL19fbWFpbl9fLnB5) | `79.93% <100.00%> (ø)` | | | [aiidalab\_launch/instance.py](https://codecov.io/gh/aiidalab/aiidalab-launch/pull/164?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiidalab#diff-YWlpZGFsYWJfbGF1bmNoL2luc3RhbmNlLnB5) | `87.17% <100.00%> (ø)` | | Help us with your feedback. Take ten seconds to tell us [how you rate us](https://about.codecov.io/nps?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiidalab). Have a feature suggestion? [Share it here.](https://app.codecov.io/gh/feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiidalab)

:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

danielhollas commented 1 year ago

One test keeps failing with an "out of runtime error". I suggest increasing the timeout to 15. Otherwise, good to go from my side.

If you're referring to for example https://github.com/aiidalab/aiidalab-launch/actions/runs/3896105969/jobs/6698675263, yes, the test suite randomly times out for unknown reason, as reported in #163. I've increased it to 15 minutes but this will not solve the issue as it has been happening for some time even with 30 minutes. (the idea of decreasing the timeout was to fail early in those cases).

In the link above, there's a following pytest warning:

=============================== warnings summary ===============================
tests/test_cli.py::TestInstanceLifecycle::test_start_stop_reset
  /opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/asyncio/base_events.py:641: RuntimeWarning: coroutine 'Queue.put' was never awaited
    self._ready.clear()

This points to perhaps that the issue #154 is behind these timeouts, see also my last comments on that issue where I tried to analyze where it is coming from. This prompted this PR, which obviously didn't fix the issue so more investigation is needed.

Sorry for not being clear in the OP.

danielhollas commented 1 year ago

@yakutovicha can't merge, sorry. Maybe you can try to add the deployment team in this repo settings?

yakutovicha commented 1 year ago

@danielhollas, please remind me next week (I am currently at a conference). I will give my more rights to the AiiDAlab organisation.

aiidalab / aiidalab-launch

Trying to fix flaky test suite #164

Codecov Report