openwallet-foundation / owl-agent-test-harness

Aries agent test framework, with agent backchannel support
https://aries-interop.info
Apache License 2.0

Test runs failing when run from the "test-harness-runner" GHA #835

Closed swcurran closed 5 months ago

swcurran commented 5 months ago

At least some of the runsets that work when run individually (e.g., acapy-aip10, acapy-aip20) fail when invoked from the "test-harness-runner" GHA. Some investigation is needed for that, or we need to look at another strategy -- e.g. running the tests on some other trigger? Manually?

FYI @nodlesh and @WadeBarnes . Sheldon, you know about this and have investigated, but no luck. Wade, it would be good if you could add this to your list to take a peek.

The errors are odd. It appears that a runset (e.g., acapy-aip10) will start out OK, then a couple of tests fail, a couple more work, and then all of them fail. I'm trying to look for patterns. And of course, because the tests all pass when triggered manually (GitHub UI), the cause is almost certainly environmental.

There are a number of notices about deprecated node engines being used in places, and perhaps that impacts the issue. However, one would expect that to be a hard error and we'd have the same error when triggering the tests manually.

swcurran commented 5 months ago

I think the issue is that GitHub is limiting the resources. I ran a bunch of tests manually last night and they started to fail. Ran them again this morning and they worked. So, limiting the resource demands of our runs would be good.

swcurran commented 5 months ago

To be more precise -- the agents using acapy-main fail to start in the examples I have looked at.

Starting Acme Agent using javascript-agent-backchannel ...
Starting Bob Agent using acapy-main-agent-backchannel ...
Starting Faber Agent using javascript-agent-backchannel ...
Starting Mallory Agent using javascript-agent-backchannel ...

waiting for Acme agent to start........
waiting for Bob agent to start...............................
The agent failed to start within 30 seconds.

waiting for Faber agent to start
waiting for Mallory agent to start
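The timeout behaviour in the log above can be sketched as a simple polling loop. This is not the harness's actual startup script; the function name and interface here are hypothetical, and the readiness check is assumed to be any command that exits 0 once the agent is up:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a "wait for agent" loop like the one in the log.
# check_cmd: any command that exits 0 once the agent responds (assumption).
# timeout:   seconds to wait before giving up (defaults to 30, as in the log).
wait_for_agent() {
  local check_cmd="$1" timeout="${2:-30}" elapsed=0
  until eval "$check_cmd" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "The agent failed to start within ${timeout} seconds." >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}
```

On an overloaded runner the agent can simply miss this window, which would match the intermittent "failed to start" messages.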
swcurran commented 5 months ago

So after re-running most of the ACA-Py runsets manually, I have the number of passing tests up to the same as it was weeks ago (222 / 310). So it is clearly an issue with the GHA runs causing problems. Frustrating. We need a new approach...

nodlesh commented 5 months ago

It seems to me that the more runsets are executing in parallel, the more prone to failure they are. Notice the two comparisons below. The acapy runset (which is not included in the interop results) runs by itself most of the time, and it passes most of the time.

Runs on 6/16/2024:

acapy aip20 - 20:18:58 - 20:46:01 (27m 03s) - Passed (3 failed)
acapy aip10 - 20:19:07 - 20:28:39 (9m 31s) - Failed
acapy afj - 20:20:44 - 20:28:49 (8m 05s) - Failed
acapy - 20:44:23 - 21:24:20 (39m 57s) - Passed

Runs on 6/14/2024:

acapy aip20 - 20:17:45 - 20:23:15 (5m 30s) - Failed
acapy aip10 - 20:18:04 - 20:22:49 (4m 45s) - Failed
acapy afj - 20:21:38 - 20:21:38 (395ms) - Failed
acapy - 20:34:42 - 21:14:36 (39m 53s) - Passed

Maybe the first step would be to lower max-parallel to 1 to see if things clear up, then start to increase it until we see an issue. Maybe it will work fine with 2 or 3. Worst case scenario, we run one at a time. Currently it is set to 4:

```yaml
    strategy:
      max-parallel: 4
```
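Lowering the value as suggested would be a one-line change in the workflow; a sketch only (the surrounding job and matrix structure from the actual workflow file is omitted and assumed):

```yaml
# Sketch: serialize the runsets first, then raise the value
# gradually (2, 3, ...) until failures reappear.
strategy:
  max-parallel: 1
```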
swcurran commented 5 months ago

First run of the “parallel = 1” GHA looks good, with the results roughly what we would expect. 2 ACA-Py to ACA-Py tests failed, but that is more likely a real problem than the environmental failures we have been seeing.

Now we can focus on getting more tests passing...