Closed swcurran closed 1 year ago
looks like a platform issue with github, all those fails (and there is a success in the middle of those runs) are docker image build fails. not that the code can't be built, but that github docker image registry or whatever is struggling.
I don’t think so. I think it is a problem with the ACA-Py backchannel and the change to the thread ID. Take a look at this page that shows just the assertion error from the test that failed: https://allure.vonx.io/allure-docker-service-ui/projects/acapy-aip10/reports/latest
Looking further, but I’m guessing it is something with that change.
I’ll run the tests locally and let you know what I find. Verify consistent results, and then if so, I’ll check a prior commit.
One thing in common with all the latest failed runs is dotnet. I am just looking at the actions in AATH.
in docker action: test-harness-findy-javascript-dotnet
Features/BasicMessage/IBasicMessageService.cs(4,50): error CS1514: { expected [/aries-framework-dotnet/src/Hyperledger.Aries/Hyperledger.Aries.csproj]
Features/BasicMessage/IBasicMessageService.cs(19,2): error CS1513: } expected [/aries-framework-dotnet/src/Hyperledger.Aries/Hyperledger.Aries.csproj]
The command '/bin/sh -c dotnet publish "DotNet.Backchannel.Master.csproj" -c Release -o /app/publish' returned a non-zero code: 1
Docker image build failed.
locally running AATH : ./manage build -a acapy -a dotnet
#15 3.013 Features/BasicMessage/IBasicMessageService.cs(4,50): error CS1514: { expected [/aries-framework-dotnet/src/Hyperledger.Aries/Hyperledger.Aries.csproj]
#15 3.013 Features/BasicMessage/IBasicMessageService.cs(19,2): error CS1513: } expected [/aries-framework-dotnet/src/Hyperledger.Aries/Hyperledger.Aries.csproj]
------
executor failed running [/bin/sh -c dotnet publish "DotNet.Backchannel.Master.csproj" -c Release -o /app/publish]: exit code: 1
Docker image build failed.
So the dotnet tests/actions have not been successful since they added BasicMessageService - 2 months ago.
I don't know what those Allure dashboards tests are and they look like a completely different level of detail so maybe all my comments are meaningless for the actual problem.
The Allure workflows are used to upload the test results to the Allure servers:
The place to look is here: https://aries-interop.info for a summary of the tests and links to Allure.
Based on every second day runs, I update that page — except when the results suddenly change, as happened in the last few days.
If you then navigate into the per-framework page (e.g. clicking on ACA-Py on the main page), you can see the per-runset results, and from there navigate to Allure to see the results from the last 10 runs.
You’ll see, for example, that the ACA-Py to ACA-Py (runset “acapy-aip10”) suddenly started getting failures two test runs ago. Those are the ones I’m interested in — why are the ACA-Py to ACA-Py tests failing? I’ve just tried to run locally the “main” and “0.8.2-rc2” branches and they fail the same way. Trying 0.8.1 as I type. I was sure it was going to be one of the two most recent merges, but evidently not…
About .NET — yes, it has been failing for some time. I’m planning on seeing if we can drop it entirely from AATH.
Interesting…0.8.1 is passing, 0.8.2-rc2 has the failures. Sigh. I’ll try to narrow it to a merge.
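The narrowing step can be sketched as a binary search over an ordered list of candidates. This is only an illustration: the function name, the candidate list and the passes callback are hypothetical, and it assumes the candidates are ordered oldest to newest, the oldest one passes, the newest one fails, and every candidate after the breaking one also fails.

```python
from typing import Callable, Sequence


def first_failing(candidates: Sequence[str], passes: Callable[[str], bool]) -> str:
    """Return the earliest candidate for which passes() is False.

    Assumes candidates[0] is known to pass and candidates[-1] is known to
    fail, so neither endpoint is re-tested.
    """
    lo, hi = 0, len(candidates) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes(candidates[mid]):
            lo = mid  # breakage happened after mid
        else:
            hi = mid  # breakage happened at or before mid
    return candidates[hi]
```

Here passes would wrap whatever check is being used, for example a full runset against that version, so each probe is slow; the point of the search is to keep the number of probes small.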
thanks @WadeBarnes and @swcurran for the context, that https://aries-interop.info/ is super helpful with understanding what is going on.
OK — after some messing around with Docker, I’ve confirmed that #2261 is the change that broke the tests. Doesn’t mean that it is wrong — it could be in the Backchannel.
Process I used was to:
Change the requirements-main.yml file to be the particular branch or commit of interest.
docker image rm acapy-main-agent-backchannel:latest
./manage runset acapy-aip10 -r (-r is rebuild)
./manage build -a acapy-main; ./manage run -d acapy-main -t @T006-RFC0037 -t @AIP10 -t @minor -t @AcceptanceTest -t @Schema_Health_ID -t @Indy -t @ProofProposal
aries-cloudagent-python@main and commit aries-cloudagent-python@88769c9a3e6044ca4b22f08d83520f1553c2f97e
aries-cloudagent-python@0.8.2-rc0
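For repeated runs, the two ./manage invocations above could be assembled by a small helper. This is a sketch only: the helper names are made up for illustration, and it uses just the subcommands and flags quoted in this thread (runset with -r, run with -d and -t).

```python
import shlex
from typing import List, Sequence


def runset_cmd(runset: str, rebuild: bool = False) -> List[str]:
    """Build the argv for './manage runset <runset>' with optional -r."""
    cmd = ["./manage", "runset", runset]
    if rebuild:
        cmd.append("-r")
    return cmd


def run_cmd(agent: str, tags: Sequence[str]) -> List[str]:
    """Build the argv for './manage run -d <agent>' with one -t per tag."""
    cmd = ["./manage", "run", "-d", agent]
    for tag in tags:
        cmd += ["-t", tag]
    return cmd


if __name__ == "__main__":
    # Print the commands rather than executing them.
    print(shlex.join(runset_cmd("acapy-aip10", rebuild=True)))
    print(shlex.join(run_cmd("acapy-main", ["@T006-RFC0037", "@AIP10"])))
```

The argv lists can be passed to subprocess.run from the AATH checkout directory if you want to execute them instead of printing them.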
@usingtechnology, can you please take a look? FYI: with AATH, the logs are at .logs.
I’m sure there are ways to debug, but I don’t know them...
ok, thanks for the process. i'll dig in.
I've been scanning the logs from the passing and failing runs, and I’m not seeing anything. My bet is that the backchannel is expecting the empty ~thread item, but I can’t see it in the logs :-). I assume it is Bob that would be having the problem, but who knows :-).
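To illustrate the kind of assumption being guessed at here: code that reads a thread ID can tolerate a missing or empty ~thread item by falling back to the message's own @id. The sketch below is hypothetical and is not the backchannel's code or the fix that was eventually made; the function name and message shapes are assumptions for illustration.

```python
from typing import Any, Dict


def thread_id(message: Dict[str, Any]) -> str:
    """Return the thread ID of a message.

    Falls back to the message's own @id when ~thread is absent, empty,
    or has no thid.
    """
    thread = message.get("~thread") or {}
    return thread.get("thid") or message["@id"]
```

A reader written this way gives the same answer whether or not the sender includes an empty ~thread, which is the difference suspected above.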
@swcurran - do you have a set of tests that I can run to hit all the other failures? running it wide open takes too long, and there do appear to be irrelevant failures.
If you run ./manage runset acapy-aip10, all the tests should pass; they had been before this change. Takes a long time, but you can do other things, hopefully (that’s why I have two machines :-) ).
With runset, you can add -b (build) or -r (rebuild) to the end of the command.
Thanks for the runset information. I have added a PR to AATH.
Fix was very simple but time-consuming to track down and regression test. But I know a lot more about AATH now!
Nice work! Closing this. Thanks.
For the last two runs of AATH, a number of tests are failing that had been working before. Please investigate the runs to determine the source of the failures and fix what is needed to address the problem (ACA-Py, ACA-Py backchannel, the tests, etc.).