hyperledger-archives / aries-framework-dotnet

Aries Framework .NET for building multiplatform SSI services
https://wiki.hyperledger.org/display/aries
Apache License 2.0

Flaky unit tests make CI hard. #45

Closed ryjones closed 4 years ago

ryjones commented 4 years ago

Describe the bug: There are a number of unit tests which fail intermittently.

To Reproduce: I ran a test for a while, and these are the tests that failed, with counts. Here is the script; I ran it on a dedicated Mac Mini for 119 iterations.

Expected behavior: Unit tests are not flaky.

root# system_profiler -detailLevel basic SPHardwareDataType
Hardware:

Hardware Overview:

  Model Name: Mac mini
  Model Identifier: Macmini8,1
  Processor Name: 6-Core Intel Core i5
  Processor Speed: 3 GHz
  Number of Processors: 1
  Total Number of Cores: 6
  L2 Cache (per Core): 256 KB
  L3 Cache: 9 MB
  Memory: 8 GB
  Boot ROM Version: 1037.60.58.0.0 (iBridge: 17.16.12551.0.0,0)
  Serial Number (system): C07Z90KBJYVX
  Hardware UUID: 7AE96617-F7BD-5D77-97AF-9BD913B87B37
  Activation Lock Status: Disabled

tmarkovski commented 4 years ago

Wow, thanks for the exhaustive test, @ryjones. Do you happen to have the details of the failures? I'm curious whether most of the errors are 309 SDK errors, which have been an ongoing source of intermittent failures. This error happens when a unit test moves faster than the running ledger instance can confirm the schema/cred_def.

tmarkovski commented 4 years ago

Never mind, I found the raw logs in the repo. I'll take a look at this today.

ryjones commented 4 years ago

They are 100% 309 SDK errors.

tmarkovski commented 4 years ago

That's encouraging. We'll add some artificial delay in the unit tests so the ledger can catch up.
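
For illustration only, here is a minimal sketch of that idea in Python (not the framework's own .NET code, and not the change that was actually made in the repo). Rather than one fixed sleep, it retries a ledger read a few times with a pause in between, which is one common way to give a slow ledger time to confirm a schema/cred_def. `LedgerNotReadyError`, `wait_for_ledger`, and the `lookup` callable are all hypothetical names standing in for the SDK's 309 error and whatever call a test uses to read from the ledger.

```python
import time


class LedgerNotReadyError(Exception):
    """Hypothetical stand-in for the 309 SDK error discussed in this thread."""


def wait_for_ledger(lookup, attempts=5, delay=0.5, sleep=time.sleep):
    """Call lookup() until it succeeds or the attempts run out.

    lookup   -- hypothetical zero-argument callable that reads from the ledger
                and raises LedgerNotReadyError while the record is unconfirmed
    attempts -- maximum number of calls to lookup()
    delay    -- seconds to pause between attempts
    sleep    -- injectable sleep function, so tests can skip the real wait
    """
    for attempt in range(1, attempts + 1):
        try:
            return lookup()
        except LedgerNotReadyError:
            if attempt == attempts:
                # Out of retries: surface the original error to the test.
                raise
            sleep(delay)
```

A bounded retry like this keeps the fast path fast (no delay when the ledger has already caught up) while still failing loudly if the record never appears. The attempt count and delay are guesses that would need tuning against real CI timings.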

ryjones commented 4 years ago

I added @michaeldboyd as an org owner over here: https://github.com/hyperledger-cicd/aries-framework-dotnet so you can iterate quickly and see what Azure needs in terms of timings

ryjones commented 4 years ago

@tmarkovski Is a 307 different than a 309, in terms of root cause?

tmarkovski commented 4 years ago

> @tmarkovski Is a 307 different than a 309, in terms of root cause?

Yes. A 307 generally implies that the ledger is unavailable, most commonly due to Docker network configuration, unavailable ports, no Wi-Fi, etc.