gluon was recently packaged in Debian and also synced to the Ubuntu archive.
We noticed that tests were unreliable.
Here are 2 examples of the package building, resulting in 2 different sets of tests failing:
The third time the build was successful. This is not critical since we now have a package, but it will surely be problematic during future upgrades.

I tried to reproduce locally, especially with the `-race` flag. This flag is not present in our build, but since I suspect a race condition between several tests, I thought it could make the problem easier to trigger. No luck so far. I noticed you run the tests in CI with this flag, so maybe you have already encountered the problem?

In the first build attempt:

- For `TestDeleteSelectedMailboxWithRemoteUpdateCausesDisconnect`, it looks like the IMAP server did not have time to process the mailbox creation before the test tried to select it.
- For `TestDraftScenario`, it also looks like the IMAP server was one step behind the test.

In the second build attempt:

- `TestRemoteDeletionPool` was stuck and was killed by the timeout. I also noticed you raised this timeout to 15 min in your CI, but when this test runs fine it takes 0.03s, so raising the timeout should not really solve the issue.

Mailboxes are created using `utils.NewRandomMailboxID()` in the tests, so independent tests should not impact each other.

I also checked the connection management (looking at `runOneToOneTest()` and `withConnections()`) and did not see any obvious bug such as sharing a connection between tests, reusing a connection, or forgetting to check connection closing errors.

Maybe the IMAP server does not always answer requests in the same order?
Let me know if you need more details or have other ideas to try and reproduce to pinpoint the issue.