Closed quantranhong1999 closed 2 weeks ago
1st build -> green
2nd run -> seems to hang, but without the previous error
3rd run -> failed
[ERROR] Errors:
[ERROR] com.linagora.tmail.james.DistributedLinagoraCalendarEventAcceptMethodTest.null
[ERROR] Run 1: DistributedLinagoraCalendarEventAcceptMethodTest » AllNodesFailed Could not reach any contact point, make sure you've provided valid addresses (showing first 1 nodes, use getAllErrors() for more): Node(endPoint=192.168.0.1/<unresolved>:54132, hostId=null, hashCode=46384550): [com.datastax.oss.driver.api.core.connection.ConnectionInitException: [s68|control|connecting...] Protocol initialization request, step 1 (OPTIONS): failed to send request (io.netty.channel.StacklessClosedChannelException)]
[ERROR] Run 2: DistributedLinagoraCalendarEventAcceptMethodTest » AllNodesFailed Could not reach any contact point, make sure you've provided valid addresses (showing first 1 nodes, use getAllErrors() for more): Node(endPoint=192.168.0.1/<unresolved>:54132, hostId=null, hashCode=7c3c9e19): [com.datastax.oss.driver.api.core.connection.ConnectionInitException: [s74|control|connecting...] Protocol initialization request, step 1 (OPTIONS): failed to send request (io.netty.channel.StacklessClosedChannelException)]
It seems that retrying the tests did not help. Cassandra still cannot start properly.
BTW, I noticed that the build usually fails on node ci-james-03, while it succeeds on ci-james-06.
I will try lowering the fork usage to see whether that helps...
1st build (ci-james-06) with reuseForks disabled: [INFO] Team-mail :: Integration Tests :: JMAP :: Distributed SUCCESS [20:19 min]
Can we test with reuseForks = true, forkCount = 1?
There you go: https://github.com/linagora/tmail-backend/pull/1102/commits/3fa6aa4ea9111568e39e3388685f9e005b7fa81b
2nd build (on ci-james-03): [INFO] Team-mail :: Integration Tests :: JMAP :: Distributed SUCCESS [20:07 min]
It seems to be more stable.
I could not reproduce https://james-jenkins.lin-saas.com/blue/rest/organizations/jenkins/pipelines/Tmail%20build/branches/PR-1098/runs/2/log/?start=0 locally, so it is a bit hard to debug.
From what I tried to reproduce: likely, concurrent access to DockerCassandraSingleton (e.g. one test tries to stop it while another tries to start it) makes the static initializer block of the DockerCassandraSingleton class fail with ExceptionInInitializerError -> the DockerCassandraSingleton class cannot be initialized properly -> the test hangs.
The solution (hopefully it works) is to escape hanging tests upon ExceptionInInitializerError and retry them.
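To illustrate why a failed static initializer is so disruptive, here is a minimal sketch in plain Java. It is not the actual DockerCassandraSingleton code: FlakySingleton, start() and accessTwice() are hypothetical names invented for this example, and the failure is simulated with an exception. It shows standard JVM behaviour: the first access to the class throws ExceptionInInitializerError, and every later access in the same JVM throws NoClassDefFoundError, because the class stays marked as failed.

```java
public class StaticInitFailureDemo {

    // Hypothetical stand-in for a singleton that starts a container in its static block.
    static class FlakySingleton {
        static final Object INSTANCE;

        static {
            // Simulated startup failure (assumption: the real failure would come from Docker/Cassandra).
            INSTANCE = start();
        }

        private static Object start() {
            throw new IllegalStateException("container failed to start");
        }

        static Object get() {
            return INSTANCE;
        }
    }

    // Touches the class twice and records which error type each access produced.
    static String[] accessTwice() {
        String[] seen = new String[2];
        for (int i = 0; i < 2; i++) {
            try {
                FlakySingleton.get();
                seen[i] = "ok";
            } catch (Throwable t) {
                // Both expected errors are LinkageError subclasses, so Exception would not catch them.
                seen[i] = t.getClass().getSimpleName();
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = accessTwice();
        System.out.println("1st access: " + seen[0]);
        System.out.println("2nd access: " + seen[1]);
    }
}
```

One consequence, if the real singleton behaves like this sketch: a retry inside the same forked JVM would hit the already-failed class again, so the retry may only help when it runs in a fresh JVM.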