Open ebyhr opened 1 month ago
From the error log, it looks like the master is very slow to start in the GitHub environment, so it takes a long time before the master is ready to serve RPCs. That is why we see the UNIMPLEMENTED
error. We can probably set a longer timeout or try to allocate more resources for the test.
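To illustrate the "longer timeout" idea, here is a minimal sketch of a readiness wait that keeps probing until a deadline instead of failing on the first handshake error. It is not the code used in the Trino test; the class `ReadinessWait`, the method `waitUntilReady`, and the probe are hypothetical names for illustration only. In the actual test the equivalent knob would more likely be the Testcontainers wait strategy's startup timeout.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Trino, Alluxio, or Testcontainers.
public class ReadinessWait {
    /**
     * Polls the probe until it returns true or the timeout elapses.
     * A probe that throws (for example, a handshake that fails while the
     * master is still starting) is treated as "not ready yet".
     * Returns true if the probe succeeded before the deadline.
     */
    public static boolean waitUntilReady(BooleanSupplier probe, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            try {
                if (probe.getAsBoolean()) {
                    return true;
                }
            } catch (RuntimeException e) {
                // Not ready yet: fall through and retry until the deadline.
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for "master is ready to serve RPCs": succeeds on the third probe.
        int[] calls = {0};
        boolean ready = waitUntilReady(() -> ++calls[0] >= 3, Duration.ofSeconds(5), Duration.ofMillis(10));
        System.out.println("ready=" + ready + " after " + calls[0] + " probes");
        // prints "ready=true after 3 probes"
    }
}
```

A longer `timeout` only helps if the master eventually starts. If the runner is simply short on resources, the wait will still expire, just later.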
alluxio.exception.status.UnavailableException: Failed to handshake with master localhost:19998 to load cluster default configuration values: UNIMPLEMENTED: Method not found: alluxio.grpc.meta.MetaMasterConfigurationService/GetConfiguration
at
@jja725 Could you run stress tests locally and send a PR? This test is very flaky. I hope you will fix it soon.
@jja725 Reopening, as it happened again. Please take another look.
Error: io.trino.filesystem.alluxio.TestAlluxioFileSystem -- Time elapsed: 207.6 s <<< ERROR!
java.util.concurrent.ExecutionException: org.testcontainers.containers.ContainerLaunchException: Container startup failed for image alluxio/alluxio:2.9.5
at java.base/java.util.concurrent.CompletableFuture.wrapInExecutionException(CompletableFuture.java:345)
at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:440)
at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2117)
at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:327)
at org.testcontainers.junit.jupiter.TestcontainersExtension$StoreAdapter.start(TestcontainersExtension.java:276)
at org.testcontainers.junit.jupiter.TestcontainersExtension$StoreAdapter.access$200(TestcontainersExtension.java:263)
at org.testcontainers.junit.jupiter.TestcontainersExtension.lambda$null$4(TestcontainersExtension.java:83)
at org.testcontainers.junit.jupiter.TestcontainersExtension.lambda$startContainers$5(TestcontainersExtension.java:83)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1597)
at org.testcontainers.junit.jupiter.TestcontainersExtension.startContainers(TestcontainersExtension.java:83)
at org.testcontainers.junit.jupiter.TestcontainersExtension.beforeAll(TestcontainersExtension.java:57)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:507)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1458)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2034)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:189)
Caused by: org.testcontainers.containers.ContainerLaunchException: Container startup failed for image alluxio/alluxio:2.9.5
at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:359)
at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:330)
at java.base/java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:831)
at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:526)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1575)
Caused by: org.rnorth.ducttape.RetryCountExceededException: Retry limit hit with exception
at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:88)
at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:344)
... 6 more
Caused by: org.testcontainers.containers.ContainerLaunchException: Could not create/start container
at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:563)
at org.testcontainers.containers.GenericContainer.lambda$doStart$0(GenericContainer.java:354)
at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:81)
... 7 more
Caused by: org.testcontainers.containers.ContainerLaunchException: Timed out waiting for log output matching '.*Primary started*
'
at org.testcontainers.containers.wait.strategy.LogMessageWaitStrategy.waitUntilReady(LogMessageWaitStrategy.java:47)
at org.testcontainers.containers.wait.strategy.AbstractWaitStrategy.waitUntilReady(AbstractWaitStrategy.java:52)
at org.testcontainers.containers.GenericContainer.waitUntilContainerStarted(GenericContainer.java:909)
at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:500)
... 9 more
https://github.com/trinodb/trino/actions/runs/11157959013/job/31013228602
@jja725 This test is still flaky. Please take another look.
@JiamingMai Do you mind taking a look, since I will not be available for a while? We can probably increase the timeout, as in the previous fix.
https://github.com/trinodb/trino/actions/runs/11713967184/job/32627752858 @JiamingMai @jja725
https://github.com/trinodb/trino/actions/runs/11073808274/job/30771180097