apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0

Flaky Tests: TPCHQueryIntegrationTest and pinot-kafka-3.0 #14099

Open ankitsultana opened 1 month ago

ankitsultana commented 1 month ago

Saw these failures as part of #14064.

TPCH

Error:    TPCHQueryIntegrationTest.testTPCHQueries:111->testQueriesSucceed:115 » PinotClient java.util.concurrent.TimeoutException

pinot-kafka-3.0

21:49:15.240 ERROR [KafkaServer] [main] Fatal error during KafkaServer startup. Prepare to shutdown
kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING
    at kafka.zookeeper.ZooKeeperClient.$anonfun$waitUntilConnected$3(ZooKeeperClient.scala:257) ~[kafka_2.12-3.8.0.jar:?]
    at kafka.zookeeper.ZooKeeperClient.waitUntilConnected(ZooKeeperClient.scala:253) ~[kafka_2.12-3.8.0.jar:?]
    at kafka.zookeeper.ZooKeeperClient.<init>(ZooKeeperClient.scala:115) ~[kafka_2.12-3.8.0.jar:?]
    at kafka.zk.KafkaZkClient$.apply(KafkaZkClient.scala:2317) ~[kafka_2.12-3.8.0.jar:?]
    at kafka.zk.KafkaZkClient$.createZkClient(KafkaZkClient.scala:2407) ~[kafka_2.12-3.8.0.jar:?]
    at kafka.server.KafkaServer.initZkClient(KafkaServer.scala:742) ~[kafka_2.12-3.8.0.jar:?]
    at kafka.server.KafkaServer.startup(KafkaServer.scala:230) ~[kafka_2.12-3.8.0.jar:?]
    at org.apache.pinot.plugin.stream.kafka30.utils.MiniKafkaCluster.start(MiniKafkaCluster.java:94) ~[test-classes/:?]
    at org.apache.pinot.plugin.stream.kafka30.KafkaPartitionLevelConsumerTest.setUp(KafkaPartitionLevelConsumerTest.java:69) ~[test-classes/:?]
...
Error:  Tests run: 8, Failures: 1, Errors: 0, Skipped: 7, Time elapsed: 39.93 s <<< FAILURE! -- in org.apache.pinot.plugin.stream.kafka30.KafkaPartitionLevelConsumerBackwardCompatibilityTest
Error:  org.apache.pinot.plugin.stream.kafka30.KafkaPartitionLevelConsumerBackwardCompatibilityTest.setUp -- Time elapsed: 39.72 s <<< FAILURE!
kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING
    at kafka.zookeeper.ZooKeeperClient.$anonfun$waitUntilConnected$3(ZooKeeperClient.scala:257)
    at kafka.zookeeper.ZooKeeperClient.waitUntilConnected(ZooKeeperClient.scala:253)
    at kafka.zookeeper.ZooKeeperClient.<init>(ZooKeeperClient.scala:115)
    at kafka.zk.KafkaZkClient$.apply(KafkaZkClient.scala:2317)
    at kafka.zk.KafkaZkClient$.createZkClient(KafkaZkClient.scala:2407)
    at kafka.server.KafkaServer.initZkClient(KafkaServer.scala:742)
    at kafka.server.KafkaServer.startup(KafkaServer.scala:230)
    at org.apache.pinot.plugin.stream.kafka30.utils.MiniKafkaCluster.start(MiniKafkaCluster.java:94)
    at org.apache.pinot.plugin.stream.kafka30.KafkaPartitionLevelConsumerTest.setUp(KafkaPartitionLevelConsumerTest.java:69)
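
The trace above only shows that the Kafka broker gave up while its ZooKeeper client was still in the CONNECTING state; it does not say why ZooKeeper was unreachable. One way to narrow this down is to probe the ZooKeeper port before starting the broker. The following is a minimal sketch using only the JDK. The class name `PortProbe`, the method `waitForPort`, and the retry intervals are hypothetical and not part of Pinot or Kafka.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Pinot or Kafka: polls a TCP port until it
// accepts a connection or the overall timeout elapses.
public class PortProbe {

  static boolean waitForPort(String host, int port, long timeoutMs) throws InterruptedException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (System.nanoTime() - deadline < 0) {
      try (Socket socket = new Socket()) {
        // Per-attempt connect timeout of 500 ms (an arbitrary choice).
        socket.connect(new InetSocketAddress(host, port), 500);
        return true;
      } catch (IOException e) {
        // Nothing is listening yet (or the connect timed out): back off and retry.
        Thread.sleep(100);
      }
    }
    return false;
  }
}
```

If a test setup called something like `waitForPort("localhost", zkPort, 30_000)` before starting the broker and it returned false, that would point at the embedded ZooKeeper not listening rather than at the Kafka side. Here `zkPort` stands for whatever port the embedded ZooKeeper was configured with.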

pinot-kafka-2.0

2024-09-27T11:41:19.469289580Z main ERROR Reconfiguration failed: No configuration found for '4e0e2f2a' at 'null' in 'null'
11:41:19.961 WARN [KafkaStreamMetadataProvider] [main] initial offset type is timestamp and its value evaluates to null hence proceeding with offset 1000 for topic foo partition 0
11:41:20.120 WARN [KafkaStreamMetadataProvider] [main] initial offset type is timestamp and its value evaluates to null hence proceeding with offset 1000 for topic bar partition 0
11:41:20.175 WARN [KafkaStreamMetadataProvider] [main] initial offset type is timestamp and its value evaluates to null hence proceeding with offset 1000 for topic bar partition 1
11:41:21.468 ERROR [Util] [main] Last transaction was partial.
11:41:21.489 ERROR [Util] [main] Last transaction was partial.
Error:  Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 60.79 s <<< FAILURE! -- in org.apache.pinot.plugin.stream.kafka20.KafkaPartitionLevelConsumerTest
Error:  org.apache.pinot.plugin.stream.kafka20.KafkaPartitionLevelConsumerTest.testBuildConsumer -- Time elapsed: 20.50 s
Error:  org.apache.pinot.plugin.stream.kafka20.KafkaPartitionLevelConsumerTest.testConsumer -- Time elapsed: 0.691 s
Error:  org.apache.pinot.plugin.stream.kafka20.KafkaPartitionLevelConsumerTest.testFetchMessages -- Time elapsed: 10.02 s
Error:  org.apache.pinot.plugin.stream.kafka20.KafkaPartitionLevelConsumerTest.testFetchOffsets -- Time elapsed: 0.574 s
Error:  org.apache.pinot.plugin.stream.kafka20.KafkaPartitionLevelConsumerTest.testGetPartitionCount -- Time elapsed: 0.079 s
Error:  org.apache.pinot.plugin.stream.kafka20.KafkaPartitionLevelConsumerTest.testOffsetsExpired -- Time elapsed: 0.118 s
Error:  org.apache.pinot.plugin.stream.kafka20.KafkaPartitionLevelConsumerTest.tearDown -- Time elapsed: 2.822 s <<< FAILURE!
org.apache.commons.io.IOExceptionList: 2 exception(s): [org.apache.commons.io.IOIndexedException: IOException #0: Cannot delete file: /tmp/EmbeddedZooKeeper/log, org.apache.commons.io.IOIndexedException: IOException #1: Cannot delete file: /tmp/EmbeddedZooKeeper/data]
    at org.apache.commons.io.IOExceptionList.checkEmpty(IOExceptionList.java:50)
    at org.apache.commons.io.function.IOStream.forAll(IOStream.java:357)
    at org.apache.commons.io.function.IOStreams.forAll(IOStreams.java:42)
    at org.apache.commons.io.function.IOStreams.forAll(IOStreams.java:36)
    at org.apache.commons.io.function.IOConsumer.forAll(IOConsumer.java:80)
    at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:367)
    at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1242)
    at org.apache.pinot.plugin.stream.kafka.utils.EmbeddedZooKeeper.close(EmbeddedZooKeeper.java:54)
    at org.apache.pinot.plugin.stream.kafka20.utils.MiniKafkaCluster.close(MiniKafkaCluster.java:102)
    at org.apache.pinot.plugin.stream.kafka20.KafkaPartitionLevelConsumerTest.tearDown(KafkaPartitionLevelConsumerTest.java:107)
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
Caused by: org.apache.commons.io.IOIndexedException: IOException #0: Cannot delete file: /tmp/EmbeddedZooKeeper/log
    at org.apache.commons.io.function.IOStream.lambda$forAll$11(IOStream.java:352)
    at java.base/java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:1024)
    at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:762)
    at org.apache.commons.io.function.IOStream.forAll(IOStream.java:343)
    ... 38 more
Caused by: java.io.IOException: Cannot delete file: /tmp/EmbeddedZooKeeper/log
    at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:1392)
    at org.apache.commons.io.function.IOStream.lambda$forAll$11(IOStream.java:345)
    ... 41 more
Caused by: java.nio.file.NoSuchFileException: /tmp/EmbeddedZooKeeper/log/version-2
    at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
    at java.base/sun.nio.fs.UnixException.asIOException(UnixException.java:115)
    at java.base/sun.nio.fs.UnixFileSystemProvider.newDirectoryStream(UnixFileSystemProvider.java:477)
    at java.base/java.nio.file.Files.newDirectoryStream(Files.java:481)
    at org.apache.commons.io.file.PathUtils.isEmptyDirectory(PathUtils.java:1075)
    at org.apache.commons.io.file.DeletingPathVisitor.postVisitDirectory(DeletingPathVisitor.java:139)
    at org.apache.commons.io.file.DeletingPathVisitor.postVisitDirectory(DeletingPathVisitor.java:37)
    at java.base/java.nio.file.Files.walkFileTree(Files.java:2803)
    at java.base/java.nio.file.Files.walkFileTree(Files.java:2857)
    at org.apache.commons.io.file.PathUtils.visitFileTree(PathUtils.java:1730)
    at org.apache.commons.io.file.PathUtils.deleteDirectory(PathUtils.java:518)
    at org.apache.commons.io.file.PathUtils.delete(PathUtils.java:477)
    at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:1388)
    ... 42 more
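
The innermost cause above is a `NoSuchFileException` for `/tmp/EmbeddedZooKeeper/log/version-2` raised while the directory tree was being walked for deletion. That is consistent with something else removing the same fixed path concurrently, although the log alone does not prove it. A minimal sketch of two possible mitigations, assuming that hypothesis: give each test its own temp directory, and make the cleanup tolerate entries that have already disappeared. The class and method names are hypothetical; this is not the fix that was applied in Pinot.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper, not part of Pinot: per-test temp directories plus a
// recursive delete that ignores entries removed by someone else mid-walk.
public class TempDirCleanup {

  // Creates a uniquely named directory under java.io.tmpdir, so two tests
  // never share a fixed path such as /tmp/EmbeddedZooKeeper.
  static Path createUniqueDir(String prefix) throws IOException {
    return Files.createTempDirectory(prefix);
  }

  static void deleteTolerant(Path root) throws IOException {
    Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
      @Override
      public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
        Files.deleteIfExists(file);
        return FileVisitResult.CONTINUE;
      }

      @Override
      public FileVisitResult visitFileFailed(Path file, IOException exc) throws IOException {
        // An entry that vanished during the walk is already "deleted".
        if (exc instanceof NoSuchFileException) {
          return FileVisitResult.CONTINUE;
        }
        throw exc;
      }

      @Override
      public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
        if (exc != null && !(exc instanceof NoSuchFileException)) {
          throw exc;
        }
        Files.deleteIfExists(dir);
        return FileVisitResult.CONTINUE;
      }
    });
  }
}
```

Calling `deleteTolerant` on a path that no longer exists is a no-op rather than a failure, so a `tearDown` built on it would not fail just because the directory is already gone.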
abhioncbr commented 1 month ago

I added the module recently; let me look into it. Thanks.

ankitsultana commented 1 month ago

@abhioncbr: Shaurya was planning to pick up the TPCH one. Does that work? @shauryachats

There's also an error with pinot-kafka-2.0. I just added it to the list.

abhioncbr commented 1 month ago

Got it. Thanks for the additional context.

Jackie-Jiang commented 4 days ago

The Kafka one was fixed in #14458.