Closed itschrispeck closed 2 weeks ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 62.17%. Comparing base (
59551e4
) to head (ef290b9
). Report is 396 commits behind head on master.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
I saw this test being flaky. Do you think it is related?
Error: Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 6.580 s <<< FAILURE! -- in org.apache.pinot.segment.local.realtime.impl.invertedindex.LuceneMutableTextIndexTest
Error: org.apache.pinot.segment.local.realtime.impl.invertedindex.LuceneMutableTextIndexTest.testQueries -- Time elapsed: 0.289 s
Error: org.apache.pinot.segment.local.realtime.impl.invertedindex.LuceneMutableTextIndexTest.testQueryCancellationIsSuccessful -- Time elapsed: 0.032 s <<< FAILURE!
org.testng.TestException:
The exception was thrown with the wrong message: expected ".*TEXT_MATCH query timeout on realtime consuming segment.*" but got "java.util.concurrent.RejectedExecutionException: java.lang.InterruptedException"
at org.testng.internal.invokers.ExpectedExceptionsHolder.wrongException(ExpectedExceptionsHolder.java:69)
at org.testng.internal.invokers.TestInvoker.considerExceptionsInternal(TestInvoker.java:885)
at org.testng.internal.invokers.TestInvoker.considerExceptions(TestInvoker.java:860)
at org.testng.internal.invokers.TestInvoker.invokeMethod(TestInvoker.java:741)
at org.testng.internal.invokers.TestInvoker.invokeTestMethod(TestInvoker.java:230)
at org.testng.internal.invokers.MethodRunner.runInSequence(MethodRunner.java:63)
at org.testng.internal.invokers.TestInvoker$MethodInvocationAgent.invoke(TestInvoker.java:992)
at org.testng.internal.invokers.TestInvoker.invokeTestMethods(TestInvoker.java:203)
at org.testng.internal.invokers.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:154)
at org.testng.internal.invokers.TestMethodWorker.run(TestMethodWorker.java:134)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
at org.testng.TestRunner.privateRun(TestRunner.java:739)
at org.testng.TestRunner.run(TestRunner.java:614)
at org.testng.SuiteRunner.runTest(SuiteRunner.java:421)
at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:413)
at org.testng.SuiteRunner.privateRun(SuiteRunner.java:373)
at org.testng.SuiteRunner.run(SuiteRunner.java:312)
at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:95)
at org.testng.TestNG.runSuitesSequentially(TestNG.java:1274)
at org.testng.TestNG.runSuitesLocally(TestNG.java:1208)
at org.testng.TestNG.runSuites(TestNG.java:1112)
at org.testng.TestNG.run(TestNG.java:1079)
at org.apache.maven.surefire.testng.TestNGExecutor.run(TestNGExecutor.java:155)
at org.apache.maven.surefire.testng.TestNGDirectoryTestSuite.executeSingleClass(TestNGDirectoryTestSuite.java:102)
at org.apache.maven.surefire.testng.TestNGDirectoryTestSuite.execute(TestNGDirectoryTestSuite.java:91)
at org.apache.maven.surefire.testng.TestNGProvider.invoke(TestNGProvider.java:137)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:385)
at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:162)
at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:507)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:495)
Caused by: java.util.concurrent.ExecutionException: java.util.concurrent.RejectedExecutionException: java.lang.InterruptedException
at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
at org.apache.pinot.segment.local.realtime.impl.invertedindex.LuceneMutableTextIndexTest.testQueryCancellationIsSuccessful(LuceneMutableTextIndexTest.java:105)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at org.testng.internal.invokers.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:141)
at org.testng.internal.invokers.TestInvoker.invokeMethod(TestInvoker.java:686)
... 27 more
Caused by: java.util.concurrent.RejectedExecutionException: java.lang.InterruptedException
at org.apache.pinot.common.utils.ScalingThreadPoolExecutor$ForceQueuePolicy.rejectedExecution(ScalingThreadPoolExecutor.java:93)
at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:841)
at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1376)
at java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:145)
at org.apache.pinot.segment.local.realtime.impl.invertedindex.RealtimeLuceneTextIndex.getDocIds(RealtimeLuceneTextIndex.java:149)
at org.apache.pinot.segment.local.realtime.impl.invertedindex.LuceneMutableTextIndexTest.lambda$testQueryCancellationIsSuccessful$0(LuceneMutableTextIndexTest.java:103)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.lang.InterruptedException
at java.base/java.util.concurrent.locks.ReentrantLock$Sync.lockInterruptibly(ReentrantLock.java:159)
at java.base/java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:372)
at java.base/java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:332)
at org.apache.pinot.common.utils.ScalingThreadPoolExecutor$ForceQueuePolicy.rejectedExecution(ScalingThreadPoolExecutor.java:90)
... 9 more
I don't think it's related, but is probably caused by this behavior: https://julien.ponge.org/blog/not-all-java-single-threaded-executors-are-created-equal-a-java-finalizer-horror-story/
Did you see the flakiness is Java 21? It should be fixed there, but could misbehave in the suites run with 11/17.
Do you mind if I put the fix into another PR since it's not related to the fix for the reuse feature, or want me to bundle it into this one? (there's a couple other nits I'd rather bundle together if no preference).
A general question: if we turn off merge, will it result in too many Lucene segments for realtime part and cause query perf issues?
A general question: if we turn off merge, will it result in too many Lucene segments for realtime part and cause query perf issues?
In theory it could, but in practice Pinot's typical segment sizes prevent this (>5-10GB segments isn't normal), and we don't see a query perf regression with even a small 10MB buffer size and 1.5GB segment size.
FDs are probably more of a concern for systems w/ low limits, but I think can be addressed if needed w/ lucene buffer size/compound file format/Pinot segment size tuning - all of which have existing configs.
https://github.com/apache/pinot/pull/12744 changed the Lucene index build process to commit the realtime Lucene index, and reuse it in the immutable segment.
.commit()
is called at the beginning of segment conversion, and the Lucene index directory is copied to another location. SinceIndexWriter
can merge segments in the background even without insertions/updates to the index, these merges non-deterministically can trigger aFileNotFoundException
:By using
NoMergeScheduler
, the async merges will never occur. This patch was tested in our staging/prod env and the intermittent segment build failures disappeared.tag:
bugfix