Closed asfimport closed 17 years ago
Michael McCandless (@mikemccand) (migrated from JIRA)
New patch (take 7).
I folded in Ning's comments (above) and Yonik's comments from
other small issues. All tests pass on Linux, OS X, win32, with either SerialMergeScheduler or ConcurrentMergeScheduler as the default.
I plan to commit in a few days' time...
Ning Li (migrated from JIRA)
Access of mergeThreads in ConcurrentMergeScheduler.merge() should be synchronized.
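A minimal sketch of the kind of guard being asked for, assuming the scheduler keeps its running merge threads in a plain list. `TrackedMergeThreads`, `startMerge`, and `runningCount` are hypothetical names for illustration, not the code in the patch:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a scheduler that tracks its merge threads.
class TrackedMergeThreads {
    // Shared state: every read and write happens under this object's monitor.
    private final List<Thread> mergeThreads = new ArrayList<>();

    // Registers and starts a merge thread while holding the monitor.
    synchronized Thread startMerge(Runnable mergeWork) {
        Thread t = new Thread(() -> {
            try {
                mergeWork.run();
            } finally {
                // Deregister under the same monitor when the merge ends.
                remove(Thread.currentThread());
            }
        });
        mergeThreads.add(t);
        t.start();
        return t;
    }

    private synchronized void remove(Thread t) {
        mergeThreads.remove(t);
    }

    synchronized int runningCount() {
        return mergeThreads.size();
    }
}
```

Without the `synchronized` keywords, two threads mutating the `ArrayList` at once could corrupt it or observe stale contents.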
Michael McCandless (@mikemccand) (migrated from JIRA)
Ahh, good catch. Will fix!
Ning Li (migrated from JIRA)
Hmm, it's actually possible to have concurrent merges with SerialMergeScheduler.
Making SerialMergeScheduler.merge synchronize on SerialMergeScheduler will serialize all merges. A merge can still be concurrent with a ram flush.
Making SerialMergeScheduler.merge synchronize on IndexWriter will serialize all merges and ram flushes.
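The two locking choices can be sketched as follows. This is an illustration under stated assumptions, not the patch: `WriterStub` and `SerialSchedulerSketch` are hypothetical stand-ins, and the flush is assumed to take the writer's own monitor.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical writer whose flush takes the writer's own monitor.
class WriterStub {
    synchronized void flush(Runnable flushWork) {
        flushWork.run();
    }
}

// Hypothetical scheduler showing both locking options.
class SerialSchedulerSketch {
    private final AtomicInteger active = new AtomicInteger();
    private final AtomicInteger maxActive = new AtomicInteger();

    // Option 1: lock on the scheduler. Merges exclude each other,
    // but a flush (which locks the writer) can still overlap a merge.
    synchronized void mergeLockingScheduler(Runnable mergeWork) {
        track(mergeWork);
    }

    // Option 2: lock on the writer. Merges and flushes all exclude each other.
    void mergeLockingWriter(WriterStub writer, Runnable mergeWork) {
        synchronized (writer) {
            track(mergeWork);
        }
    }

    // Records how many merges were ever in flight at the same time.
    private void track(Runnable mergeWork) {
        int now = active.incrementAndGet();
        maxActive.accumulateAndGet(now, Math::max);
        try {
            mergeWork.run();
        } finally {
            active.decrementAndGet();
        }
    }

    int maxConcurrentMerges() {
        return maxActive.get();
    }
}
```

A caller would pick one option and use it consistently; mixing the two would let a scheduler-locked merge overlap a writer-locked one.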
Michael McCandless (@mikemccand) (migrated from JIRA)
> Hmm, it's actually possible to have concurrent merges with SerialMergeScheduler.
This was actually intentional: I thought it fine to let merges run concurrently if the application is sending multiple threads into IndexWriter, because the application can always back down to a single thread to get everything serialized if that's really required?
Ning Li (migrated from JIRA)
> This was actually intentional: I thought it fine if the application is sending multiple threads into IndexWriter to allow merges to run concurrently. Because, the application can always back down to a single thread to get everything serialized if that's really required?
Today, applications use multiple threads on IndexWriter to get some concurrency on document parsing. With this patch, applications that want concurrent merges would simply use ConcurrentMergeScheduler, no?
Or a rename since it doesn't really serialize merges?
Mark Miller (@markrmiller) (migrated from JIRA)
I have to triple check, but at first glance, my app's performance halved using the ConcurrentMergeScheduler on a recent Core Duo with 2 GB RAM (as compared to the SerialMergeScheduler). Seems unexpected?
Michael McCandless (@mikemccand) (migrated from JIRA)
> Today, applications use multiple threads on IndexWriter to get some concurrency on document parsing. With this patch, applications that want concurrent merges would simply use ConcurrentMergeScheduler, no?
True. OK I will make SerialMergeScheduler.merge serialized. This way only one merge can happen at a time even when the application is using multiple threads.
Michael McCandless (@mikemccand) (migrated from JIRA)
> I have to triple check, but at first glance, my app's performance halved using the ConcurrentMergeScheduler on a recent Core Duo with 2 GB RAM (as compared to the SerialMergeScheduler). Seems unexpected?
Whoa, that's certainly unexpected! I'll go re-run my perf test.
Mark Miller (@markrmiller) (migrated from JIRA)
Looks like some anomalous tests. Last night I checked twice, but today results are: 58 to 48 in favor of Concurrent. I am going to assume my first results were invalid. Sorry for the noise and thanks for the great patch. Has passed quite a few stress tests I run on my app without any problems so far. Do both merge policies allow for a closer to constant add time or is it just the Concurrent policy?
Michael McCandless (@mikemccand) (migrated from JIRA)
> Looks like some anomalous tests. Last night I checked twice, but today results are: 58 to 48 in favor of Concurrent. I am going to assume my first results were invalid. Sorry for the noise and thanks for the great patch.
OK, phew!
> Has passed quite a few stress tests I run on my app without any problems so far.
I'm glad to hear that :) Thanks for being such an early adopter!
> Do both merge policies allow for a closer to constant add time or is it just the Concurrent policy?
Not sure I understand the question – you mean addDocument? Yes it's only ConcurrentMergeScheduler that should keep addDocument calls constant time, because SerialMergeScheduler will hijack the addDocument thread to do its merges.
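To illustrate the difference described above, here is a rough sketch of a scheduler that merges on the caller's thread versus one that hands the merge to a background thread. `SchedulerSketch`, `ForegroundScheduler`, and `BackgroundScheduler` are hypothetical names, and the single-thread executor is an assumption made for brevity, not how the patch manages its threads:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical scheduler contract: the returned handle completes
// when the merge has finished.
interface SchedulerSketch {
    Future<?> merge(Runnable mergeWork);
}

// Runs the merge on the calling thread, so the caller (for example a
// thread that is adding documents) is blocked until the merge is done.
class ForegroundScheduler implements SchedulerSketch {
    public Future<?> merge(Runnable mergeWork) {
        mergeWork.run();
        return CompletableFuture.completedFuture(null);
    }
}

// Hands the merge to a background thread and returns immediately,
// so the caller's latency does not include the merge itself.
class BackgroundScheduler implements SchedulerSketch, AutoCloseable {
    private final ExecutorService pool = Executors.newSingleThreadExecutor();

    public Future<?> merge(Runnable mergeWork) {
        return pool.submit(mergeWork);
    }

    public void close() {
        pool.shutdown();
    }
}
```

With the foreground variant, an unlucky add call pays for the whole merge; with the background variant it only pays for handing off the work.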
Michael McCandless (@mikemccand) (migrated from JIRA)
Attached take8, incorporating Ning's feedback plus some small refactoring and fixing one case where optimize() would do an unnecessary merge.
If we factor the merge policy out of IndexWriter, we can make it pluggable, making it possible for apps to choose a custom merge policy and for easier experimenting with merge policy variants.
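A minimal sketch of what "pluggable" could look like, assuming a policy is just a strategy object the writer consults after each flush. `PolicySketch`, `MergeWhenTooMany`, and `WriterSketch` are hypothetical names and do not reflect the interfaces in the attached patches; segments are modeled as plain document counts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical policy contract: given the current segment sizes, return the
// indexes (in ascending order) of segments to merge, or an empty list.
interface PolicySketch {
    List<Integer> selectMerge(List<Integer> segmentSizes);
}

// Example policy: merge everything once there are too many segments.
class MergeWhenTooMany implements PolicySketch {
    private final int maxSegments;

    MergeWhenTooMany(int maxSegments) {
        this.maxSegments = maxSegments;
    }

    public List<Integer> selectMerge(List<Integer> segmentSizes) {
        List<Integer> picked = new ArrayList<>();
        if (segmentSizes.size() > maxSegments) {
            for (int i = 0; i < segmentSizes.size(); i++) {
                picked.add(i);
            }
        }
        return picked;
    }
}

// Hypothetical writer that delegates the merge decision to its policy.
class WriterSketch {
    private final List<Integer> segmentSizes = new ArrayList<>();
    private PolicySketch policy;

    WriterSketch(PolicySketch policy) {
        this.policy = policy;
    }

    void setPolicy(PolicySketch policy) {
        this.policy = policy;
    }

    // Adds a new segment, then asks the policy whether to merge.
    void flushSegment(int docCount) {
        segmentSizes.add(docCount);
        List<Integer> toMerge = policy.selectMerge(segmentSizes);
        if (toMerge.size() > 1) {
            int merged = 0;
            // Remove from the highest index down so earlier indexes stay valid.
            for (int i = toMerge.size() - 1; i >= 0; i--) {
                int idx = toMerge.get(i);
                merged += segmentSizes.remove(idx);
            }
            segmentSizes.add(merged);
        }
    }

    List<Integer> segments() {
        return new ArrayList<>(segmentSizes);
    }
}
```

Swapping in a different policy is then a one-line change, for example `writer.setPolicy(sizes -> new ArrayList<>())` for a policy that never merges.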
Migrated from LUCENE-847 by Steven Parkes, resolved Sep 18 2007
Attachments: concurrentMerge.patch, LUCENE-847.patch.txt (versions: 2), LUCENE-847.take3.patch, LUCENE-847.take4.patch, LUCENE-847.take5.patch, LUCENE-847.take6.patch, LUCENE-847.take7.patch, LUCENE-847.take8.patch, LUCENE-847.txt
Linked issues:
1920
1945