Closed GoogleCodeExporter closed 9 years ago
Attached is a patch with the following changes:
1. Upgrades H2 to use Lucene 3.0.0.
2. Caches IndexSearchers for each path so that it can be reused by index
searches.
Original comment by mbis...@gmail.com
on 8 Dec 2009 at 3:12
Attachments:
Sorry for the delay.
Does your patch require Lucene 3.0? If yes,
I would like to wait a bit longer (Lucene 3 is relatively new).
Not much, but maybe a week or so.
Original comment by thomas.t...@gmail.com
on 15 Dec 2009 at 4:52
I think this should wait a bit longer than that. The new single Reader
interface in
3.0 that this patch uses is pretty new, and I am not completely sure I am using
it
correctly. I have been getting results that I don't expect sometimes in my
application when compared to Lucene 2.2.
Original comment by mbis...@gmail.com
on 16 Dec 2009 at 7:38
This patch includes the correct usage of IndexWriter.getReader(). My testing
process was as follows:
1. Run an app that inserts rows continually; these rows contain indexable
fields.
2. While that app is running, use the h2 console to run 'SELECT * FROM
FTL_SEARCH(...)' with a query that many
of the rows would have a hit generated.
3. Keep running the query in the h2 console while adds progress. The result
list grew as the number of rows
grew.
Original comment by mbis...@gmail.com
on 20 Dec 2009 at 4:35
Attachments:
> My testing process was as follows
If it's not too much to ask, could you create a test case? Just a simple
standalone
Java class, the assertions don't have to be very complicated... Something like
http://h2database.googlecode.com/svn/trunk/h2/src/test/org/h2/test/db/TestFullTe
xt.java
Original comment by thomas.t...@gmail.com
on 20 Dec 2009 at 12:35
I am working on a Lucene test; in the process I will remove the lucene flags in
TestFullText so that it doesn't use Lucene but rather tests the H2 fulltext
feature.
Original comment by mbis...@gmail.com
on 23 Dec 2009 at 7:41
I have a question about transactions and indexing. When does the trigger fire?
Should
the FullText class be aware that it is in a transaction? If so, does it need to
roll
back changes if the transaction fails?
Original comment by mbis...@gmail.com
on 23 Dec 2009 at 7:42
Hi,
Sorry for the delay.
> I will remove the lucene flags in TestFullText
That's OK, however a new test case may be simpler.
> When does the trigger fire?
When the row is inserted/updated/deleted (before the commit). A transaction
rollback
will also rollback the operations that were done within the trigger. This is not
documented yet, I will do that.
> Should the FullText class be aware that it is in a transaction?
> does it need to roll back changes if the transaction fails
Usually no and no. Currently triggers don't provide a way to detect rollback.
Actually this is a problem for the Lucene fulltext search... I will try to find
a
solution. It is not a problem for the 'native' fulltext search however.
Original comment by thomas.t...@gmail.com
on 30 Dec 2009 at 11:46
The rollback problem with the Lucene fulltext search will be fixed in the next
release. The trigger will also be called on rollback (this is a new feature in
the
CREATE TRIGGER syntax).
Original comment by thomas.t...@gmail.com
on 30 Dec 2009 at 1:11
I have attached a new patch that contains an updated FullTextTest. The big
changes in
the test are to add a search while inserting in the multithreaded test.
Interestingly, the Lucene FT passes this updated test, while the native FT does
not.
One of the threads gets blocked in the query and never releases, while the other
thread completes the test. The test hangs indefinitely.
Original comment by mbis...@gmail.com
on 7 Jan 2010 at 4:25
Attachments:
Thanks! I will have a look at the patch next week.
Original comment by thomas.t...@gmail.com
on 8 Jan 2010 at 9:23
I'm sorry for the delay. I could not apply the patch, could you create a new
one that
is based on the trunk?
It would be nice to have a solution that works for both the 'old' and the 'new'
Lucene. I'm not sure if that's possible. An alternative is to wait a bit longer,
until Lucene 2.x is deprecated.
Original comment by thomas.t...@gmail.com
on 21 Mar 2010 at 11:40
I'm changing the summary from 'FullTextLucene should reuse IndexReader as much
as
possible' to 'FullTextLucene: Use Lucene 3.x'. Please tell me if that's not
accurate.
Original comment by thomas.t...@gmail.com
on 21 Mar 2010 at 11:42
Attached is a new patch aligned with tonight's trunk. I will delete the old one.
Personally I would like to see H2 1.2 released as soon as it's ready, so
introducing a change like this will pull
time away from other issues that need to be addressed before release. If
someone really wants these changes
they can patch it themselves and give it a go.
The only thing I would like to bring up, though, is that the TestFullText.java
changes I made have revealed a bug
in the native FullText. Please see Comment #10 above.
Original comment by mbis...@gmail.com
on 22 Mar 2010 at 5:46
Attachments:
Thanks! The patch looks very good. Currently I'm not sure if it's time yet
to switch to Lucene 3. What do you think?
Original comment by thomas.t...@gmail.com
on 26 Mar 2010 at 4:06
I agree that 3.0 has some issues, probably more than comfortable at the moment.
Here's the list of issues slated for 3.1:
http://issues.apache.org/jira/browse/LUCENE/fixforversion/12314025
Several leaks are in there, which is always a bad thing for a long-running
system.
Original comment by mbis...@gmail.com
on 26 Mar 2010 at 11:14
Updated patch to use latest lucene 3.0.2.
Original comment by mbis...@gmail.com
on 15 Jul 2010 at 1:07
Attachments:
Thanks for the patch! Please tell me if you think I should commit it now.
Otherwise I will wait a bit longer (probably for the next release).
Original comment by thomas.t...@gmail.com
on 26 Jul 2010 at 6:18
There are two issues to consider:
1. Lucene 2.9 supports JDK 1.4+, while Lucene 3 requires 1.5+. If H2 supports
1.4 (I think it does) then moving to Lucene 3 would require omitting FTL
support in 1.4 builds.
2. Lucene 2.9 indexes are not compatible with 3.0 indexes, so they would have
to be rebuilt. One can connect to 2.9 indexes, but that would be only
advisable to support backwards compatibility. It would be transitional, driven
by a flag indicated what version of lucene created the index.
So, if you want to add it in now, you will need to strip it out of 1.4 builds
and provide a "version" test or flag to indicate what version of index to use
for Lucene.
If you want to wait for the next release, you may have the exact same problem
set. I don't think I can decide for you. In my personal case, I would rather
see it applied now as H2 1.2 is the current development track. I would like to
get it in now to flush out issues with the community at large, as well as
provide the improved scalability that the patch offers.
Maybe put it to a vote? It would help get people using the google/p/ pages more
for bugs and comments...
Original comment by mbis...@gmail.com
on 27 Jul 2010 at 9:14
My vote is for backward compatibility with Lucene 2.9!
Original comment by victor.p...@gmail.com
on 29 Jul 2010 at 7:44
It sounds like the best we could do is support both. What about
class FullTextLucene
class FullTextLucene3 extends FullTextLucene
Original comment by thomas.t...@gmail.com
on 31 Jul 2010 at 9:16
It looks good.
Original comment by victor.p...@gmail.com
on 2 Aug 2010 at 4:53
The only use case I can think of to require 2.9 support is if an application is
reading the lucene index files on disk directly. I have an application that
does this myself, but I don't think it makes sense for H2 to support it with
different versions of FTL long-term.
I'd wait for H2 1.3 to get started and put the Lucene3 patch in then so it will
have time to ride the development effort as an early-adopter upgrade. It will
give people who use the lucene index files time to update their apps.
Original comment by mbis...@gmail.com
on 4 Aug 2010 at 4:09
Hi,
I'm currently trying to integrate the patch. What I found so far is:
In the test case, you use System.out.println("sleeping"); Thread.sleep(10000);
- is it really required to sleep for 10 seconds? I would like to avoid that
(the automated tests already take a long time...). What about only run for 10
seconds in the 'big' mode (which is only ran once).
I will try to support both Lucene 2.x and 3.x is possible; if this results in
build problems I will try to find another solution (maybe a new build target
that switches the code to Lucene 3.0; not sure yet).
Original comment by thomas.t...@gmail.com
on 17 Aug 2010 at 7:21
Hi,
I think it will be hard to support both at the same time: IndexModifier and
Hits seems to be no longer supported. I will probably have to use a compile
time switch (similar to supporting multiple Java versions). Within build.xml I
will currently only support Lucene 2.2. Switching the source (and compiling
for) Lucene 3.x will only work using build.sh / build.bat (unless you have a
simple way to support it within build.xml).
Original comment by thomas.t...@gmail.com
on 17 Aug 2010 at 7:31
I guess I would come up with a good reason to support both at this time. As I
said in comment 23, the only reason for explicitly knowing which version of
lucene was being used was to access the index files directly, which isn't
something that should be encouraged. I strongly suggest moving to Lucene 3 and
dropping Lucene 2 rather than trying to support both.
Original comment by mbis...@gmail.com
on 17 Aug 2010 at 9:50
I have now applied the patch (thanks a lot!), changed a few thing, and
committed it the trunk.
From the change log:
Lucene 3.x support was added in the source code, however it is not yet enabled
by default and is not yet supported when using the default h2 jar file. To
enable Lucene 3.x support, the source code of H2 needs to be switched using
./build.sh -Dlucene=3 switchSource, and then re-compile. To switch the source
code back use ./build.sh -Dlucene=2 switchSource (replace ./build.sh with
build.bat on Windows). The plan is to use Lucene 3 by default in H2 version
1.3.x.
Original comment by thomas.t...@gmail.com
on 18 Aug 2010 at 5:29
Support for Lucene 2.x will still be available in the in H2 1.3.x source code,
just that it will not be enabled by default (you need to switch the source
code).
> If H2 supports 1.4 (I think it does)
Not directly, but using Retroweaver is supported.
> the Lucene FT passes this updated test, while the native FT does not.
I'm sorry I didn't see this before. The problem is fixed now.
Original comment by thomas.t...@gmail.com
on 18 Aug 2010 at 5:34
Implemented in version 1.2.141 (but not enabled by default)
Original comment by thomas.t...@gmail.com
on 23 Aug 2010 at 12:46
Enabled in version 1.3.x
Original comment by thomas.t...@gmail.com
on 21 Nov 2010 at 5:52
Original comment by thomas.t...@gmail.com
on 12 Dec 2010 at 12:02
Original issue reported on code.google.com by
mbis...@gmail.com
on 8 Dec 2009 at 2:57