Open asfimport opened 13 years ago
Steven Rowe (@sarowe) (migrated from JIRA)
+1
lucene/contrib/demo/ is an existing lucene-core example, and should be folded into this effort.
About release jar naming: we could call them lucene-<module>-example, e.g. lucene-core-example-X.Y.jar, lucene-facet-example-X.Y.jar, etc.
Manpreet (migrated from JIRA)
Hi -
I would like to start my work on this issue. Request for your guidance.
Cheers -Mandy (Linked in - http://www.linkedin.com/pub/manpreet-singh/16/67a/165)
Shai Erera (@shaie) (migrated from JIRA)
Hi Mandy. The basic idea behind this issue was to create some example code which demonstrates different scenarios of indexing with Lucene. With Lucene 4.0 came many changes to the API and such example code was badly missing (luckily, there was a good migration document).
The facets module has such example code which:
At the time I thought that it would be good to follow that practice for Lucene core, ensuring that when APIs change or features are removed, we update the corresponding example code on the one hand, but also get the chance to evaluate the change against real code.
Lucene has a 'demo' module, so we should put the example code under it. Let's start by defining some use cases that we'd like to demo, e.g.:
Let's start with these, and then we can build more.
Manpreet (migrated from JIRA)
Thanks Shai. I have started work on the above examples.
I can see that with the latest changes even the facets examples have been moved under the 'demo' module.
Cheers -Mandy
Manpreet (migrated from JIRA)
patch for 8550 [includes only SimpleExample testcase]
Manpreet (migrated from JIRA)
Hi Shai - I have created the first patch which includes SimpleExample testcase. Request your review.
Thanks -MS
Shai Erera (@shaie) (migrated from JIRA)
Ok I will review. But can you please rename the patch to LUCENE-3550 (and not 8550)?
Manpreet (migrated from JIRA)
Renamed to 3550.
Manpreet (migrated from JIRA)
Hi Shai - Did you get a chance to review?
Shai Erera (@shaie) (migrated from JIRA)
Hi Mandy. I realize you followed the facets example "exactly" :). I recently simplified them a lot, and that's what I think you should do with the simple example.
Manpreet (migrated from JIRA)
Hi Shai - Thanks. That's true :)
Thanks -MS
Shai Erera (@shaie) (migrated from JIRA)
Ok great. Also, if you can, please create the patch on 'trunk' and not 4x.
Manpreet (migrated from JIRA)
Surely I will do that. Thanks.
Manpreet (migrated from JIRA)
Patch for Example Code
Manpreet (migrated from JIRA)
Patch for Lucene-3550
Shai Erera (@shaie) (migrated from JIRA)
Few comments:
Please remove @author tags. We don't use them, and the build fails if it finds any.
In general, I think that the code needs to be more documented, since this is example code. So for instance I would add:
If there's nothing special to say about an exception that is thrown, can you please remove @throws from javadocs?
addDocs:
Currently the code prints messages, which we try to avoid (e.g. during tests). So either we add to DemoConstants a VERBOSE property that is initialized to System.getProperty("tests.verbose"), or you just move all the prints to main()?
ScoreDoc[] which main() can use to print results as well as tests could use to assert on.
In order to better test the example, I would make it take a Directory (e.g. index(Directory), search(Directory) or SimpleCoreExample(Directory)) and pass from tests newDirectory() (note: there's no space intentionally).
Also, I think that the example should better clarify that we don't e.g. care about casing, so for instance if you index "Apache" search for "apache".
As a start, it looks great. I think though that it would be better if our simple example contained:
** Documents with more than one field, to show different Field types (TextField, StringField, DocValuesField)
** Instead of a single search(), have different searchXYZ methods, e.g.
*** searchKeyword (using default field), searchFields (execute fielded search)
*** searchBooleanQuery, searchRangeQuery to show QueryParser's syntax
*** searchSort to sort results
I consider these simple/basic examples, since that's really the essence of Lucene – index documents with few fields and querying for them in different ways.
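A minimal sketch of the printing suggestion from the review above, not taken from any patch attached to this issue. It assumes a hypothetical DemoConstants class holding a VERBOSE flag read from the "tests.verbose" system property, and a hypothetical QuietExample class whose search method returns its hits so that tests can assert on them while only main() prints. A plain case-insensitive substring match stands in for a real Lucene search here; no Lucene types are used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for an example class; it does not use Lucene. */
final class QuietExample {
  private QuietExample() {}

  /** Returns the matching entries instead of printing them, so tests can assert on the result. */
  static List<String> search(List<String> docs, String term) {
    String needle = term.toLowerCase(Locale.ROOT);
    List<String> hits = new ArrayList<>();
    for (String doc : docs) {
      // Naive case-insensitive substring match (assumption: placeholder for a real query).
      if (doc.toLowerCase(Locale.ROOT).contains(needle)) {
        hits.add(doc);
      }
    }
    // Diagnostic output is emitted only when verbose mode was requested.
    if (DemoConstants.VERBOSE) {
      System.out.println("search(" + term + ") matched " + hits.size() + " of " + docs.size());
    }
    return hits;
  }

  /** Only main() prints results unconditionally. */
  public static void main(String[] args) {
    List<String> docs = Arrays.asList("Apache Lucene", "search library");
    for (String hit : search(docs, "apache")) {
      System.out.println(hit);
    }
  }
}

/** Hypothetical holder for demo-wide settings; the name follows the review comment. */
final class DemoConstants {
  private DemoConstants() {}

  /** True only when the JVM was started with -Dtests.verbose=true. */
  static final boolean VERBOSE = Boolean.parseBoolean(System.getProperty("tests.verbose"));
}
```

With this shape a test can call QuietExample.search(...) directly and check the returned list, and the console stays quiet unless the property is set.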
Manpreet (migrated from JIRA)
Perfect & Noted.
I shall follow the review comments & make the changes accordingly. Thanks again for your help & review.
regards -ms
Aleksandra Wozniak (migrated from JIRA)
Hi,
recently I started learning the Lucene API and along the way I created a few snippets showing different Lucene features. I found this issue by coincidence and decided to rework one of them to fit into the examples implementation – I'm sending a patch with my sort example + a corresponding unit test.
Manpreet, I see that you started working on this issue a while ago – I don't want to interfere with your work. You can incorporate my example in your code or use it in any other way, if you find it useful.
Cheers, Aleksandra
Manpreet (migrated from JIRA)
Hi Aleksandra -
I have been away from it for a while.
Resuming my work from this week. Sure I will do that.
Thanks -Manpreet
Trunk has undergone lots of API changes, some of which are not trivial, and the migration path from 3.x to 4.0 seems hard. I'd like to propose some way to tackle this, by means of live example code.
The facet module implements this approach. There is live Java code under src/examples that demonstrates some well documented scenarios. The code itself is documented, in addition to javadoc. Also, the code itself is being unit tested regularly.
We found it very difficult to keep documentation up-to-date – javadocs always lag behind, Wiki pages get old etc. However, when you have live Java code, you're forced to keep it up-to-date. It doesn't compile if you break the API, it fails to run if you change internal impl behavior. If you keep it simple enough, its documentation stays simple too.
And if we are successful at maintaining it (which we must be, otherwise the build should fail), then people should have an easy experience migrating between releases. So say you take the simple scenario "I'd like to index documents which have the fields ID, date and body". Then you create an example class/method that accomplishes that. And between releases, this code gets updated, and people can follow the changes required to implement that scenario.
I'm not saying the examples code should always stay optimized. We can aim at that, but I don't try to fool myself thinking that we'll succeed. But at least we can get it compiled and regularly unit tested.
I think that it would be good if we introduce the concept of examples such that if a module (core, contrib, modules) has a src/examples, we package it in a .jar and include it with the binary distribution. That's for a first step. We can also have meta examples, under their own module/contrib, that show how to combine several modules together (this might even uncover API problems), but that's definitely a second phase.
At first, let's do the "unit examples" (à la unit tests) and start with core. Whatever we succeed at writing for 4.0 will only help users. So let's use this issue to:
Please feel free to list here example scenarios that come to mind. We can then track what's been done and what's not. The more we do the better.
Migrated from LUCENE-3550 by Shai Erera (@shaie), updated Oct 06 2021
Attachments: LUCENE-3550.patch, LUCENE-3550-sort.patch