apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0

Create example code for core [LUCENE-3550] #4624

Open asfimport opened 13 years ago

asfimport commented 13 years ago

Trunk has undergone many API changes, some of which are not trivial, and the migration path from 3.x to 4.0 seems hard. I'd like to propose a way to tackle this by means of live example code.

The facet module implements this approach. There is live Java code under src/examples that demonstrates some well-documented scenarios. The code itself is documented, in addition to javadoc, and it is unit tested regularly.

We found it very difficult to keep documentation up-to-date – javadocs always lag behind, Wiki pages get old, etc. However, when you have live Java code, you're forced to keep it up-to-date. It doesn't compile if you break the API, and it fails to run if you change internal implementation behavior. If you keep it simple enough, its documentation stays simple too.

And if we are successful at maintaining it (which we must be, otherwise the build should fail), then people should have an easy experience migrating between releases. So say you take the simple scenario "I'd like to index documents which have the fields ID, date and body". Then you create an example class/method that accomplishes that. And between releases, this code gets updated, and people can follow the changes required to implement that scenario.
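A minimal sketch of what such a "unit example" could look like for the ID/date/body scenario. To stay dependency-free it does not use Lucene at all: `TinyIndex`, `Doc`, `searchBody` and `runExample` are hypothetical stand-ins invented for this sketch, and a real example would call the actual indexing and search classes of the release it ships with. The point is only the shape: one small, readable method per scenario, returning something a unit test can assert on.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/**
 * Hypothetical "unit example" for the scenario: index documents that have
 * the fields ID, date and body, then find them by a word in the body.
 * TinyIndex is a stand-in written for this sketch; it is NOT Lucene API.
 */
public class IndexIdDateBodyExample {

    /** One document with the three fields named in the scenario. */
    static final class Doc {
        final String id;
        final LocalDate date;
        final String body;

        Doc(String id, LocalDate date, String body) {
            this.id = id;
            this.date = date;
            this.body = body;
        }
    }

    /** Stand-in for the library being demonstrated (assumption, not a real API). */
    static final class TinyIndex {
        private final List<Doc> docs = new ArrayList<>();

        void add(Doc doc) {
            docs.add(doc);
        }

        /** Returns IDs of documents whose body contains the word, in insertion order. */
        List<String> searchBody(String word) {
            String needle = word.toLowerCase(Locale.ROOT);
            List<String> ids = new ArrayList<>();
            for (Doc d : docs) {
                // Lowercase and split on non-word characters, then match whole tokens.
                for (String token : d.body.toLowerCase(Locale.ROOT).split("\\W+")) {
                    if (token.equals(needle)) {
                        ids.add(d.id);
                        break;
                    }
                }
            }
            return ids;
        }
    }

    /** The example itself: the code a user would read and adapt. */
    public static List<String> runExample() {
        TinyIndex index = new TinyIndex();
        index.add(new Doc("1", LocalDate.of(2011, 11, 1), "Trunk API changes"));
        index.add(new Doc("2", LocalDate.of(2011, 11, 2), "Facet module examples"));
        index.add(new Doc("3", LocalDate.of(2011, 11, 3), "Examples for core"));
        return index.searchBody("examples");
    }

    public static void main(String[] args) {
        System.out.println(runExample()); // prints [2, 3]
    }
}
```

A unit test would then assert that `runExample()` returns the IDs `2` and `3`. If an API change breaks the example, it stops compiling; if a behavior change alters the result, the test fails.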

I'm not saying the example code should always stay optimized. We can aim at that, but I won't fool myself into thinking that we'll succeed. But at least we can get it compiled and regularly unit tested.

I think that it would be good if we introduce the concept of examples such that if a module (core, contrib, modules) has a src/examples, we package it in a .jar and include it with the binary distribution. That's a first step. We can also have meta examples, under their own module/contrib, that show how to combine several modules together (this might even uncover API problems), but that's definitely a second phase.

At first, let's do the "unit examples" (à la unit tests) and start with core. Whatever we succeed at writing for 4.0 will only help users. So let's use this issue to:

  1. List example scenarios that we want to demonstrate for core
  2. Build the infrastructure in our build system to package and distribute a module's examples.

Please feel free to list here example scenarios that come to mind. We can then track what's been done and what's not. The more we do the better.


Migrated from LUCENE-3550 by Shai Erera (@shaie), updated Oct 06 2021 Attachments: LUCENE-3550.patch, LUCENE-3550-sort.patch

asfimport commented 13 years ago

Steven Rowe (@sarowe) (migrated from JIRA)

+1

lucene/contrib/demo/ is an existing lucene-core example, and should be folded into this effort.

About release jar naming: we could call them lucene-<module>-example, e.g. lucene-core-example-X.Y.jar, lucene-facet-example-X.Y.jar, etc.

asfimport commented 11 years ago

Manpreet (migrated from JIRA)

Hi -

I would like to start my work on this issue. Request for your guidance.

Cheers -Mandy (Linked in - http://www.linkedin.com/pub/manpreet-singh/16/67a/165)

asfimport commented 11 years ago

Shai Erera (@shaie) (migrated from JIRA)

Hi Mandy. The basic idea behind this issue was to create some example code which demonstrates different scenarios of indexing with Lucene. With Lucene 4.0 came many changes to the API and such example code was badly missing (luckily, there was a good migration document).

The facets module has such example code which:

At the time I thought that it would be good to follow that practice for Lucene core, ensuring that when APIs change or features are removed, we update the corresponding example code, and also get the chance to evaluate the change against real code.

Lucene has a 'demo' module, so we should put the example code under it. Let's start by defining some use cases that we'd like to demo, e.g.:

Let's start with these, and then we can build more.

asfimport commented 11 years ago

Manpreet (migrated from JIRA)

Thanks Shai. I have started work on the above examples.

I can see that with the latest changes even the facets examples have moved under the 'demo' module.

Cheers -Mandy

asfimport commented 11 years ago

Manpreet (migrated from JIRA)

patch for 8550 [includes only SimpleExample testcase]

asfimport commented 11 years ago

Manpreet (migrated from JIRA)

Hi Shai - I have created the first patch, which includes the SimpleExample testcase. Requesting your review.

Thanks -MS

asfimport commented 11 years ago

Shai Erera (@shaie) (migrated from JIRA)

Ok I will review. But can you please rename the patch to LUCENE-3550 (and not 8550)?

asfimport commented 11 years ago

Manpreet (migrated from JIRA)

Renamed to 3550.

asfimport commented 11 years ago

Manpreet (migrated from JIRA)

Hi Shai - Did you get a chance to review?

asfimport commented 11 years ago

Shai Erera (@shaie) (migrated from JIRA)

Hi Mandy. I realize you followed the facets example "exactly" :). I recently simplified them a lot, and that's what I think you should do with the simple example.

asfimport commented 11 years ago

Manpreet (migrated from JIRA)

Hi Shai - Thanks. That's true :)

Thanks -MS

asfimport commented 11 years ago

Shai Erera (@shaie) (migrated from JIRA)

Ok great. Also, if you can, please create the patch on 'trunk' and not 4x.

asfimport commented 11 years ago

Manpreet (migrated from JIRA)

Surely I will do that. Thanks.

asfimport commented 11 years ago

Manpreet (migrated from JIRA)

Patch for Example Code

asfimport commented 11 years ago

Manpreet (migrated from JIRA)

Hi Shai - created the patch for 3550. Kindly review.

Thanks -MS

asfimport commented 11 years ago

Manpreet (migrated from JIRA)

Patch for Lucene-3550

asfimport commented 11 years ago

Shai Erera (@shaie) (migrated from JIRA)

A few comments:

As a start, it looks great. I think though that it would be better if our simple example contained:

  * Documents with more than one field, to show different Field types (TextField, StringField, DocValuesField)
  * Instead of a single search(), have different searchXYZ methods, e.g.
    * searchKeyword (using default field), searchFields (execute fielded search)
    * searchBooleanQuery, searchRangeQuery to show QueryParser's syntax
    * searchSort to sort results

I consider these simple/basic examples, since that's really the essence of Lucene – indexing documents with a few fields and querying them in different ways.

asfimport commented 11 years ago

Manpreet (migrated from JIRA)

Perfect & Noted.

I shall follow the review comments & make the changes accordingly. Thanks again for your help & review.

regards -ms

asfimport commented 11 years ago

Aleksandra Wozniak (migrated from JIRA)

Hi,

I recently started learning the Lucene API and along the way created a few snippets showing different Lucene features. I found this issue by coincidence and decided to rework one of them to fit into the examples implementation – I'm sending a patch with my sort example and a corresponding unit test.

Manpreet, I see that you started working on this issue a while ago – I don't want to interfere with your work. You can incorporate my example in your code or use it in any other way, if you find it useful.

Cheers, Aleksandra

asfimport commented 11 years ago

Manpreet (migrated from JIRA)

Hi Aleksandra -

I have been away from it for a while.

I am resuming my work this week. Sure, I will do that.

Thanks -Manpreet