mikemccand / stargazers-migration-test

Testing Lucene's Jira -> GitHub issues migration

Snowball stemmer/analyzer for the Estonian language [LUCENE-8891] #888

Closed mikemccand closed 5 years ago

mikemccand commented 5 years ago

Currently there is no Estonian-specific stemmer for SnowballFilter.

I would like to add a Snowball stemmer for the Estonian language, and also a new language analyzer for Estonian based on that stemmer.

A fork of the master branch with the analyzer implemented: https://github.com/gpaimla/lucene-solr


Legacy Jira details

LUCENE-8891 by Gert Morten Paimla on Jun 28 2019, resolved Jun 30 2019
Attachments: LUCENE-8891.patch (versions: 2)

mikemccand commented 5 years ago

Lucene usually uses ISO 639 language codes, so I would use "et" instead of "ee" in the package name for Estonian. https://www.loc.gov/standards/iso639-2/php/langcodes_name.php?iso_639_1=et

[Legacy Jira: Tomoko Uchida (@mocobeta) on Jun 28 2019]
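The distinction raised above can be checked with the JDK's own ISO tables. The following is a minimal sketch, not part of the patch: the class name `IsoCodeCheck` and its helper methods are hypothetical. It shows that "et" is the ISO 639 language code for Estonian (three-letter form "est"), that "ee" is also a valid ISO 639 language code but belongs to a different language (Ewe, "ewe"), and that "EE" is the ISO 3166 country code for Estonia, which is the likely source of the mix-up.

```java
import java.util.Arrays;
import java.util.Locale;

public class IsoCodeCheck {

    /** Returns true if the JDK lists the code as a two-letter ISO 639 language code. */
    static boolean isIsoLanguage(String code) {
        return Arrays.asList(Locale.getISOLanguages()).contains(code);
    }

    /** Returns the three-letter ISO 639-2 code for a two-letter language code. */
    static String iso3Language(String code) {
        return Locale.forLanguageTag(code).getISO3Language();
    }

    /** Returns the three-letter ISO 3166 code for a two-letter country code. */
    static String iso3Country(String code) {
        return new Locale.Builder().setRegion(code).build().getISO3Country();
    }

    public static void main(String[] args) {
        // "et" is the language code for Estonian.
        System.out.println("et language: " + iso3Language("et")); // est
        // "ee" is a valid language code too, but for Ewe, not Estonian.
        System.out.println("ee language: " + iso3Language("ee")); // ewe
        // "EE" is the country code for Estonia.
        System.out.println("EE country:  " + iso3Country("EE"));  // EST
    }
}
```

Because "ee" is a valid language code in its own right, a simple "is this a known code" check would not have caught the problem; the package name has to be compared against the intended language.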

mikemccand commented 5 years ago

Updated the package names to use ISO 639 language codes.

[Legacy Jira: Gert Morten Paimla on Jun 28 2019]

mikemccand commented 5 years ago

Thanks @gpaimla, the patch looks fine to me.

I noticed a few small things:

[Legacy Jira: Tomoko Uchida (@mocobeta) on Jun 28 2019]

mikemccand commented 5 years ago

I removed the TestEstonianStemming class and merged it into the Analyzer test class. It was just a leftover that I forgot even existed. Hopefully the precommit task doesn't fail now either.

I can't run it myself, though, because it fails with an error:

[source-patterns] Unescaped symbol "->" on line #46: solr/solr-ref-guide/src/analytics.adoc
[source-patterns] Unescaped symbol "->" on line #55: solr/solr-ref-guide/src/analytics.adoc

BUILD FAILED

[Legacy Jira: Gert Morten Paimla on Jun 28 2019]
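For readers unfamiliar with the message above: it comes from a source-pattern check that scans AsciiDoc files for arrow symbols that are not backslash-escaped. The sketch below only illustrates the general idea and is not the actual precommit implementation. The class name `UnescapedArrowCheck`, the single "->" pattern, and the simplified skipping of `----` listing blocks and backtick spans are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnescapedArrowCheck {

    // "->" that is not immediately preceded by a backslash.
    private static final Pattern UNESCAPED_ARROW = Pattern.compile("(?<!\\\\)(->)");

    /** Returns one message per offending line, using 1-based line numbers. */
    static List<String> findViolations(List<String> lines) {
        List<String> violations = new ArrayList<>();
        boolean inListingBlock = false;
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            // Assumption: a line starting with "----" opens or closes a listing block.
            if (line.startsWith("----")) {
                inListingBlock = !inListingBlock;
                continue;
            }
            if (inListingBlock) {
                continue; // code listings may contain arrows freely
            }
            // Assumption: text inside backticks is inline code and is ignored.
            String prose = line.replaceAll("`[^`]*`", "");
            Matcher m = UNESCAPED_ARROW.matcher(prose);
            if (m.find()) {
                violations.add("Unescaped symbol \"" + m.group(1) + "\" on line #" + (i + 1));
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList(
            "Maps a -> b in prose",        // flagged
            "Escaped a \\-> b is fine",    // not flagged
            "----",
            "x -> y inside a listing",     // not flagged
            "----",
            "Inline `p -> q` is ignored"); // not flagged
        findViolations(doc).forEach(System.out::println);
    }
}
```

In a check of this kind, the usual fix on the documentation side is to escape the symbol in prose or move it into an inline code span or listing block.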

mikemccand commented 5 years ago

OK, the patch passed precommit.

I'd like to wait a while before committing to the ASF repo, so that others can review it. If there are no objections, I will commit it to master and branch_8x on the weekend.

@gpaimla: in the meantime, you can add a change log entry to lucene/CHANGES.txt. It should go in the "New Features" section under "Lucene 8.2.0". The credit would be "(your_name via Tomoko Uchida)".

[Legacy Jira: Tomoko Uchida (@mocobeta) on Jun 28 2019]

mikemccand commented 5 years ago

Hi @gpaimla,

I will commit the patch to the ASF repo as-is in 24 hours. Please add the change log to CHANGES.txt by then, if you'd like to write it on your own (or else I will add a short log message for this).

[Legacy Jira: Tomoko Uchida (@mocobeta) on Jun 29 2019]

mikemccand commented 5 years ago

+1 overall

| Vote | Subsystem | Runtime | Comment |
|------|-----------|---------|---------|
| | Prechecks | | |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| | master Compile Tests | | |
| +1 | compile | 1m 6s | master passed |
| | Patch Compile Tests | | |
| +1 | compile | 0m 29s | the patch passed |
| +1 | javac | 0m 29s | the patch passed |
| +1 | Release audit (RAT) | 0m 29s | the patch passed |
| +1 | Check forbidden APIs | 0m 29s | the patch passed |
| +1 | Validate source patterns | 0m 29s | the patch passed |
| | Other Tests | | |
| +1 | unit | 10m 3s | common in the patch passed. |
| | | 16m 34s | |

| Subsystem | Report/Notes |
|-----------|--------------|
| JIRA Issue | LUCENE-8891 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12973186/LUCENE-8891.patch |
| Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns |
| uname | Linux lucene2-us-west.apache.org 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
| git revision | master / 8b72e91 |
| ant | version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018 |
| Default Java | LTS |
| Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/194/testReport/ |
| modules | C: lucene/analysis/common U: lucene/analysis/common |
| Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/194/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |

This message was automatically generated.

[Legacy Jira: Lucene/Solr QA on Jun 29 2019]

mikemccand commented 5 years ago

I added a new patch with the CHANGES.txt edit.

[Legacy Jira: Gert Morten Paimla on Jun 29 2019]

mikemccand commented 5 years ago

Commit 42a1eb04038a15556755a384d23dddd35b9f7843 in lucene-solr's branch refs/heads/master from Gert Morten Paimla https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=42a1eb0

LUCENE-8891: Add snowball stemmer and analyzer for Estonian language.

Signed-off-by: Tomoko Uchida <tomoko@apache.org>

[Legacy Jira: ASF subversion and git services on Jun 30 2019]

mikemccand commented 5 years ago

Commit 2df6ea2305f5df77671e07bb3ad9b999818f9910 in lucene-solr's branch refs/heads/branch_8x from Gert Morten Paimla https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2df6ea2

LUCENE-8891: Add snowball stemmer and analyzer for Estonian language.

Signed-off-by: Tomoko Uchida <tomoko@apache.org>

[Legacy Jira: ASF subversion and git services on Jun 30 2019]

mikemccand commented 5 years ago

This will be shipped with Lucene 8.2.

Thank you, @gpaimla!

[Legacy Jira: Tomoko Uchida (@mocobeta) on Jun 30 2019]

mikemccand commented 5 years ago

Closing after the 8.2.0 release

[Legacy Jira: Ignacio Vera (@iverase) on Jul 26 2019]