Simple patch: I didn't move any code around, just removed the external dependency.
[Legacy Jira: Robert Muir (@rmuir) on Jun 18 2019]
+1. If people have more precise normalization requirements, they can encode them in their dictionary. I think we can presume this is not noisy user data and that it should already have been cleaned.
[Legacy Jira: Michael Sokolov (@msokolov) on Jun 18 2019]
If there are no objections I will wait until LUCENE-8863 is merged. The patch here poached some build changes from Mike S's PR for LUCENE-8863 because I needed to run test-tools.
[Legacy Jira: Robert Muir (@rmuir) on Jun 20 2019]
Commit 91331d1a891d76173f6854287f11821e6ab41fae in lucene-solr's branch refs/heads/master from Robert Muir https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=91331d1
LUCENE-8866: remove kuromoji/tools dependency on ICU
[Legacy Jira: ASF subversion and git services on Jun 21 2019]
Commit 2adc8c6c13d1a74c3a371c2341a05507e893dabf in lucene-solr's branch refs/heads/branch_8x from Robert Muir https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2adc8c6
LUCENE-8866: remove kuromoji/tools dependency on ICU
[Legacy Jira: ASF subversion and git services on Jun 21 2019]
Closing after the 8.2.0 release
[Legacy Jira: Ignacio Vera (@iverase) on Jul 26 2019]
The tooling has an off-by-default option to normalize entries, currently using the ICU API.
But since it's off by default and only does NFKC normalization at dictionary build time, I think it's a better tradeoff to use the JDK here.
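For illustration, here is a minimal sketch of NFKC normalization using only the JDK's `java.text.Normalizer`. The class name `NfkcSketch`, the method `normalizeEntry`, and the `normalize` flag are hypothetical names for this example and are not taken from the patch; the actual tool code may be structured differently.

```java
import java.text.Normalizer;

// Hypothetical sketch: normalizes one dictionary entry with the JDK instead of ICU.
class NfkcSketch {

    // Returns the entry unchanged unless normalization is requested (off by default).
    static String normalizeEntry(String entry, boolean normalize) {
        if (!normalize) {
            return entry;
        }
        // Skip the allocation when the entry is already in NFKC form.
        if (Normalizer.isNormalized(entry, Normalizer.Form.NFKC)) {
            return entry;
        }
        return Normalizer.normalize(entry, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // Full-width Latin letters fold to ASCII under NFKC.
        System.out.println(normalizeEntry("\uFF21\uFF22\uFF23", true));
        // Half-width katakana KA plus half-width voiced mark composes to full-width GA.
        System.out.println(normalizeEntry("\uFF76\uFF9E", true));
    }
}
```

One caveat with this approach: the JDK and ICU may track different Unicode versions, so results could differ for recently added characters.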
I would rather remove the ICU dependency for the tooling and look at simplifying the build to have fewer modules (e.g. investigate moving the tooling and tests into src/java and src/tools, so that Michael Sokolov's new tests in LUCENE-8863 run by default, the dictionary tool is shipped as a command-line tool in the JAR, etc.).
"ant regenerate" should be enough to prevent any chicken-and-egg problems in the dictionary construction code, so I don't think we need separate modules to enforce it.
Legacy Jira details
LUCENE-8866 by Robert Muir (@rmuir) on Jun 18 2019, resolved Jun 21 2019
Attachments: LUCENE-8866.patch