apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0

regenerate kuromoji dict in regenerate [LUCENE-9866] #10905

Closed (asfimport closed this issue 3 years ago)

asfimport commented 3 years ago

I saw this as a TODO in the build for kuromoji: I think we should enable it? Similar dictionary construction is already enabled for nori, and we run the risk of breaking the kuromoji build if we don't regenerate its dictionary regularly.

AFAIK it is not particularly slow, especially compared to regenerating large DFAs :)


Migrated from LUCENE-9866 by Robert Muir (@rmuir), resolved Mar 25 2021

asfimport commented 3 years ago

Dawid Weiss (@dweiss) (migrated from JIRA)

Both compileMecab and compileNaist generate to the same output folder. We can run one of them as part of regenerate but not both?

asfimport commented 3 years ago

Robert Muir (@rmuir) (migrated from JIRA)

@dweiss we package mecab by default with lucene in the jar's resources, so that is the one we should regenerate.

The task for naist is just a way to generate an alternative dictionary as a substitute.
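
A rough sketch of the constraint discussed above, not taken from the Lucene build: when two tasks write to the same output folder, only one of them can be part of regenerate, and the thread settles on the dictionary that ships by default. The task names `compileMecab` and `compileNaist` come from the comments; the directory path and both helper functions are hypothetical and only serve the illustration.

```python
from collections import defaultdict

# Hypothetical task -> output directory mapping. The task names come from the
# thread; the directory path is a placeholder, not the real Lucene build layout.
TASK_OUTPUTS = {
    "compileMecab": "build/kuromoji-dict",
    "compileNaist": "build/kuromoji-dict",
}


def find_output_conflicts(task_outputs):
    """Return {output_dir: [tasks]} for every directory written by more than one task."""
    by_dir = defaultdict(list)
    for task, out_dir in task_outputs.items():
        by_dir[out_dir].append(task)
    return {d: sorted(tasks) for d, tasks in by_dir.items() if len(tasks) > 1}


def pick_regenerate_tasks(task_outputs, default_for_conflicts):
    """Choose at most one task per output directory.

    `default_for_conflicts` names the task to keep when several tasks share
    a directory (here: the dictionary that is packaged by default).
    """
    conflicts = find_output_conflicts(task_outputs)
    selected = []
    for task, out_dir in task_outputs.items():
        if out_dir in conflicts and task != default_for_conflicts:
            continue  # skip the alternative that would overwrite the default
        selected.append(task)
    return selected


conflicts = find_output_conflicts(TASK_OUTPUTS)
regenerate_tasks = pick_regenerate_tasks(TASK_OUTPUTS, "compileMecab")
print(conflicts)
print(regenerate_tasks)
```

With the placeholder mapping, the conflict check reports the shared folder and only `compileMecab` is selected, which matches the choice made in the comments: regenerate the packaged dictionary and leave the naist task as an opt-in alternative.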

asfimport commented 3 years ago

ASF subversion and git services (migrated from JIRA)

Commit a38713907d7593a709baf93fe26fcef3372f1a4f in lucene's branch refs/heads/main from Dawid Weiss https://gitbox.apache.org/repos/asf?p=lucene.git;h=a387139

LUCENE-9866: regenerate kuromoji dict in regenerate

asfimport commented 3 years ago

Dawid Weiss (@dweiss) (migrated from JIRA)

Ok, committed this in.

asfimport commented 2 years ago

Adrien Grand (@jpountz) (migrated from JIRA)

Closing after the 9.0.0 release