aparo / opensearch-analysis-ik

The IK Analysis plugin integrates the Lucene IK analyzer into OpenSearch and supports customized dictionaries. Port of https://github.com/medcl/elasticsearch-analysis-ik
Apache License 2.0

plugin ZIP incomplete (=> doesn't work) #6

Open rursprung opened 2 years ago

rursprung commented 2 years ago

the released plugin ZIPs do not contain all the required files (i checked: this has been the case ever since the first release, 1.1.0). namely, the whole config/ folder is missing, and plugin-security.policy would probably also be needed.

compare the original elasticsearch plugin (which works):

$ zipinfo -1 elasticsearch-analysis-ik-7.10.2.zip
config/
config/main.dic
config/quantifier.dic
config/extra_single_word_full.dic
config/IKAnalyzer.cfg.xml
config/surname.dic
config/suffix.dic
config/stopword.dic
config/extra_main.dic
config/extra_stopword.dic
config/preposition.dic
config/extra_single_word_low_freq.dic
config/extra_single_word.dic
plugin-descriptor.properties
plugin-security.policy
elasticsearch-analysis-ik-7.10.2.jar
httpclient-4.5.2.jar
httpcore-4.4.4.jar
commons-logging-1.2.jar
commons-codec-1.9.jar

with this one (which doesn't):

$ zipinfo -1 opensearch-analisys-ik-2.0.0.zip
plugin-descriptor.properties
opensearch-analisys-ik-2.0.0.jar
httpclient-4.5.13.jar
httpcore-4.4.12.jar
NOTICE.txt
LICENSE.txt
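
the comparison above can also be scripted. below is a minimal sketch, assuming both ZIPs are available locally; the function names and the default prefixes (config/ and plugin-security.policy) are illustrative choices and not part of either plugin. it only reports entries that exist in the reference archive but not in the candidate, so differing JAR versions are ignored.

```python
import zipfile


def zip_entries(path):
    """Return the set of entry names in a ZIP archive (like `zipinfo -1`)."""
    with zipfile.ZipFile(path) as archive:
        return set(archive.namelist())


def missing_entries(reference_zip, candidate_zip,
                    prefixes=("config/", "plugin-security.policy")):
    """List reference entries under the given prefixes that the candidate lacks.

    The default prefixes are an assumption based on the two listings above.
    """
    reference = zip_entries(reference_zip)
    candidate = zip_entries(candidate_zip)
    return sorted(
        name for name in reference - candidate
        if name.startswith(tuple(prefixes))
    )
```

calling missing_entries("elasticsearch-analysis-ik-7.10.2.zip", "opensearch-analisys-ik-2.0.0.zip") should then list the config/ entries and plugin-security.policy.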

this leads to the following errors on startup (which prevent the node from starting correctly):

{"type": "server", "timestamp": "2022-06-17T13:54:22,277Z", "level": "INFO", "component": "o.w.a.d.Dictionary", "cluster.name": "test-opensearch", "node.name": "test-opensearch-0", "message": "try load config from /opt/opensearch/config/analysis-ik/IKAnalyzer.cfg.xml", "cluster.uuid": "H7xcTvEARCy0GNHIryC15w", "node.id": "fuVqL5NwQ0SOyIpT0ju3TQ"  }
{"type": "server", "timestamp": "2022-06-17T13:54:22,284Z", "level": "ERROR", "component": "o.w.a.d.Dictionary", "cluster.name": "test-opensearch", "node.name": "test-opensearch-0", "message": "ik-analyzer: Main Dict not found", "cluster.uuid": "H7xcTvEARCy0GNHIryC15w", "node.id": "fuVqL5NwQ0SOyIpT0ju3TQ" , 
"stacktrace": ["java.io.FileNotFoundException: /opt/opensearch/config/analysis-ik/main.dic (No such file or directory)",
"at java.io.FileInputStream.open0(Native Method) ~[?:?]",
"at java.io.FileInputStream.open(FileInputStream.java:219) ~[?:?]",
"at java.io.FileInputStream.<init>(FileInputStream.java:157) ~[?:?]",
"at org.wltea.analyzer.dic.Dictionary.loadDictFile(Dictionary.java:196) [opensearch-analisys-ik-2.0.0.jar:2.0.0]",
"at org.wltea.analyzer.dic.Dictionary.loadMainDict(Dictionary.java:381) [opensearch-analisys-ik-2.0.0.jar:2.0.0]",
"at org.wltea.analyzer.dic.Dictionary.initial(Dictionary.java:149) [opensearch-analisys-ik-2.0.0.jar:2.0.0]",
"at org.wltea.analyzer.cfg.Configuration.<init>(Configuration.java:37) [opensearch-analisys-ik-2.0.0.jar:2.0.0]",
"at org.opensearch.index.analysis.IkTokenizerFactory.<init>(IkTokenizerFactory.java:15) [opensearch-analisys-ik-2.0.0.jar:2.0.0]",
"at org.opensearch.index.analysis.IkTokenizerFactory.getIkSmartTokenizerFactory(IkTokenizerFactory.java:23) [opensearch-analisys-ik-2.0.0.jar:2.0.0]",
"at org.opensearch.index.analysis.AnalysisRegistry.buildMapping(AnalysisRegistry.java:542) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.index.analysis.AnalysisRegistry.buildTokenizerFactories(AnalysisRegistry.java:350) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.index.analysis.AnalysisRegistry.build(AnalysisRegistry.java:241) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.index.IndexModule.newIndexService(IndexModule.java:499) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.indices.IndicesService.createIndexService(IndicesService.java:729) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.indices.IndicesService.withTempIndexService(IndicesService.java:671) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.metadata.MetadataCreateIndexService.applyCreateIndexWithTemporaryService(MetadataCreateIndexService.java:460) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.metadata.MetadataCreateIndexService.applyCreateIndexRequestWithV1Templates(MetadataCreateIndexService.java:565) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.metadata.MetadataCreateIndexService.applyCreateIndexRequest(MetadataCreateIndexService.java:422) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.metadata.MetadataCreateIndexService.applyCreateIndexRequest(MetadataCreateIndexService.java:429) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.metadata.MetadataCreateIndexService$1.execute(MetadataCreateIndexService.java:335) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:65) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.service.MasterService.executeTasks(MasterService.java:824) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:395) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.service.MasterService.runTasks(MasterService.java:266) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.service.MasterService$Batcher.run(MasterService.java:190) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:176) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:214) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:739) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedOpenSearchThreadPoolExecutor.java:282) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedOpenSearchThreadPoolExecutor.java:245) [opensearch-2.0.0.jar:2.0.0]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]",
"at java.lang.Thread.run(Thread.java:829) [?:?]"] }
{"type": "server", "timestamp": "2022-06-17T13:54:22,298Z", "level": "ERROR", "component": "o.w.a.d.Dictionary", "cluster.name": "test-opensearch", "node.name": "test-opensearch-0", "message": "ik-analyzer: Surname not found", "cluster.uuid": "H7xcTvEARCy0GNHIryC15w", "node.id": "fuVqL5NwQ0SOyIpT0ju3TQ" , 
"stacktrace": ["java.io.FileNotFoundException: /opt/opensearch/config/analysis-ik/surname.dic (No such file or directory)",
"at java.io.FileInputStream.open0(Native Method) ~[?:?]",
"at java.io.FileInputStream.open(FileInputStream.java:219) ~[?:?]",
"at java.io.FileInputStream.<init>(FileInputStream.java:157) ~[?:?]",
"at org.wltea.analyzer.dic.Dictionary.loadDictFile(Dictionary.java:196) [opensearch-analisys-ik-2.0.0.jar:2.0.0]",
"at org.wltea.analyzer.dic.Dictionary.loadSurnameDict(Dictionary.java:541) [opensearch-analisys-ik-2.0.0.jar:2.0.0]",
"at org.wltea.analyzer.dic.Dictionary.initial(Dictionary.java:150) [opensearch-analisys-ik-2.0.0.jar:2.0.0]",
"at org.wltea.analyzer.cfg.Configuration.<init>(Configuration.java:37) [opensearch-analisys-ik-2.0.0.jar:2.0.0]",
"at org.opensearch.index.analysis.IkTokenizerFactory.<init>(IkTokenizerFactory.java:15) [opensearch-analisys-ik-2.0.0.jar:2.0.0]",
"at org.opensearch.index.analysis.IkTokenizerFactory.getIkSmartTokenizerFactory(IkTokenizerFactory.java:23) [opensearch-analisys-ik-2.0.0.jar:2.0.0]",
"at org.opensearch.index.analysis.AnalysisRegistry.buildMapping(AnalysisRegistry.java:542) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.index.analysis.AnalysisRegistry.buildTokenizerFactories(AnalysisRegistry.java:350) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.index.analysis.AnalysisRegistry.build(AnalysisRegistry.java:241) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.index.IndexModule.newIndexService(IndexModule.java:499) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.indices.IndicesService.createIndexService(IndicesService.java:729) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.indices.IndicesService.withTempIndexService(IndicesService.java:671) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.metadata.MetadataCreateIndexService.applyCreateIndexWithTemporaryService(MetadataCreateIndexService.java:460) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.metadata.MetadataCreateIndexService.applyCreateIndexRequestWithV1Templates(MetadataCreateIndexService.java:565) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.metadata.MetadataCreateIndexService.applyCreateIndexRequest(MetadataCreateIndexService.java:422) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.metadata.MetadataCreateIndexService.applyCreateIndexRequest(MetadataCreateIndexService.java:429) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.metadata.MetadataCreateIndexService$1.execute(MetadataCreateIndexService.java:335) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:65) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.service.MasterService.executeTasks(MasterService.java:824) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:395) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.service.MasterService.runTasks(MasterService.java:266) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.service.MasterService$Batcher.run(MasterService.java:190) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:176) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:214) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:739) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedOpenSearchThreadPoolExecutor.java:282) [opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedOpenSearchThreadPoolExecutor.java:245) [opensearch-2.0.0.jar:2.0.0]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]",
"at java.lang.Thread.run(Thread.java:829) [?:?]"] }
{"type": "server", "timestamp": "2022-06-17T13:54:22,302Z", "level": "ERROR", "component": "o.o.s.c.ConfigurationRepository", "cluster.name": "test-opensearch", "node.name": "test-opensearch-0", "message": "Cannot apply default config (this is maybe not an error!)", "cluster.uuid": "H7xcTvEARCy0GNHIryC15w", "node.id": "fuVqL5NwQ0SOyIpT0ju3TQ" , 
"stacktrace": ["java.lang.RuntimeException: ik-analyzer: Surname not found!!!",
"at org.wltea.analyzer.dic.Dictionary.loadDictFile(Dictionary.java:211) ~[?:?]",
"at org.wltea.analyzer.dic.Dictionary.loadSurnameDict(Dictionary.java:541) ~[?:?]",
"at org.wltea.analyzer.dic.Dictionary.initial(Dictionary.java:150) ~[?:?]",
"at org.wltea.analyzer.cfg.Configuration.<init>(Configuration.java:37) ~[?:?]",
"at org.opensearch.index.analysis.IkTokenizerFactory.<init>(IkTokenizerFactory.java:15) ~[?:?]",
"at org.opensearch.index.analysis.IkTokenizerFactory.getIkSmartTokenizerFactory(IkTokenizerFactory.java:23) ~[?:?]",
"at org.opensearch.index.analysis.AnalysisRegistry.buildMapping(AnalysisRegistry.java:542) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.index.analysis.AnalysisRegistry.buildTokenizerFactories(AnalysisRegistry.java:350) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.index.analysis.AnalysisRegistry.build(AnalysisRegistry.java:241) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.index.IndexModule.newIndexService(IndexModule.java:499) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.indices.IndicesService.createIndexService(IndicesService.java:729) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.indices.IndicesService.withTempIndexService(IndicesService.java:671) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.metadata.MetadataCreateIndexService.applyCreateIndexWithTemporaryService(MetadataCreateIndexService.java:460) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.metadata.MetadataCreateIndexService.applyCreateIndexRequestWithV1Templates(MetadataCreateIndexService.java:565) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.metadata.MetadataCreateIndexService.applyCreateIndexRequest(MetadataCreateIndexService.java:422) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.metadata.MetadataCreateIndexService.applyCreateIndexRequest(MetadataCreateIndexService.java:429) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.metadata.MetadataCreateIndexService$1.execute(MetadataCreateIndexService.java:335) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:65) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.service.MasterService.executeTasks(MasterService.java:824) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:395) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.service.MasterService.runTasks(MasterService.java:266) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.service.MasterService$Batcher.run(MasterService.java:190) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:176) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:214) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:739) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedOpenSearchThreadPoolExecutor.java:282) ~[opensearch-2.0.0.jar:2.0.0]",
"at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedOpenSearchThreadPoolExecutor.java:245) ~[opensearch-2.0.0.jar:2.0.0]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]",
"at java.lang.Thread.run(Thread.java:829) [?:?]",
"Caused by: java.io.FileNotFoundException: /opt/opensearch/config/analysis-ik/surname.dic (No such file or directory)",
"at java.io.FileInputStream.open0(Native Method) ~[?:?]",
"at java.io.FileInputStream.open(FileInputStream.java:219) ~[?:?]",
"at java.io.FileInputStream.<init>(FileInputStream.java:157) ~[?:?]",
"at org.wltea.analyzer.dic.Dictionary.loadDictFile(Dictionary.java:196) ~[?:?]",
"... 29 more"] }
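
to see up front which files the plugin will fail to find, a small pre-flight check can help. this is only a sketch: the expected file names are copied from the config/ listing of the working elasticsearch ZIP, and it is an assumption that the OpenSearch port looks for the same set (the log above only confirms IKAnalyzer.cfg.xml, main.dic and surname.dic). the function name is hypothetical.

```python
from pathlib import Path

# Assumption: names taken from the config/ listing of the Elasticsearch
# plugin ZIP; the OpenSearch port may not require every one of them.
EXPECTED_FILES = [
    "IKAnalyzer.cfg.xml",
    "main.dic",
    "quantifier.dic",
    "surname.dic",
    "suffix.dic",
    "stopword.dic",
    "preposition.dic",
    "extra_main.dic",
    "extra_stopword.dic",
    "extra_single_word.dic",
    "extra_single_word_full.dic",
    "extra_single_word_low_freq.dic",
]


def missing_config_files(config_dir, expected=EXPECTED_FILES):
    """Return the expected file names that do not exist in config_dir."""
    base = Path(config_dir)
    return [name for name in expected if not (base / name).is_file()]
```

running missing_config_files("/opt/opensearch/config/analysis-ik") on the node from the log would return the names that still need to be put in place; an empty list means every expected file is present.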

IMHO it'd make sense to re-write the gradle build script based on the template, as noted in the other issues. then this can probably be solved in a more straight-forward fashion.

rursprung commented 2 years ago

as a workaround, it's possible to manually copy the full content of the config/ folder from this repository into config/analysis-ik (the directory the plugin tries to load from, per the log above). with that, the cluster starts up. plugin-security.policy is still missing, however; according to the comment, this should only be a problem when using the hot-reload mechanism (i.e. the rest should work).
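
the copy step of this workaround could be automated along these lines. it is a minimal sketch: the function name is made up, the paths are supplied by the caller, and the default target sub-directory analysis-ik is taken from the path in the startup log (/opt/opensearch/config/analysis-ik). it does not add plugin-security.policy.

```python
import shutil
from pathlib import Path


def install_ik_config(repo_config_dir, opensearch_config_dir, subdir="analysis-ik"):
    """Copy the repository's config/ folder into <opensearch config>/<subdir>.

    Returns the target directory as a Path.
    """
    source = Path(repo_config_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"no such directory: {source}")
    target = Path(opensearch_config_dir) / subdir
    # dirs_exist_ok (Python 3.8+) allows re-running over an existing target.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

for the node in the log, that would be something like install_ik_config("opensearch-analysis-ik/config", "/opt/opensearch/config"), where the first path is wherever this repository was checked out.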

rursprung commented 1 year ago

small update (re-tested with 2.2.0):

chunlinyao commented 1 year ago

+1 for this issue