Open v5tech opened 9 years ago
elasticsearch-analysis-ansj home page
https://github.com/4onni/elasticsearch-analysis-ansj
ansj tokenizer
https://github.com/NLPchina/ansj_seg
Building and installing the plugin
1. Clone the source
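The clone command itself is not shown in the issue. A minimal sketch, assuming the plugin is still hosted at the home page URL given above and that `git` is on the PATH (the variable and function names are only illustrative):

```shell
# Repository URL copied from the "home page" link in this issue.
REPO_URL="https://github.com/4onni/elasticsearch-analysis-ansj"

# clone_plugin: fetch the plugin source and enter the checkout directory.
# git names the checkout after the last path component of the URL.
clone_plugin() {
  git clone "$REPO_URL" && cd "$(basename "$REPO_URL")"
}

# Run it with: clone_plugin
```

The pom changes and the `mvn` build described in the next steps are then made inside the cloned directory.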
2. Edit the pom file configuration

    <elasticsearch.version>1.7.1</elasticsearch.version>

    <dependency>
        <groupId>org.ansj</groupId>
        <artifactId>ansj_seg</artifactId>
        <classifier>min</classifier>
        <version>2.0.8</version>
        <scope>compile</scope>
    </dependency>

3. Build the plugin

    mvn assembly:assembly

4. Install the plugin

    elasticsearch-1.7.1\bin>plugin -u file:///C:\Users\Administrator\Desktop\elasticsearch-analysis-ansj\target\releases\elasticsearch-analysis-ansj-1.x.1-release.zip -i ansj

5. Configure the ansj analyzer

    index:
      analysis:
        analyzer:
          index_ansj:
            type: ansj_index
          query_ansj:
            type: ansj_query
          ik:
            alias: [news_analyzer_ik,ik_analyzer]
            type: org.elasticsearch.index.analysis.IkAnalyzerProvider
          mmseg:
            alias: [news_analyzer, mmseg_analyzer]
            type: org.elasticsearch.index.analysis.MMsegAnalyzerProvider
    index.analysis.analyzer.default.type : "ansj_index"

For the full configuration, see elasticsearch.yml.example.
6. Testing and usage

    http://127.0.0.1:9200/articles/_analyze?analyzer=ansj_index&text=我们是中国人

Note: articles is the index name; everything else in the request URL and its parameters is written exactly as shown. analyzer=ansj_index selects the index-time analyzer, and text carries the content to index. Output:

    {
      "tokens": [
        { "token": "我们", "start_offset": 0, "end_offset": 2, "type": "word", "position": 1 },
        { "token": "是", "start_offset": 2, "end_offset": 3, "type": "word", "position": 2 },
        { "token": "中国", "start_offset": 3, "end_offset": 5, "type": "word", "position": 3 },
        { "token": "人", "start_offset": 5, "end_offset": 6, "type": "word", "position": 4 }
      ]
    }
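The same request can be sent from a shell instead of a browser. This is a sketch only, assuming `curl` is installed and Elasticsearch is listening on the address used in this issue; the `analyze` helper and the two variables are illustrative names, not part of the plugin:

```shell
ES_HOST="http://127.0.0.1:9200"   # address used in the example URL
INDEX="articles"                  # index name used in the example URL

# analyze ANALYZER TEXT
# Sends TEXT through ANALYZER via the _analyze endpoint.
# -G turns the --data-urlencode pairs into a URL query string, so the
# Chinese text is percent-encoded rather than sent as raw bytes.
analyze() {
  curl -sS -G "$ES_HOST/$INDEX/_analyze" \
    --data-urlencode "analyzer=$1" \
    --data-urlencode "text=$2"
}

# Usage: analyze ansj_index '我们是中国人'
```

Passing `ansj_query` as the first argument exercises the query-time analyzer in the same way.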
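    http://127.0.0.1:9200/articles/_analyze?analyzer=ansj_query&text=我们是中国人

Note: articles is the index name; everything else in the request URL and its parameters is written exactly as shown. analyzer=ansj_query selects the query-time analyzer, and text carries the content to query. Output: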
Hello, why does it look like index-time and query-time tokenization are no different? The results are exactly the same!
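One way to check this on other inputs is to run the same text through both analyzers and compare the token lists. The sketch below assumes `curl`, `grep` and `sed` are available and that the server and index from the examples above are used; all function names are hypothetical helpers. It only reports whether the two analyzers agree for a given text and says nothing about why.

```shell
ES_HOST="http://127.0.0.1:9200"   # address used in the example URLs
INDEX="articles"                  # index name used in the example URLs

# tokens_from_json: read an _analyze response on stdin and
# print the value of each "token" field, one per line.
tokens_from_json() {
  grep -o '"token" *: *"[^"]*"' | sed 's/^"token" *: *"//; s/"$//'
}

# tokens ANALYZER TEXT: token list produced by ANALYZER for TEXT.
tokens() {
  curl -sS -G "$ES_HOST/$INDEX/_analyze" \
    --data-urlencode "analyzer=$1" \
    --data-urlencode "text=$2" | tokens_from_json
}

# same_tokens TEXT: succeeds if ansj_index and ansj_query
# return identical token lists for TEXT.
same_tokens() {
  [ "$(tokens ansj_index "$1")" = "$(tokens ansj_query "$1")" ]
}

# Usage: same_tokens '我们是中国人' && echo same || echo different
```

A short sentence may tokenize identically under both analyzers, so trying longer text with compound words or names is a more informative test.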