sing1ee / analyzer-solr

Analyzer adapter for Solr 5. We support Jieba, with Stanford planned for the future.
MIT License
61 stars, 27 forks

I need to configure this too, but I'm new to Solr #4

Closed: hhy5861 closed this issue 8 years ago

hhy5861 commented 8 years ago

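I don't know how to configure the Jieba tokenizer in Solr. Is there any documentation? Thanks!

For example, where should the tokenizer library be placed? Thanks!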

sing1ee commented 8 years ago

Edit the configuration file:

/xxx/solr-5.0.0/server/solr/your_solr_name/conf/schema.xml

Add the Jieba tokenizer configuration below. After that, you can use text_jieba as a field type.

<fieldType name="text_jieba" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="analyzer.solr5.jieba.JiebaTokenizerFactory" segMode="SEARCH"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="analyzer.solr5.jieba.JiebaTokenizerFactory" segMode="SEARCH"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
</fieldType>
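
To illustrate how the text_jieba type defined above might be used, here is a minimal sketch of a field declaration in the same schema.xml. The field name `content` and the `indexed`/`stored` settings are assumptions chosen for illustration; they are not from the original reply.

```xml
<!-- Hypothetical field: "content" is an illustrative name, not from the original reply. -->
<!-- type="text_jieba" refers to the fieldType defined above. -->
<field name="content" type="text_jieba" indexed="true" stored="true"/>
```

Text indexed into such a field would pass through the index-time analyzer chain, and queries against it would pass through the query-time chain.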

You also need to place the Jieba tokenizer jar and the analyzer-solr jar in the following directory:

/xxx/solr-5.0.0/server/solr-webapp/webapp/WEB-INF/lib/jieba-analysis-1.0.0.jar
/xxx/solr-5.0.0/server/solr-webapp/webapp/WEB-INF/lib/analyzer-solr-1.0.jar

That's all.
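
As a possible alternative to copying jars into WEB-INF/lib, Solr's solrconfig.xml accepts `<lib>` directives that add jars to a core's classpath. The sketch below is not part of the original reply and has not been verified with this analyzer; the `extra-lib` directory is a hypothetical location holding both jars.

```xml
<!-- In solrconfig.xml for the core. -->
<!-- "extra-lib" is a hypothetical directory containing the jieba-analysis -->
<!-- and analyzer-solr jars; adjust it to wherever the jars actually live. -->
<lib dir="/xxx/solr-5.0.0/extra-lib" regex=".*\.jar" />
```

Keeping plugin jars outside the webapp directory may make Solr upgrades simpler, since the webapp directory is replaced with each release.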

hhy5861 commented 8 years ago

Thank you very much!

hhy5861 commented 8 years ago

I can't find jieba-analysis-1.0.0.jar. Where can I get it?

sing1ee commented 8 years ago

@hhy5861 https://github.com/huaban/jieba-analysis

hhy5861 commented 8 years ago

OK. Thanks again for taking the time to answer my question.

hhy5861 commented 8 years ago

I configured this on Solr 5.3.1, and importing data fails with an error.

951040 INFO (Thread-19) [ x:Restaurant_Index] o.a.s.u.p.LogUpdateProcessor [Restaurant_Index] webapp=/solr path=/dataimport params={debug=false&optimize=false&indent=true&commit=true&clean=true&wt=json&command=full-import&entity=RestaurantIndex&verbose=false} status=0 QTime=3 {deleteByQuery=:_ (-1517990524086124544)} 0 267
Exception in thread "Thread-19" java.lang.NoSuchFieldError: word
    at analyzer.solr5.jieba.JiebaTokenizer.incrementToken(JiebaTokenizer.java:38)
    at org.apache.lucene.analysis.util.FilteringTokenFilter.incrementToken(FilteringTokenFilter.java:51)
    at org.apache.lucene.analysis.core.LowerCaseFilter.incrementToken(LowerCaseFilter.java:45)
    at org.apache.lucene.analysis.snowball.SnowballFilter.incrementToken(SnowballFilter.java:90)
    at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:613)
    at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:344)
    at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:300)
    at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:234)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:450)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1475)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:239)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:163)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
    at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:104)
    at org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:71)
    at org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:259)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:524)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)

hhy5861 commented 8 years ago

The build fails with this error:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-gpg-plugin:1.4:sign (sign-artifacts) on project jieba-analysis: Exit code: 2 -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

sing1ee commented 8 years ago

@hhy5861 Please ask in the jieba-analysis project, or try this prebuilt artifact instead: http://mvnrepository.com/artifact/com.huaban/jieba-analysis/1.0.2
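
For a Maven-based project, the artifact at that URL could be pulled in with a dependency entry like the sketch below, which avoids building jieba-analysis locally. The coordinates are read directly from the URL. The thread does not confirm whether version 1.0.2 resolves the NoSuchFieldError reported earlier.

```xml
<!-- Coordinates taken from the mvnrepository URL above. -->
<dependency>
  <groupId>com.huaban</groupId>
  <artifactId>jieba-analysis</artifactId>
  <version>1.0.2</version>
</dependency>
```

The jar can also be downloaded from that page and copied into the WEB-INF/lib directory described earlier in the thread.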