zhangmt / jcseg

Automatically exported from code.google.com/p/jcseg

I'm using jcseg with Lucene. Can the JcsegAnalyzer4X class be used even though it doesn't implement the tokenStream method? (Resolved) #20

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Analyzer analyzer = new JcsegAnalyzer4X(JcsegTaskConfig.SIMPLE_MODE);

TokenStream stream = analyzer.tokenStream("", new StringReader(str));

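I followed the PDF documentation and then passed the analyzer to Lucene, but tokenizing a sentence fails with an error about calling an abstract method.

I'm new to this, so the question may be a silly one. I'd appreciate an answer.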

Original issue reported on code.google.com by eason.li...@gmail.com on 26 Mar 2014 at 8:40

GoogleCodeExporter commented 9 years ago
Just use: Analyzer analyzer = new JcsegAnalyzer4X(JcsegTaskConfig.COMPLEX_MODE);
then pass the analyzer to Lucene. You don't need: TokenStream stream = analyzer.tokenStream("", new StringReader(str));

That is all it takes. Pay attention to the Lucene version: with 4.0 and above you can use it directly like this. Lucene's API changes quickly.

For more details, see: https://www.google.com.hk/search?q=jcseg+lucene&oq=jcseg+lucene&aqs=chrome..69i57j69i60l2.1897j0j7&sourceid=chrome&espv=210&es_sm=93&ie=UTF-8

Best
--lionsoul

Original comment by chenxin6...@gmail.com on 26 Mar 2014 at 1:26

GoogleCodeExporter commented 9 years ago
I still don't understand..
        Analyzer analyzer = new JcsegAnalyzer4X(JcsegTaskConfig.COMPLEX_MODE);
        JcsegAnalyzer4X jcseg = (JcsegAnalyzer4X) analyzer;
        JcsegTaskConfig jcsegTaskConfig = jcseg.getTaskConfig();
        jcsegTaskConfig.setAppendCJKPinyin(true);
        jcsegTaskConfig.setAppendCJKSyn(true);
        try {
            TokenStream tokenStream = analyzer.tokenStream(null,
                    new StringReader("中华人民共和国成立了 welcome to china"));
            org.apache.lucene.analysis.tokenattributes.CharTermAttribute charTermAttribute = tokenStream
                    .addAttribute(CharTermAttribute.class);
            tokenStream.reset();
            while (tokenStream.incrementToken()) {
                System.out.println(charTermAttribute.toString());
            }
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

It reports: Exception in thread "main" java.lang.AbstractMethodError: org.apache.lucene.analysis.Analyzer.tokenStream(Ljava/lang/String;Ljava/io/Reader;)Lorg/apache/lucene/analysis/TokenStream;

Original comment by eason.li...@gmail.com on 26 Mar 2014 at 2:49

GoogleCodeExporter commented 9 years ago
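Hmm. Judging from the error, tokenStream is an abstract method of the parent class, and JcsegAnalyzer4X does not override it.
As for why it works without implementing that method, I'm not sure either. When extending the class, nothing indicated that this method had to be implemented.

{You could find the source code of Analyzer and study it. I'm not very familiar with Lucene myself.}

If you just want to try out Jcseg, look at the demo that ships with it.

If you want to use it with Lucene, google "jcseg lucene".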

Best
--lionsoul

Original comment by chenxin6...@gmail.com on 27 Mar 2014 at 2:57

GoogleCodeExporter commented 9 years ago
Don't test the Lucene integration by calling it like this:

TokenStream tokenStream = analyzer.tokenStream(null,
                    new StringReader("中华人民共和国成立了 welcome to china"));
            org.apache.lucene.analysis.tokenattributes.CharTermAttribute charTermAttribute = tokenStream
                    .addAttribute(CharTermAttribute.class);
            tokenStream.reset();
            while (tokenStream.incrementToken()) {
                System.out.println(charTermAttribute.toString());
            }

Just use the jcseg demo directly.

Original comment by chenxin6...@gmail.com on 27 Mar 2014 at 2:58
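
On the question left open in the thread (why JcsegAnalyzer4X can work without overriding tokenStream): a likely explanation, assuming the Lucene 4.x design of Analyzer, is that tokenStream there is a final method that delegates to an abstract createComponents, so a concrete analyzer only implements the latter. The sketch below imitates that pattern with stand-in classes. SketchAnalyzer and WhitespaceSketchAnalyzer are hypothetical names, not Lucene or Jcseg classes, and the token list stands in for a real TokenStream.

```java
import java.util.ArrayList;
import java.util.List;

public class TemplateMethodSketch {

    // Hypothetical, heavily simplified stand-in for Lucene 4.x's Analyzer.
    static abstract class SketchAnalyzer {
        // Final: subclasses cannot override it, and do not need to.
        public final List<String> tokenStream(String fieldName, String text) {
            return createComponents(fieldName, text);
        }

        // The only method a concrete analyzer has to implement.
        protected abstract List<String> createComponents(String fieldName, String text);
    }

    // Hypothetical stand-in for JcsegAnalyzer4X: implements createComponents only.
    static class WhitespaceSketchAnalyzer extends SketchAnalyzer {
        @Override
        protected List<String> createComponents(String fieldName, String text) {
            List<String> tokens = new ArrayList<>();
            for (String part : text.trim().split("\\s+")) {
                if (!part.isEmpty()) {
                    tokens.add(part);
                }
            }
            return tokens;
        }
    }

    public static void main(String[] args) {
        SketchAnalyzer analyzer = new WhitespaceSketchAnalyzer();
        // tokenStream is inherited, not overridden, yet the call works.
        System.out.println(analyzer.tokenStream("", "welcome to china"));
    }
}
```

If that assumption holds, the AbstractMethodError reported above would be consistent with a pre-4.0 Lucene jar on the runtime classpath, where tokenStream was still abstract. This is an inference from the error message, not something confirmed in the thread.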