NLPchina / ansj_seg

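ansj segmentation. A true Java implementation of ict. Both segmentation quality and speed exceed the open-source version of ict. Chinese word segmentation, person-name recognition, part-of-speech tagging, user-defined dictionaries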
Apache License 2.0
6.48k stars 2.32k forks

"惠城区鹅岭南路" segmentation problem #487

Closed tianxinghd closed 7 years ago

tianxinghd commented 7 years ago

Segmenting with the online endpoint http://www.nlpcn.org:9999/api/SegApi/nlpSeg?content=惠城区鹅岭南路 returns {"obj":[{"name":"惠城区鹅岭南路","natureStr":"ns","newWord":true,"offe":0,"realName":"惠城区鹅岭南路"}],"ok":true}

But when I segment the same text myself with version 5.1.1, the result is 惠城区/ns,鹅/n,岭南/s,路/n. Is there something I need to configure?
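To make the difference between the two results concrete, here is a minimal sketch that splits a printed result of the form word/nature,word/nature into pairs and counts the terms. The class name SegResultCompare and the helper parseTerms are hypothetical and not part of ansj_seg; the online result is written in the same word/nature form purely for comparison, using the name and natureStr fields from the JSON response quoted above.

```java
import java.util.ArrayList;
import java.util.List;

public class SegResultCompare {

    /**
     * Splits a printed result such as "惠城区/ns,鹅/n" into [word, nature] pairs.
     * Hypothetical helper for illustration only; it is not an ansj_seg API.
     */
    static List<String[]> parseTerms(String printed) {
        List<String[]> terms = new ArrayList<>();
        for (String token : printed.split(",")) {
            // The nature tag follows the last slash in each token
            int slash = token.lastIndexOf('/');
            if (slash < 0) {
                terms.add(new String[] {token, ""});
            } else {
                terms.add(new String[] {token.substring(0, slash), token.substring(slash + 1)});
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // Result reported for the local 5.1.1 run
        String local = "惠城区/ns,鹅/n,岭南/s,路/n";
        // Online result, rewritten in the same word/nature form (assumption)
        String online = "惠城区鹅岭南路/ns";

        System.out.println("local terms:  " + parseTerms(local).size());
        System.out.println("online terms: " + parseTerms(online).size());
    }
}
```

The local run yields four terms while the online endpoint returns the whole address as a single ns term flagged as a new word, which is the discrepancy being asked about.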

ansjsun commented 7 years ago

Use NlpAnalysis to do the segmentation.

tianxinghd commented 7 years ago

I saw that the URL says nlpSeg, so I am using NlpAnalysis as well.

Result parse = NlpAnalysis.parse("惠城区鹅岭南路");
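For anyone reproducing the online call from code, the content parameter contains non-ASCII characters and should be percent-encoded before it is sent. A minimal sketch using only the JDK follows; the class name NlpSegUrl and the method build are hypothetical, and the endpoint is simply the one quoted earlier in this thread, which may no longer be reachable.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class NlpSegUrl {

    // Endpoint as quoted in this thread; its availability is not guaranteed
    static final String ENDPOINT = "http://www.nlpcn.org:9999/api/SegApi/nlpSeg";

    /** Builds the request URL with the text percent-encoded as UTF-8. */
    static String build(String content) {
        try {
            return ENDPOINT + "?content=" + URLEncoder.encode(content, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("惠城区鹅岭南路"));
    }
}
```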

ansjsun commented 7 years ago

That's strange.. it should work. Paste your complete code and I'll take a look.

tianxinghd commented 7 years ago

Just a simple test of the segmentation feature:

import org.ansj.domain.Result;
import org.ansj.splitWord.analysis.NlpAnalysis;
import org.ansj.splitWord.analysis.ToAnalysis;

public class WordSegTest {

    public static void main(String[] args) {
        // Segment the address with the NLP analyzer and print the resulting terms
        Result parse = NlpAnalysis.parse("惠城区鹅岭南路");
        System.out.println(parse);
    }
}

tianxinghd commented 7 years ago

I downloaded version 5.1.2 and now it works.

ansjsun commented 7 years ago

@tianxinghd Where did you download 5.1.2 from?? I couldn't find it in the Maven repository.

tianxinghd commented 7 years ago

I downloaded it from git. The version in the pom is 5.1.2:

<modelVersion>4.0.0</modelVersion>
<groupId>org.ansj</groupId>
<artifactId>ansj_seg</artifactId>
<packaging>jar</packaging>
<name>ansj_seg</name>
<version>5.1.2</version>

ansjsun commented 7 years ago

Oh, I see. You compiled it yourself.