Segmentation without a dictionary:
媒体 计算 研究所 成立 了 , 高级 数据 挖掘 ( data mining ) 很 难 。
媒体 计算 研究所 成立 了 , 高级 数据 挖掘 ( data mining ) 很 难 。
Setting a temporary dictionary:
java.lang.ArrayIndexOutOfBoundsException: 1
at edu.fudan.nlp.cn.tag.format.FormatCWS.toString(FormatCWS.java:82)
at edu.fudan.nlp.cn.tag.CWSTagger.tag(CWSTagger.java:146)
at edu.fudan.example.nlp.ChineseWordSegmentation.main(ChineseWordSegmentation.java:39)
Segmentation with a dictionary:
媒体计算研究所 成立 了 , 高级 数据挖掘 很 难
Segmentation with a non-strict dictionary:
媒体计算研究所 成立 了 , 高级 数据挖掘 很 难
我 送给 力学系 的 同学 一 个 玩具 ( 送给 给力 力学 力学系 都 在 词典 中 )
Processing a file:
java.lang.ArrayIndexOutOfBoundsException: 1
at edu.fudan.nlp.cn.tag.format.FormatCWS.toString(FormatCWS.java:82)
at edu.fudan.nlp.cn.tag.CWSTagger.tag(CWSTagger.java:146)
at edu.fudan.nlp.cn.tag.CWSTagger.tag(CWSTagger.java:1)
at edu.fudan.nlp.cn.tag.AbstractTagger.tagFile(AbstractTagger.java:124)
at edu.fudan.nlp.cn.tag.AbstractTagger.tagFile(AbstractTagger.java:109)
at edu.fudan.example.nlp.ChineseWordSegmentation.main(ChineseWordSegmentation.java:61)
[The identical exception and stack trace are printed 21 more times, 22 in total.]
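However, with English preprocessing enabled, the errors above no longer occur,
that is, after commenting out the statement on line 32: tag.setEnFilter(false);
What is the cause of this?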
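A note for reading the trace: the trailing `1` in `ArrayIndexOutOfBoundsException: 1` is the index that was requested, so the access at FormatCWS.java:82 apparently hit an array with fewer than two elements. The source of `FormatCWS` is not included in this report, so the snippet below is only a minimal, self-contained sketch of that general failure mode and not the library's code. `IndexOneDemo`, `format`, and `tryFormat` are hypothetical names introduced for illustration.

```java
public class IndexOneDemo {
    // Hypothetical formatter (NOT FudanNLP code): joins each row as "token/label",
    // assuming every row has exactly two columns.
    static String format(String[][] rows) {
        StringBuilder sb = new StringBuilder();
        for (String[] row : rows) {
            // row[1] throws ArrayIndexOutOfBoundsException when the row has one column
            sb.append(row[0]).append('/').append(row[1]).append(' ');
        }
        return sb.toString().trim();
    }

    // Wraps format() so the failure can be reported instead of aborting the run.
    static String tryFormat(String[][] rows) {
        try {
            return format(rows);
        } catch (ArrayIndexOutOfBoundsException e) {
            return "failed: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        String[][] ok = { {"a", "X"}, {"b", "Y"} };   // two columns per row
        String[][] bad = { {"data"} };                 // one column: index 1 is missing
        System.out.println(tryFormat(ok));
        System.out.println(tryFormat(bad));
    }
}
```

Whether the English filter changes the shape of the data reaching `FormatCWS.toString` in a comparable way is an open question that only the library source can settle.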
Original issue reported on code.google.com by leco...@gmail.com on 3 May 2013 at 3:18