CrfppTrainHanLPLoad reports the error "[0/n 构造单词时失败" (failed to construct a word from "[0/n") at runtime

Describe the bug When CrfppTrainHanLPLoad runs on text that contains a token such as "[0/n", it throws a NullPointerException.

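While debugging, I found that the NullPointerException goes away after changing line 73 of HanLP-1.8.3/src/main/java/com/hankcs/hanlp/model/perceptron/utility/IOUtility.java as follows (the original line is commented out):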

// if (sentence.wordList.size() == 0) continue;
if (sentence==null || sentence.wordList.size() == 0) continue;

However, a different error is then reported: "[0/n 构造单词时失败" (failed to construct a word from "[0/n"). Debugging led me to /root/repo/1.8.3/HanLP-1.8.3/src/main/java/com/hankcs/hanlp/corpus/document/sentence/word/WordFactory.java:

public static IWord create(String param)
    {
        if (param == null) return null;
        if (param.startsWith("[") && !param.startsWith("[/"))
        {
            return CompoundWord.create(param);
        }
        else
        {
            return Word.create(param);
        }
    }

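This means that a token starting with "[" (but not with "[/") is passed to CompoundWord.create. Debugging further, I found the following code in the create method of the CompoundWord class: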

public static CompoundWord create(String param)
    {
        if (param == null) return null;
        int cutIndex = param.lastIndexOf(']');
        if (cutIndex <= 2 || cutIndex == param.length() - 1) return null;

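As I understand it, if the segmented text contains a token such as "[0/n", the result is always null: the token has no "]", so lastIndexOf(']') returns -1 and the check cutIndex <= 2 is true. Why is it handled this way? I assume there is a reason for it, but in practice it causes this failure. An explanation would be much appreciated!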
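
The two checks quoted above can be traced in isolation. The following is a minimal sketch that copies only the conditions shown in the snippets; the class and method names (CompoundGuardSketch, takesCompoundBranch, guardReturnsNull) are hypothetical, and it does not call HanLP itself.

```java
class CompoundGuardSketch {
    // Mirrors the branch condition quoted from WordFactory.create (hypothetical helper).
    static boolean takesCompoundBranch(String param) {
        return param.startsWith("[") && !param.startsWith("[/");
    }

    // Mirrors the early-return guard quoted from CompoundWord.create (hypothetical helper).
    static boolean guardReturnsNull(String param) {
        int cutIndex = param.lastIndexOf(']'); // -1 when the token has no ']'
        return cutIndex <= 2 || cutIndex == param.length() - 1;
    }

    public static void main(String[] args) {
        // "[0/n" is the failing token; the other two are shown for comparison.
        for (String token : new String[] {"[0/n", "[/w", "[a/n b/n]/nt"}) {
            System.out.println(token
                    + " compoundBranch=" + takesCompoundBranch(token)
                    + " guardReturnsNull=" + guardReturnsNull(token));
        }
    }
}
```

For "[0/n" both methods return true, which matches the observed null. A bare "[/w" never reaches the compound branch, and a bracketed token with a closing "]" followed by a tag passes the guard.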

Code to reproduce the issue Provide a reproducible test case that is the bare minimum necessary to generate the problem.

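import java.util.LinkedList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import com.hankcs.hanlp.corpus.document.sentence.Sentence;
import com.hankcs.hanlp.corpus.document.sentence.word.IWord;
import com.hankcs.hanlp.corpus.document.sentence.word.WordFactory;

public void testCwsTest() {

        String str = "操作/vn 时间/n ：/w 20210115/m 15/m :/w 19/m :/w 操作人/n ：/w 刘X/nr (/w 1390000005/m )/w 重新处理/n " +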
                "说明/v ：/w 角色/n 内/f 产品/n 总/b 数量/n 为/p [0/n ]/w ,/w 不/d 满足/v 角色/n [/w 201908011652/m :/w :/w" +
                " IPTV/nz 设备/n 产品/n 分组/n ]/w 设定/v 的/u 最/d 小值/n 为/v [/w 1/m ]/w 的/u 限制/vn ，/w 请/v 对/p 错误/n" +
                " 界面/n 进行/v 截图/vn 并/c 发送给/n 系统管理员/nnt ;/w";
        Sentence sentence;
        if (str == null) {
            System.out.println("nok");
        } else {
            str = str.trim();
            if (str.isEmpty()) {
                System.out.println("nok");
            } else {
                Pattern pattern = Pattern.compile("(\\[(([^\\s\\]]+/[0-9a-zA-Z]+)\\s+)+?([^\\s\\]]+/[0-9a-zA-Z]+)]/?[0-9a-zA-Z]+)|([^\\s]+/[0-9a-zA-Z]+)");
                Matcher matcher = pattern.matcher(str);
                List<IWord> wordList = new LinkedList<>();

                while (matcher.find()) {
                    String single = matcher.group();
                    IWord word = WordFactory.create(single);
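                    if (word == null) {
                        // Message: "failed to construct a word from <single>; the sentence argument was <str>"
                        System.out.println("在用 " + single + " 构造单词时失败，句子构造参数为 " + str);
                        continue; // do not add a null word to the list
                    }

                    wordList.add(word);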
                }

            }

        }
    }
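
To see why "[0/n" reaches WordFactory.create as a single token, the regular expression from the test case can be run on its own. This is a minimal sketch using only java.util.regex; the class name TokenizeSketch, the method tokenize, and the short ASCII sample lines are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenizeSketch {
    // Same pattern as in the test case: a bracketed compound word, or a plain word/tag token.
    static final Pattern TOKEN = Pattern.compile(
            "(\\[(([^\\s\\]]+/[0-9a-zA-Z]+)\\s+)+?([^\\s\\]]+/[0-9a-zA-Z]+)]/?[0-9a-zA-Z]+)|([^\\s]+/[0-9a-zA-Z]+)");

    // Collects every match of the pattern, in order (hypothetical helper).
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(line);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // A well-formed compound word is matched as one token by the first alternative.
        System.out.println(tokenize("[a/n b/n]/nt c/v"));
        // Here "]" is a separate "]/w" token, so the first alternative fails
        // and "[0/n" is matched by the plain word/tag alternative.
        System.out.println(tokenize("x/p [0/n ]/w ,/w"));
    }
}
```

In the second sample the opening bracket is glued to the word "0" while the closing bracket is tagged separately as "]/w", so the token "[0/n" starts with "[" but contains no "]".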

Describe the current behavior The problem reproduces 100% of the time.

Expected behavior Execution succeeds.

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): CentOS 7
Python version:
jdk version: jdk1.8
HanLP version: 1.7.5 and 1.8.3

Other info / logs

[x] I've completed this form and searched the web for solutions.

hankcs / HanLP

CrfppTrainHanLPLoad reports the error "[0/n 构造单词时失败" (failed to construct a word from "[0/n") at runtime #1750