lvjiujin closed this issue 3 years ago
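Friend, what format was your dataset in when you ran this? With my earlier dataset I get an index out of range error at `seg = text[0][1]`, so my format seems to be slightly off. If you got it to run, I would appreciate an explanation, or a copy of your dataset.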
The seg info seems to be mapped to "BIES" tags (begin, inside, end, single), and from this code section, 4 probably means S, i.e. a single-character word. You can double-check against the full implementation.
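I don't quite follow how data_process.py handles word segmentation, i.e. the code above. On the last line, if `seg == '0'` it is replaced with 4, otherwise with 3. On any other line, if seg is 0 the code looks at the next seg: if that is also 0, the current one is replaced with 4, otherwise with 1, and so on. What is the logic behind this? Could you explain?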
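For what it's worth, the rules described in the question look like a conversion from per-character segmentation flags to B/I/E/S position tags. Below is a minimal sketch of that reading, not the actual code from data_process.py. It assumes that a seg of `'0'` marks the first character of a word and that the ids are 1 = B, 2 = I, 3 = E, 4 = S. Only 1, 3 and 4 are mentioned in the thread, so the id 2 for "inside" is a guess, and `seg_to_bies` is a hypothetical helper name.

```python
# Assumed tag ids: 1, 3 and 4 follow the rules described in the thread;
# 2 for "inside" is an assumption, not confirmed by the source.
B, I, E, S = 1, 2, 3, 4


def seg_to_bies(segs):
    """Convert per-character seg flags ('0' = first character of a word,
    by assumption) into B/I/E/S tag ids."""
    tags = []
    for i, seg in enumerate(segs):
        starts_word = seg == "0"
        # A word ends here if this is the last character,
        # or if the next character starts a new word.
        ends_word = i == len(segs) - 1 or segs[i + 1] == "0"
        if starts_word:
            tags.append(S if ends_word else B)
        else:
            tags.append(E if ends_word else I)
    return tags


# Words of length 2, 1 and 3
print(seg_to_bies(["0", "1", "0", "0", "1", "2"]))  # [1, 3, 4, 1, 2, 3]
```

Under this reading, the last-line special case exists only because there is no next seg to look at: a final `'0'` must be a single-character word (4), and any other final value must be the end of a word (3).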