nlpinaction / learning-nlp

nlp in action

The Viterbi algorithm in the Chapter 3 HMM #24

Open Miaotxy opened 5 years ago

Miaotxy commented 5 years ago

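In word segmentation the final state can only be one of two values, S or E. So why does the source code still allow states such as M at the last position?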

Miaotxy commented 5 years ago

(prob, state) = max((V[len(obs) - 1][y], y) for y in 'ES')

Shouldn't it be written like this instead?

https://github.com/ustcdane/annotated_jieba/blob/master/jieba/finalseg/__init__.py
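To make the proposal above concrete, here is a minimal sketch of a Viterbi decoder whose last step only considers E and S. The function name `viterbi`, the `final_states` parameter, and the toy probabilities are assumptions made for illustration; they are not taken from the book's code or from jieba. The toy model is chosen so that an unrestricted final step would end the sentence in M, which leaves the last word unclosed.

```python
def viterbi(obs, states, start_p, trans_p, emit_p, final_states="ES"):
    """Return (prob, path) for the best tag path ending in one of final_states."""
    V = [{y: start_p[y] * emit_p[y].get(obs[0], 0.0) for y in states}]
    path = {y: [y] for y in states}
    for t in range(1, len(obs)):
        V.append({})
        new_path = {}
        for y in states:
            # Best predecessor y0 for state y at position t.
            prob, prev = max(
                (V[t - 1][y0] * trans_p[y0].get(y, 0.0) * emit_p[y].get(obs[t], 0.0), y0)
                for y0 in states
            )
            V[t][y] = prob
            new_path[y] = path[prev] + [y]
        path = new_path
    # A sentence must end on a word boundary, so only E and S are candidates.
    prob, state = max((V[len(obs) - 1][y], y) for y in final_states)
    return prob, path[state]


# Hypothetical toy model (B/M/E/S tags, three "characters" a, b, c).
states = "BMES"
start_p = {"B": 0.6, "M": 0.0, "E": 0.0, "S": 0.4}
trans_p = {
    "B": {"M": 0.5, "E": 0.5},
    "M": {"M": 0.5, "E": 0.5},
    "E": {"B": 0.5, "S": 0.5},
    "S": {"B": 0.5, "S": 0.5},
}
emit_p = {
    "B": {"a": 0.5, "b": 0.3, "c": 0.2},
    "M": {"a": 0.1, "b": 0.3, "c": 0.6},
    "E": {"a": 0.2, "b": 0.6, "c": 0.2},
    "S": {"a": 0.4, "b": 0.4, "c": 0.2},
}

_, restricted = viterbi("abc", states, start_p, trans_p, emit_p)
_, unrestricted = viterbi("abc", states, start_p, trans_p, emit_p, final_states=states)
print(restricted)    # ['B', 'E', 'S']
print(unrestricted)  # ['B', 'M', 'M']  (last word never closed)
```

With these numbers, M has the highest score at the last position, so taking the maximum over all four states returns a tag sequence that cannot be turned into complete words; restricting the final step to 'ES' avoids that.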

Lukasjame commented 5 years ago

Segmenting a short sentence works fine, but as soon as the input is a whole paragraph, it fails with an error:

Traceback (most recent call last):
  File "D:/code/python_test/test/start.py", line 11, in <module>
    print(str(list(res)))
  File "D:/code/python_test/test\hmm.py", line 150, in cut
    prob, pos_list = self.viterbi(text, self.state_list, self.Pi_dic, self.A_dic, self.B_dic)
  File "D:/code/python_test/test\hmm.py", line 134, in viterbi
    for y0 in states if V[t - 1][y0] > 0])
ValueError: max() arg is an empty sequence
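The comprehension in the traceback keeps only predecessors with `V[t - 1][y0] > 0`, so `max()` receives an empty list whenever every state scored zero at the previous position. On a long input that can plausibly happen through a zero-probability emission or transition, or through floating-point underflow from multiplying many small probabilities; the thread does not establish which one occurred here. One common workaround is to work with log-probabilities and never filter candidates. The sketch below is an assumption-laden illustration, not the book's fix: `viterbi_log`, `log_p`, the toy model, and the choice to treat a character unseen in every state as uninformative are all hypothetical.

```python
import math

NEG_INF = float("-inf")  # log-probability of an impossible event


def log_p(table, key):
    """Log of table[key]; missing or zero entries map to -inf instead of raising."""
    p = table.get(key, 0.0)
    return math.log(p) if p > 0 else NEG_INF


def viterbi_log(obs, states, start_p, trans_p, emit_p, final_states="ES"):
    """Viterbi in log space. No candidate is filtered out, so max() is never empty."""

    def emit(y, ch):
        # Assumption: a character unseen in every state carries no information (log 1 = 0).
        if all(ch not in emit_p[s] for s in states):
            return 0.0
        return log_p(emit_p[y], ch)

    V = [{y: log_p(start_p, y) + emit(y, obs[0]) for y in states}]
    path = {y: [y] for y in states}
    for t in range(1, len(obs)):
        V.append({})
        new_path = {}
        for y in states:
            # Sums of logs replace products of probabilities, so long inputs do not underflow.
            score, prev = max(
                (V[t - 1][y0] + log_p(trans_p[y0], y) + emit(y, obs[t]), y0)
                for y0 in states
            )
            V[t][y] = score
            new_path[y] = path[prev] + [y]
        path = new_path
    score, state = max((V[len(obs) - 1][y], y) for y in final_states)
    return score, path[state]


# Hypothetical toy model (B/M/E/S tags, three known "characters" a, b, c).
states = "BMES"
start_p = {"B": 0.6, "M": 0.0, "E": 0.0, "S": 0.4}
trans_p = {
    "B": {"M": 0.5, "E": 0.5},
    "M": {"M": 0.5, "E": 0.5},
    "E": {"B": 0.5, "S": 0.5},
    "S": {"B": 0.5, "S": 0.5},
}
emit_p = {
    "B": {"a": 0.5, "b": 0.3, "c": 0.2},
    "M": {"a": 0.1, "b": 0.3, "c": 0.6},
    "E": {"a": 0.2, "b": 0.6, "c": 0.2},
    "S": {"a": 0.4, "b": 0.4, "c": 0.2},
}

# A 2000-character "paragraph" that also contains a character the model has never seen.
text = "abc?" * 500
score, tags = viterbi_log(text, states, start_p, trans_p, emit_p)
print(len(tags), tags[-1], math.isfinite(score))
```

Using `-inf` for impossible events keeps forbidden transitions such as B to B out of any finite-score path, while still leaving `max()` something to choose from at every position.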