taishan1994 / Gector_chinese

Chinese text correction based on seq2edit (Gector).

The insert branch here only handles a single character; is this why seq2edit needs multiple prediction passes? #4

Closed yongzhuo closed 6 months ago

yongzhuo commented 6 months ago

The insert branch here only handles a single character; is this why seq2edit needs multiple prediction passes?

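    for diff in diffs:
        # Each diff is an opcode: (tag, src_start, src_end, trg_start, trg_end)
        tag, i1, i2, j1, j2 = diff
        if tag == 'replace' and i2 - i1 == j2 - j1:
            # Equal-length replacement: one $REPLACE_ tag per source character
            replace_idx_list += [(i, '$REPLACE_' + trg_text[j])
                                 for i, j in zip(range(i1, i2), range(j1, j2))]
        elif tag == 'insert' and j2 - j1 == 1:
            # Single-character insertion: $APPEND_ tag on the preceding character
            missing_idx_list.append((i1 - 1, '$APPEND_' + trg_text[j1]))
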
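
To make the question concrete, here is a minimal self-contained sketch of the logic quoted above. It assumes that `diffs` comes from `difflib.SequenceMatcher(...).get_opcodes()`, which the snippet does not show, and the function name `extract_edit_tags` is hypothetical. It illustrates that a two-character insertion yields no tag at all under the `j2 - j1 == 1` condition.

```python
import difflib


def extract_edit_tags(src_text, trg_text):
    """Return (replace_idx_list, missing_idx_list) using the quoted logic.

    Assumption: the diffs are difflib opcodes; the original snippet does
    not show where `diffs` is computed.
    """
    diffs = difflib.SequenceMatcher(None, src_text, trg_text).get_opcodes()
    replace_idx_list, missing_idx_list = [], []
    for tag, i1, i2, j1, j2 in diffs:
        if tag == 'replace' and i2 - i1 == j2 - j1:
            replace_idx_list += [(i, '$REPLACE_' + trg_text[j])
                                 for i, j in zip(range(i1, i2), range(j1, j2))]
        elif tag == 'insert' and j2 - j1 == 1:
            missing_idx_list.append((i1 - 1, '$APPEND_' + trg_text[j1]))
    return replace_idx_list, missing_idx_list


# One missing character: a single $APPEND_ tag on the preceding character.
print(extract_edit_tags("我喜欢京", "我喜欢北京"))  # ([], [(2, '$APPEND_北')])
# Two missing characters in the same gap: the insert branch is skipped.
print(extract_edit_tags("我去京", "我去了北京"))    # ([], [])
```
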
taishan1994 commented 6 months ago

After a single iteration, some errors may still not be corrected.
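
The reply above can be illustrated with a toy sketch. It is not the repository's inference code: instead of a trained tagger it uses the target sentence as an oracle, and the helper names `one_pass` and `iterative_correct` are hypothetical. The only point is that when each position can receive at most one single-character edit per pass, a gap of two missing characters needs two passes.

```python
import difflib


def one_pass(src_text, trg_text):
    """Apply at most one single-character edit per position (toy oracle)."""
    out = []
    ops = difflib.SequenceMatcher(None, src_text, trg_text).get_opcodes()
    for tag, i1, i2, j1, j2 in ops:
        if tag == 'equal':
            out.append(src_text[i1:i2])
        elif tag == 'insert':
            # Only one append per position: add the first missing character.
            out.append(trg_text[j1])
        elif tag == 'replace' and i2 - i1 == j2 - j1:
            # Equal-length replacement: one replace per character.
            out.append(trg_text[j1:j2])
        else:
            # Deletions and unequal-length replacements are left untouched
            # in this toy sketch.
            out.append(src_text[i1:i2])
    return ''.join(out)


def iterative_correct(src_text, trg_text, max_iters=5):
    """Repeat one_pass until the text stops changing or matches the target."""
    text, passes = src_text, 0
    while text != trg_text and passes < max_iters:
        new_text = one_pass(text, trg_text)
        if new_text == text:
            break
        text = new_text
        passes += 1
    return text, passes


print(one_pass("我去京", "我去了北京"))           # 我去了京
print(iterative_correct("我去京", "我去了北京"))  # ('我去了北京', 2)
```

The `max_iters` cap is a design choice for the sketch: it bounds the number of passes so that inputs the toy cannot fix, such as deletions, do not loop indefinitely.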