shibing624 / pycorrector

pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and LLaMA to error-correction tasks and works out of the box.
https://www.mulanai.com/product/corrector/
Apache License 2.0
5.51k stars 1.09k forks

After merging several short sentences into one, macbert4csc no longer detects the typos. What is the reason? #410

Closed helloHKTK closed 6 months ago

helloHKTK commented 1 year ago

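After merging several short sentences into a single string, macbert4csc no longer detects the typos:
["真麻烦你了。希望你们好好的跳无。少先队员因该为老人让坐。机七学习是人工智能领遇最能体现智能的一个分知。"]
When the sentences are passed in separately, however, the typos are detected:
["真麻烦你了。希望你们好好的跳无", "少先队员因该为老人让坐", "机七学习是人工智能领遇最能体现智能的一个分知"]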

shibing624 commented 1 year ago

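The model was trained on short sentences; see the training set for details. At prediction time, the input is usually split into short sentences at punctuation marks before being passed to the model.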
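A minimal sketch of that splitting step, assuming the input can be cut at sentence-ending punctuation. The helper names `split_sentences` and `correct_long_text` and the exact punctuation set are illustrative assumptions, not part of pycorrector. `correct_fn` stands in for whatever single-sentence corrector you use, for example a thin wrapper around the macbert4csc corrector.

```python
import re
from typing import Callable, List

# Sentence-ending punctuation to split on (Chinese and ASCII).
# This set is an assumption; adjust it to match your data.
_SENTENCE_END = re.compile(r"(?<=[。!?!?;;])")


def split_sentences(text: str) -> List[str]:
    """Split text after each sentence-ending mark, keeping the mark attached."""
    return [part for part in _SENTENCE_END.split(text) if part]


def correct_long_text(text: str, correct_fn: Callable[[str], str]) -> str:
    """Correct each short sentence separately, then join the results.

    correct_fn is a hypothetical callable that maps one short sentence
    to its corrected form (e.g. a wrapper around a MacBERT corrector).
    """
    return "".join(correct_fn(sentence) for sentence in split_sentences(text))


if __name__ == "__main__":
    merged = "真麻烦你了。希望你们好好的跳无。少先队员因该为老人让坐。"
    # Stand-in corrector used only for demonstration; it fixes one known typo.
    demo_fn = lambda s: s.replace("因该", "应该")
    print(split_sentences(merged))
    print(correct_long_text(merged, demo_fn))
```

Because the punctuation stays attached to each piece, joining the corrected pieces reproduces the original layout of the text, and each model call sees input of roughly the length the model was trained on.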

stale[bot] commented 9 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. (The bot closes this issue automatically after a long period of inactivity; feel free to ask again if needed.)