gitabtion / BertBasedCorrectionModels

PyTorch implementations of BERT-based spelling error correction models.
Apache License 2.0
265 stars, 43 forks

Could I join the project and submit a pull request? #42

Closed. Yazooliu closed this issue 1 year ago

Yazooliu commented 1 year ago

Hello! Good job!

I reproduced your training setup and used the best checkpoint to run inference on my own business samples, as I do not have more business data. I found an issue: during inference the model sometimes overcorrects, changing a correct character into a wrong one. For example:

{ "paragraph": "在本合同中,除上下文另有规定外,下列用语应当具有如下含义:", "error_fragments": [ { "error_init_id": 26, "error_end_id": 27, "src_fragment": "含", "tgt_fragment": "涵" } ] }

Here src_fragment "含" is the correct character in the paragraph, and tgt_fragment "涵" is the wrong character output by the model.

To fix this overcorrection, I propose adding a white_name_list config together with the code that applies it. I'd like to submit this as a pull request against tools/inference.py. Would you approve, and do you have any good ideas? Let's keep talking, thanks.

Best regards, Yazhou
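The whitelist idea described above could be sketched roughly as follows. This is only an illustration, not the code that was merged into tools/inference.py: the function names `protected_spans` and `filter_overcorrections` are hypothetical, the whitelist is assumed to be a plain collection of phrases that must never be altered, and `error_end_id` is assumed to be an exclusive end index (which matches the sample, where a single character spans 26 to 27).

```python
from typing import Dict, Iterable, List


def protected_spans(paragraph: str, white_name_list: Iterable[str]) -> List[range]:
    """Return the character ranges covered by whitelisted phrases (hypothetical helper)."""
    spans = []
    for phrase in white_name_list:
        if not phrase:
            continue  # an empty phrase would match everywhere, so skip it
        start = paragraph.find(phrase)
        while start != -1:
            spans.append(range(start, start + len(phrase)))
            start = paragraph.find(phrase, start + 1)
    return spans


def filter_overcorrections(result: Dict, white_name_list: Iterable[str]) -> Dict:
    """Drop every proposed correction that overlaps a whitelisted phrase.

    Assumes `result` has the shape shown in the issue, with
    `error_end_id` treated as an exclusive end index.
    """
    spans = protected_spans(result["paragraph"], white_name_list)
    kept = []
    for frag in result["error_fragments"]:
        begin, end = frag["error_init_id"], frag["error_end_id"]
        # Two half-open intervals overlap when each starts before the other ends.
        overlaps = any(begin < s.stop and s.start < end for s in spans)
        if not overlaps:
            kept.append(frag)
    # Return a copy so the original model output is left untouched.
    return {**result, "error_fragments": kept}


if __name__ == "__main__":
    sample = {
        "paragraph": "在本合同中,除上下文另有规定外,下列用语应当具有如下含义:",
        "error_fragments": [
            {"error_init_id": 26, "error_end_id": 27,
             "src_fragment": "含", "tgt_fragment": "涵"},
        ],
    }
    print(filter_overcorrections(sample, ["含义"]))
```

Matching on whole phrases such as "含义" rather than on single characters keeps the filter narrow: the model can still correct "含" in other contexts, and only the protected term is shielded from change.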

gitabtion commented 1 year ago

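Hello, and thank you very much! A whitelist is a very practical feature in production. You are welcome to submit a PR. Thank you for contributing to this repository!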

Yazooliu commented 1 year ago

The PR has been merged and this issue is closed.
