I reproduced your training setup and used the best resulting model to run inference on my own business-specific samples (I do not have any additional business data), but I ran into an issue.
During inference, the model overcorrects: it flags a correct character and replaces it with a wrong one, like this:
{
  "paragraph": "在本合同中,除上下文另有规定外,下列用语应当具有如下含义:",
  "error_fragments": [
    {
      "error_init_id": 26,
      "error_end_id": 27,
      "src_fragment": "含", -> the correct character in the paragraph
      "tgt_fragment": "涵" -> the wrong character output by the model
    }
  ]
}
To fix this overcorrection issue, I propose adding a white_name_list config option together with the code that applies it.
I'd like to open a pull request adding this code to tools/inference.py.
Would you approve of this, and do you have any better ideas? Let's keep discussing. Thanks!
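As a rough illustration, a whitelist filter along these lines might look like the sketch below. It is only a sketch under assumptions: the function names `find_protected_spans` and `filter_overcorrections` are hypothetical, `white_name_list` is assumed to be a plain list of terms that must never be altered, and `error_end_id` is assumed to be exclusive (which matches the sample above, where 26 to 27 covers the single character "含"). It does not reflect the actual structure of tools/inference.py.

```python
from typing import Dict, List, Tuple


def find_protected_spans(paragraph: str, white_name_list: List[str]) -> List[Tuple[int, int]]:
    """Return (start, end) spans, end exclusive, of every whitelisted term in the paragraph."""
    spans = []
    for term in white_name_list:
        if not term:
            continue  # an empty string would match everywhere, so skip it
        start = paragraph.find(term)
        while start != -1:
            spans.append((start, start + len(term)))
            start = paragraph.find(term, start + 1)
    return spans


def filter_overcorrections(result: Dict, white_name_list: List[str]) -> Dict:
    """Drop every proposed correction that overlaps a whitelisted term.

    Assumes the result layout shown in this issue: a "paragraph" string and a
    list of "error_fragments" with "error_init_id" / "error_end_id" offsets.
    """
    spans = find_protected_spans(result["paragraph"], white_name_list)
    kept = [
        frag
        for frag in result["error_fragments"]
        # Two half-open intervals overlap when each starts before the other ends.
        if not any(
            frag["error_init_id"] < end and start < frag["error_end_id"]
            for start, end in spans
        )
    ]
    return {**result, "error_fragments": kept}


# The overcorrected sample from this issue.
example = {
    "paragraph": "在本合同中,除上下文另有规定外,下列用语应当具有如下含义:",
    "error_fragments": [
        {
            "error_init_id": 26,
            "error_end_id": 27,
            "src_fragment": "含",
            "tgt_fragment": "涵",
        }
    ],
}

filtered = filter_overcorrections(example, ["含义"])
print(filtered["error_fragments"])
```

With "含义" in the whitelist, the 含 -> 涵 fragment overlaps a protected span and is dropped, while the paragraph itself is left unchanged. Matching on whole terms rather than single characters keeps the filter from suppressing real corrections of "含" in other contexts.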
Hello! Good job!
Best regards,
Yazhou