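When I translate text selected from a PDF, the result is often very poor because the spacing in the recognized text is wrong.
If the selection spans a line break, the recognized text has no space between the last word of one line and the first word of the next, so the two words are fused into one.
For example: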
there has been growing interest in parameter-efficient methods to apply
these models to downstream tasks.
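This is recognized as: there has been growing interest in parameter-efficient methods to applythese models to downstream tasks.
The correct result is: there has been growing interest in parameter-efficient methods to apply these models to downstream tasks.
In other cases the line break falls inside a single word that was hyphenated across two lines, and the recognized text keeps the line-break hyphen instead of removing it.
For example: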
As pre-trained language models have got-
ten larger
This is recognized as: As pre-trained language models have got-ten larger
The correct result is: As pre-trained language models have gotten larger
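Both cases above could be handled by a small clean-up step before the text is sent to the translator. The following is only a rough sketch of that idea, not the plugin's actual code: the function name `join_pdf_lines` is hypothetical, and it assumes the selected text still contains its line breaks at the point where the clean-up runs. The hyphen rule is a heuristic and would also remove the hyphen from a genuine compound word that happens to break at its hyphen.

```python
import re


def join_pdf_lines(text: str) -> str:
    """Merge the hard line breaks of a PDF selection into a single line.

    Hypothetical helper for illustration; assumes `text` still has its
    original line breaks.
    """
    # Heuristic: a hyphen at the end of a line, preceded by a letter and
    # followed by a lowercase letter on the next line, is treated as a word
    # split across lines, so the hyphen and the line break are both dropped.
    text = re.sub(r"(?<=[A-Za-z])-[ \t]*\r?\n[ \t]*(?=[a-z])", "", text)
    # Every remaining line break becomes exactly one space.
    text = re.sub(r"[ \t]*\r?\n[ \t]*", " ", text)
    return text.strip()


if __name__ == "__main__":
    print(join_pdf_lines("methods to apply\nthese models to downstream tasks."))
    print(join_pdf_lines("As pre-trained language models have got-\nten larger"))
```

Hyphens inside a line, such as the one in "parameter-efficient", are left untouched because the rule only fires when the hyphen is directly followed by a line break.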
Could the author please fix this text recognition when you get a chance? It has a very large impact on the final translation result. Thanks!