Texts in "swenglish"?

henkish commented 1 year ago

Hello!

I have trained a model for parsing invoices (on a private data set). The model understands the structure of the documents very well!

But I have an issue with the text itself. The model seems to find the right text in the document structure, but it changes some words into Swedish-English or Swedish-German hybrids, or into made-up Swedish words.

The problem with invoices is that the text can also contain product abbreviations, technical terms, etc., so it is not always regular Swedish sentences.

Is there some way to get more accurate text?

Thanks in advance, Henrik

henkish commented 1 year ago

Training on additional documents and for more epochs seemed to improve results, but we still sometimes get strange text.
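To check whether extra documents and epochs actually reduce the strange text, one option is to track a character error rate (CER) between predicted and ground-truth field values on a held-out set. The following is a minimal sketch in plain Python, not part of Donut; the function names and the sample strings are hypothetical and only illustrate the idea.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_error_rate(pairs):
    """CER over (predicted, reference) pairs: total edits / reference chars."""
    edits = sum(edit_distance(pred, ref) for pred, ref in pairs)
    chars = sum(len(ref) for _, ref in pairs)
    return edits / chars if chars else 0.0


# Hypothetical (predicted, reference) pairs, not taken from the real data set.
samples = [
    ("Fakturanumber", "Fakturanummer"),
    ("Moms 25%", "Moms 25%"),
]
print(f"CER: {char_error_rate(samples):.4f}")
```

Comparing this number across checkpoints on the same held-out invoices would show whether the mixed-language words become rarer, instead of judging by eye.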

Mohtadrao commented 8 months ago

I get strange results all the time. Could you tell me what to do? @henkish

clovaai / donut

Texts in "swenglish"? #247