manisandro / gImageReader

A Gtk/Qt front-end to tesseract-ocr.
GNU General Public License v3.0
1.6k stars, 188 forks

After each line in a paragraph, <CR> (Enter) is automatically inserted #627

Closed: RoDanny2021 closed this issue 1 year ago

RoDanny2021 commented 1 year ago

After the text is recognized, it is divided into paragraphs and lines. I did not find a way to prevent a <CR> from being inserted at the end of each line. It would be useful to be able to tell the program that a paragraph should be written without any <CR> at the end of each line. I have the latest version ... gImageReader 3.4.1 (Jan 29 2023) for Windows. Could this be changed?

Regarding translation: how can I proceed? I want to translate it into Romanian.

manisandro commented 1 year ago

There is an option in the plain-text editor to strip line-breaks of recognized text. As far as tesseract itself is concerned, there is no such option according to [1].
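For text that has already been exported, the same effect can be approximated outside the program. The following is a minimal Python sketch, not gImageReader's actual implementation: the function name `strip_line_breaks` is hypothetical, and it assumes that paragraphs in the recognized text are separated by at least one blank line.

```python
import re


def strip_line_breaks(text: str) -> str:
    """Join the lines of each paragraph; keep blank lines as paragraph breaks.

    Hypothetical helper for illustration only, not part of gImageReader.
    """
    # Normalize Windows (CRLF) and bare CR line endings to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Assumption: one or more blank lines separate paragraphs.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Inside a paragraph, collapse every run of whitespace
    # (including line breaks) into a single space.
    joined = [" ".join(p.split()) for p in paragraphs]
    return "\n\n".join(joined)


if __name__ == "__main__":
    sample = "first line\r\nsecond line\r\n\r\nnew paragraph\r\n"
    print(strip_line_breaks(sample))
```

This sketch does not rejoin words that were hyphenated across a line break; that would need an extra, language-dependent step.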

Regarding translations: please use Weblate.

[1] https://groups.google.com/g/tesseract-ocr/c/mFA_bUNMQN8?pli=1