tesseract-ocr / langdata

Source training data for Tesseract for lots of languages
Apache License 2.0

Fix extra intra-word spacing in Chinese and Japanese (GitHub issue #991) #143

Closed: stweil closed this 5 years ago

stweil commented 5 years ago

Add preserve_interword_spaces 1 to the config of the *_vert.traineddata files.

The setting can now be removed from the traineddata files that load those files as a sublanguage.
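The change can be sketched in Python as a pair of text edits on Tesseract config files, which hold one `name value` pair per line. This is only an illustration, not the script used for this pull request: the helper names `ensure_setting` and `remove_setting` are hypothetical, and the sample parent config content (a `tessedit_load_sublangs` line naming `chi_sim_vert`) is an assumption about how a parent language loads its vertical sublanguage.

```python
def ensure_setting(config_text: str, name: str, value: str) -> str:
    """Return config_text with exactly one `name value` line (hypothetical helper)."""
    out = []
    found = False
    for line in config_text.splitlines():
        parts = line.split()
        if parts and parts[0] == name:
            if not found:
                # Rewrite the first occurrence with the requested value.
                out.append(f"{name} {value}")
                found = True
            # Drop any later duplicates of the same variable.
            continue
        out.append(line)
    if not found:
        out.append(f"{name} {value}")
    return "\n".join(out) + "\n"


def remove_setting(config_text: str, name: str) -> str:
    """Return config_text without any line that sets `name` (hypothetical helper)."""
    kept = []
    for line in config_text.splitlines():
        parts = line.split()
        if parts and parts[0] == name:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n" if kept else ""


# Assumed sample content, for illustration only.
vert_config = ""
parent_config = "tessedit_load_sublangs chi_sim_vert\npreserve_interword_spaces 1\n"

# The *_vert config gains the setting; the parent config loses it.
new_vert_config = ensure_setting(vert_config, "preserve_interword_spaces", "1")
new_parent_config = remove_setting(parent_config, "preserve_interword_spaces")
```

The edited config text would still have to be packed back into the corresponding traineddata file, a step this sketch leaves out.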

Signed-off-by: Stefan Weil <sw@weilnetz.de>

stweil commented 5 years ago

@Shreeshrii, I have now changed *_vert.traineddata, too.