Open GoogleCodeExporter opened 9 years ago
From <http://www.djvu-soft.narod.ru/soft/djvuocr_en.htm>:
The idea […] is to avoid the problem when a hyphenated word is split
into two parts, and cannot be found when performing search in DJVU
files. For example:
"this function is int-"
"egrable on an interval..."
The word "integrable" cannot be found by searching, only the pieces
of it, "int" and "egrable". The new method is to repeat the entire
word in the OCR text, […]:
"this function is "
"integrable on an interval..."
I suppose we could do that, although we would have to lie about the coordinates
of the hyphenated word.
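A minimal sketch of the text side of this idea, assuming the OCR output is available as a plain list of line strings. The function name `dehyphenate` is hypothetical and not part of any existing tool; coordinates are ignored here. Note that this naive rule would also join genuine hyphenated compounds that happen to break at the hyphen (e.g. "well-" / "known").

```python
def dehyphenate(lines):
    """Move a trailing hyphenated fragment onto the start of the next line.

    Hypothetical helper: works on plain strings only, no coordinates.
    """
    result = list(lines)
    for i in range(len(result) - 1):
        line = result[i].rstrip()
        if not line.endswith("-"):
            continue
        # Split off the last word of the line (the hyphenated fragment).
        head, _, fragment = line.rpartition(" ")
        fragment = fragment[:-1]  # drop the trailing hyphen
        # Skip a lone "-" and lines whose continuation is not a letter.
        if not fragment or not result[i + 1][:1].isalpha():
            continue
        result[i] = head + " " if head else ""
        result[i + 1] = fragment + result[i + 1]
    return result


if __name__ == "__main__":
    for line in dehyphenate(["this function is int-",
                             "egrable on an interval..."]):
        print(repr(line))
```

With the example from the quote, the first line becomes "this function is " and the second becomes "integrable on an interval...", so a search for "integrable" would match.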
Original comment by jwilk@jwilk.net
on 21 Nov 2013 at 9:48
Set the coordinates to the start of the hyphenated word; that is correct.
The end may be omitted.
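One way to read this suggestion, as a rough sketch only: keep the bounding box of the first fragment, give it the full word as its text, and drop the second fragment. The `Word` type, its field names, and `merge_across_lines` are all hypothetical and do not correspond to any real DjVu or OCR API; the coordinate convention is likewise an assumption.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Word:
    # Hypothetical word record: text plus an assumed (x0, y0, x1, y1) box.
    text: str
    x0: int
    y0: int
    x1: int
    y1: int


def merge_across_lines(lines):
    """Join a word hyphenated across two lines, keeping the first box.

    `lines` is a list of lines, each a list of Word in reading order.
    """
    out = [list(line) for line in lines]
    for cur, nxt in zip(out, out[1:]):
        if not cur or not nxt:
            continue
        last, first = cur[-1], nxt[0]
        if (len(last.text) > 1 and last.text.endswith("-")
                and first.text[:1].isalpha()):
            # The full word inherits the coordinates of its first fragment;
            # the box of the second fragment is discarded.
            cur[-1] = replace(last, text=last.text[:-1] + first.text)
            del nxt[0]
    return out
```

In this sketch the merged word stays on the line where it starts, so its box covers only the first fragment; the region of the second fragment no longer maps to any text.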
Original comment by mivan...@gmail.com
on 4 Jul 2014 at 3:27
Original issue reported on code.google.com by
mivan...@gmail.com
on 27 Apr 2013 at 11:36