cyanfish / naps2

Scan documents to PDF and more, as simply as possible.
https://www.naps2.com

OCR text alignment in NAPS 7.4.1 #353

Closed. NextTherapist closed this issue 2 months ago.

NextTherapist commented 2 months ago

According to the changelog, OCR text alignment should be improved, but this is what I get:

7.4.1.pdf

[Screenshot: alignment-7 4 1]

cyanfish commented 2 months ago

Do you have a copy of the PDF before running OCR? And did you do any editing other than running OCR?

Also, what are your exact OCR settings? Language is German? Fast/Best? Is "Fix white balance" checked?

NextTherapist commented 2 months ago

Input file was this: B-W without OCR.pdf

OCR settings are German, fast, no white balance. And no, no editing except saving with OCR activated.

cyanfish commented 2 months ago

Can you try this test version and see if it resolves the issue for you?

naps2-7.4.1-win-testocr.zip

NextTherapist commented 2 months ago

No, it's the same thing: naps2-test.pdf

I remember seeing this before in an older version (before 7.4.0), perhaps 7.3.0 or 7.3.1.

cyanfish commented 2 months ago

That's very weird. Using that test version on the B-W.without.OCR.pdf file you provided, this is what I get: naps2-test2.pdf

NextTherapist commented 2 months ago

That's really weird. I thought it might be caused by your PDF version (1.4), so I tried the same, but got the same error. I then uninstalled 7.4.1 and reinstalled 7.4.0, and 7.4.0 does not show the error: naps2-test3-7.4.0.pdf

It is a few KB larger than your file, though. At least on my system (Windows 10 64-bit), 7.4.1 no longer behaves like 7.4.0.

Edit: No difference; I get the same error on 7.4.1 with OCR set to best quality.

cyanfish commented 2 months ago

I think I know the problem: I forgot to do locale-invariant number parsing (the 1.000 vs 1,000 issue).
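For readers unfamiliar with this class of bug, here is a minimal Python sketch of the 1.000 vs 1,000 ambiguity. It is not the NAPS2 code (the actual fix is not shown in this thread). The helper `parse_with_separators` is a hypothetical stand-in for a locale-aware parser, and the assumption is that the numbers being parsed are written with `.` as the decimal separator.

```python
def parse_with_separators(text: str, decimal_sep: str, group_sep: str) -> float:
    """Hypothetical stand-in for a parser that follows the user's locale."""
    # Drop grouping separators, then normalise the decimal separator to '.'.
    normalized = text.replace(group_sep, "").replace(decimal_sep, ".")
    return float(normalized)


def parse_invariant(text: str) -> float:
    """Locale-invariant parsing: '.' is always the decimal separator."""
    # Python's float() does not consult the process locale.
    return float(text)


value = "1.000"  # machine-written number, meant as one point zero

# German-style conventions: ',' is the decimal mark, '.' groups thousands.
german_style = parse_with_separators(value, decimal_sep=",", group_sep=".")
invariant = parse_invariant(value)

print(german_style)  # 1000.0 (off by a factor of 1000)
print(invariant)     # 1.0
```

A value misread this way would plausibly put text in the wrong place only on systems whose regional settings use a decimal comma, which would fit the German setup reported above and the maintainer being unable to reproduce it.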

cyanfish commented 2 months ago

This should be fixed in 7.4.2. Thanks for the help with testing!

NextTherapist commented 2 months ago

@cyanfish Thank you for fixing it! :-)