Open casper-hansen opened 2 years ago
Frankly, this is a very puzzling issue. No, as far as I know, there should not be discrepancies between Windows and Linux, provided that you use the same Poppler version. Did you ever find the root cause?
No, I never found the root cause. I assume this is a Poppler issue and not a pdf2image issue.
@casperbh96 this may be caused by differences in the fonts installed on Windows and Linux. #201 looks similar to this.
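One way to test the font theory is to check whether the PDF embeds all of its fonts; a font that is not embedded has to be substituted from whatever is installed on the machine, which can differ between Windows and Linux. The following is a minimal sketch, assuming Poppler's `pdffonts` tool is on the PATH and prints its usual table (name, type, encoding, emb, sub, uni, object ID). The helper names `parse_pdffonts` and `non_embedded_fonts` are made up for illustration and are not part of pdf2image.

```python
import subprocess


def parse_pdffonts(output):
    """Parse the table printed by Poppler's `pdffonts` into one dict per font.

    Assumes the usual layout: a header row, a dashed rule, then one row per
    font ending in the columns `emb sub uni object-number generation`.
    """
    fonts = []
    for line in output.splitlines()[2:]:  # skip the header row and the dashed rule
        tokens = line.split()
        if len(tokens) < 8:
            continue  # not a font row
        # Read from the right, because the type column may contain spaces
        # (for example "Type 1" or "CID TrueType").
        emb, sub = tokens[-5], tokens[-4]
        fonts.append(
            {"name": tokens[0], "embedded": emb == "yes", "subset": sub == "yes"}
        )
    return fonts


def non_embedded_fonts(pdf_path):
    """Return the names of fonts that the PDF references but does not embed."""
    result = subprocess.run(
        ["pdffonts", pdf_path], capture_output=True, text=True, check=True
    )
    return [f["name"] for f in parse_pdffonts(result.stdout) if not f["embedded"]]
```

If `non_embedded_fonts("input.pdf")` returns a non-empty list (the file name is a placeholder), the renderer is picking replacement fonts from the system, and installing the same fonts on Linux would be the first thing to try.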
Hi @Belval
I developed an application on Windows, and now I want to deploy it on Linux. Unfortunately, no matter how I install poppler and pdf2image, I cannot get the same results on both operating systems, and the Linux output is somehow worse for OCR.
For instance, I used to be able to capture the name "Jan Andersen" by converting a PDF to PNG and running OCR. But on Linux, the output is instead "J an Andersen". If I instead save the PNG on Windows but run the OCR on Linux, I get the correct result "Jan Andersen". Therefore, I narrowed it down to the conversion stage.
What would be your recommendation for producing the exact same results on both systems, so that I can improve the accuracy on Linux?
Do I just have to accept that there are differences?
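To confirm that the conversion stage is where the two systems diverge, the rendered pages can be fingerprinted on each machine and the hashes compared. This is only a sketch for diagnosing the difference, not a fix. It assumes pdf2image's `convert_from_path` with its `dpi` argument and Pillow's `size`, `mode`, and `tobytes()`; the helper names `image_fingerprint` and `fingerprint_pdf` are hypothetical.

```python
import hashlib


def image_fingerprint(size, mode, pixel_bytes):
    """Hash the image dimensions, mode, and raw pixels into a hex digest."""
    digest = hashlib.sha256()
    digest.update(f"{size[0]}x{size[1]}:{mode}:".encode("ascii"))
    digest.update(pixel_bytes)
    return digest.hexdigest()


def fingerprint_pdf(pdf_path, dpi=300):
    """Render every page with pdf2image and return one fingerprint per page."""
    # Imported lazily so that image_fingerprint works without pdf2image installed.
    from pdf2image import convert_from_path

    pages = convert_from_path(pdf_path, dpi=dpi)
    return [image_fingerprint(p.size, p.mode, p.tobytes()) for p in pages]
```

Running `fingerprint_pdf` on the same file with the same `dpi` on Windows and on Linux gives two lists to compare. Identical hashes would mean the pixels match and the difference lies elsewhere; different hashes would point at the rendering (Poppler version, build options, or available fonts).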
Solutions that I tried