nguyenq / tess4j

Java JNA wrapper for Tesseract OCR API
Apache License 2.0
1.58k stars 372 forks

PR for issue 257: Tesseract1.doOCR method doesn't take into account the stride of the image #263

Open kpentaris opened 4 months ago

kpentaris commented 4 months ago

https://github.com/nguyenq/tess4j/issues/257

nguyenq commented 4 months ago

Normally, bytes per line = bytes per pixel * image width, as mentioned in:

https://stackoverflow.com/questions/51590161/bytes-per-pixel-bytes-per-line-how-to-use-function-nativesetimagebytes-in-tes

https://stackoverflow.com/questions/9513227/what-is-meant-by-bytesperline-in-qimage

Can you cite some references that support your formula? And please attach an image file that would fail in the current implementation.

Thanks.

kpentaris commented 3 months ago

Here is an SO question that explains this (also has a link to MSDN docs): https://stackoverflow.com/questions/29912303/stride-and-padding-of-an-image
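
To make the difference between the two formulas concrete, here is a minimal Java sketch. It assumes the 4-byte row alignment that Windows bitmaps (DIBs) use; other formats and libraries may pad rows differently or not at all. The class and method names (`StrideSketch`, `packedBytesPerLine`, `alignedStride`, `pixelOffset`) are hypothetical and are not part of tess4j or this PR.

```java
// Hypothetical helper for illustration only; not part of tess4j.
class StrideSketch {

    // Tightly packed row size: bytes per pixel * width (rounded up for sub-byte depths).
    static int packedBytesPerLine(int width, int bitsPerPixel) {
        return (width * bitsPerPixel + 7) / 8;
    }

    // Row size when each row is padded up to a multiple of 4 bytes
    // (assumption: Windows DIB-style alignment).
    static int alignedStride(int width, int bitsPerPixel) {
        return ((width * bitsPerPixel + 31) / 32) * 4;
    }

    // Byte offset of pixel (x, y) in a buffer whose rows are `stride` bytes apart.
    static int pixelOffset(int x, int y, int stride, int bytesPerPixel) {
        return y * stride + x * bytesPerPixel;
    }

    public static void main(String[] args) {
        int width = 3;
        int bitsPerPixel = 24;
        System.out.println("packed bytes per line: " + packedBytesPerLine(width, bitsPerPixel));
        System.out.println("4-byte aligned stride: " + alignedStride(width, bitsPerPixel));
    }
}
```

When the two values differ, indexing rows with the packed size reads each successive row from a progressively shifted offset, which is the kind of mismatch the linked question describes.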

nguyenq commented 3 months ago

Can you attach a sample image that would cause the problem with the current version? It would help in testing pre- and post-fix.
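
One possible way to produce such a test input without an image file is a sub-image. This is an assumption, since the thread does not say which images trigger the problem. In Java, `BufferedImage.getSubimage` shares the parent's data buffer, so the cropped image keeps the parent's scanline stride while reporting a smaller width. The class name `SubimageStrideSketch` and the helper `scanlineStride` below are hypothetical; only standard `java.awt.image` calls are used.

```java
import java.awt.image.BufferedImage;
import java.awt.image.ComponentSampleModel;
import java.awt.image.SampleModel;

// Hypothetical sketch for illustration only; not part of tess4j.
class SubimageStrideSketch {

    // Returns the distance in data-array elements between the starts of two
    // consecutive rows, for images backed by a component sample model.
    static int scanlineStride(BufferedImage img) {
        SampleModel sm = img.getRaster().getSampleModel();
        if (sm instanceof ComponentSampleModel) {
            return ((ComponentSampleModel) sm).getScanlineStride();
        }
        throw new IllegalArgumentException("Unsupported sample model: " + sm.getClass());
    }

    public static void main(String[] args) {
        // 3 bytes per pixel, tightly packed rows in the full image.
        BufferedImage full = new BufferedImage(100, 50, BufferedImage.TYPE_3BYTE_BGR);
        // The crop shares the parent's buffer, so its rows are still spaced
        // by the parent's row size rather than by its own width.
        BufferedImage crop = full.getSubimage(10, 10, 40, 20);

        int bytesPerPixel = 3;
        System.out.println("crop width * bytes per pixel: " + crop.getWidth() * bytesPerPixel);
        System.out.println("crop scanline stride:         " + scanlineStride(crop));
    }
}
```

Comparing OCR results on `crop` against a copy of the same region drawn into a fresh `BufferedImage` could serve as a pre- and post-fix check, provided the code path under test reads the raster's backing buffer directly.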