Closed xiaolibuzai-ovo closed 1 week ago
Look at this code:
Looks like the PyMuPDF coordinates are correct. That mysterious "other OCR" tool provides coordinates outside the dimensions of an A4 page: x-values should not exceed 596, but we see a value of 1250.
I also do not understand why we talk about OCR at all: the text can be extracted with no problem, and none of the 42 images covers the page.
Thank you for your reply. My issue is this: when I use PyMuPDF to extract the PDF content, the result is inaccurate. For example, I used the translation script from https://github.com/pymupdf/PyMuPDF-Utilities/blob/tutorials/tutorials/language-translation/translator.py, but the restored content differs significantly from the original. Here is my PDF: Nuxtjs-Cheat-Sheet.pdf. Result:
I can accurately select the positions using another OCR tool, so my idea is to have the OCR tool find the positions, and then translate and write back by extracting the matrix content.
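If the other OCR tool works on a rendered image of the page, its boxes are in pixels of that image, while PyMuPDF reports points (1/72 inch) with the origin at the top-left of the page. A minimal sketch of mapping one to the other, assuming an unrotated page and an image that covers the whole page; `image_bbox_to_pdf` and the image size in the example are hypothetical, since the thread does not say how the OCR tool rendered the page:

```python
from typing import Sequence, Tuple

BBox = Tuple[float, float, float, float]


def image_bbox_to_pdf(
    bbox: Sequence[float],
    image_size: Tuple[float, float],
    page_size: Tuple[float, float],
) -> BBox:
    """Scale an (x0, y0, x1, y1) box from image pixels to PDF points.

    Assumes the image shows the full, unrotated page and that both
    coordinate systems start at the top-left corner.
    """
    img_w, img_h = image_size
    page_w, page_h = page_size
    sx = page_w / img_w  # points per pixel, horizontal
    sy = page_h / img_h  # points per pixel, vertical
    x0, y0, x1, y1 = bbox
    return (x0 * sx, y0 * sy, x1 * sx, y1 * sy)


# Hypothetical example: the image size is an assumption, not taken from the thread.
ocr_bbox = [586.0, 178.0, 1250.0, 296.0]
scaled = image_bbox_to_pdf(ocr_bbox, image_size=(2480, 3508), page_size=(595, 842))
print(scaled)
```

With PyMuPDF, the page size would come from `page.rect.width` and `page.rect.height`; the image size has to come from whatever the OCR tool was given.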
other OCR result:
I still don't understand why we are even talking about OCR. PyMuPDF can detect all the text natively, without any problem and with top precision:
I think we are talking past each other:
Language translation of a given document maybe?
I am transferring this post to "Discussions" as we are clearly not dealing with a bug.
Description of the bug
I used another OCR tool to recognize the content coordinates of the PDF, and then I used the PyMuPDF library to extract the coordinates of the same area, but there is a significant difference between the two sets of coordinates.
These are the coordinates recognized by the other OCR tool:
{ "text": "Vue Mastery", "bbox": [586.0, 178.0, 1250.0, 296.0], "type": "ocr", "score": 1 }
These are the coordinates for the corresponding position in PyMuPDF:
(88.85449981689453, 23.943227767944336, 117.37201690673828, 44.796356201171875, 'Vue', 0, 0, 0),
(121.81803131103516, 23.943227767944336, 183.36544799804688, 44.796356201171875, 'Mastery', 0, 0, 1)
This is the PDF file: Nuxtjs-Cheat-Sheet.pdf
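To compare a phrase-level OCR box with PyMuPDF's word-level output, the word boxes can be merged first. A minimal sketch using the two word tuples quoted above, which follow the `(x0, y0, x1, y1, word, block_no, line_no, word_no)` layout returned by `page.get_text("words")`; `union_bbox` is a hypothetical helper name:

```python
from typing import Iterable, Sequence, Tuple

BBox = Tuple[float, float, float, float]


def union_bbox(words: Iterable[Sequence]) -> BBox:
    """Return the smallest box enclosing all word tuples.

    Only the first four items (x0, y0, x1, y1) of each tuple are used.
    """
    words = list(words)
    return (
        min(w[0] for w in words),
        min(w[1] for w in words),
        max(w[2] for w in words),
        max(w[3] for w in words),
    )


# The two word tuples reported in this issue.
words = [
    (88.85449981689453, 23.943227767944336, 117.37201690673828, 44.796356201171875, "Vue", 0, 0, 0),
    (121.81803131103516, 23.943227767944336, 183.36544799804688, 44.796356201171875, "Mastery", 0, 0, 1),
]

phrase_box = union_bbox(words)
phrase_text = " ".join(w[4] for w in words)
print(phrase_text, phrase_box)
```

The merged box is in PDF points, so it is only comparable with the OCR box once both are expressed in the same unit.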
How to reproduce the bug
see above
Hope to be answered
PyMuPDF version
1.24.14
Operating system
macOS
Python version
3.10