Closed satvik-27199 closed 5 months ago
Sorry - I am not able to follow your code. PyMuPDF image insertion is exclusively responsible for inserting a given image in a given target rectangle of a PDF in such a way that
Among other things this means: If the aspect ratios of target rect and image do not exactly coincide, there will remain unused stripes in the target rectangle. If you later look at the bbox on the page covered by the image, it will in general not be the same as the original insert rectangle.
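The unused stripes can be illustrated with a small, library-independent sketch. It assumes the image is scaled as large as possible without distortion and then centered in the target rectangle. The centering and the helper name `fitted_rect` are assumptions made for illustration; they are not PyMuPDF API.

```python
def fitted_rect(target, img_width, img_height):
    """Return the (x0, y0, x1, y1) area an image would cover inside `target`.

    Assumption of this sketch: the aspect ratio is preserved and the
    scaled image is centered in the target rectangle.
    """
    tx0, ty0, tx1, ty1 = target
    target_w, target_h = tx1 - tx0, ty1 - ty0

    # The smaller of the two ratios is the largest scale that still fits.
    scale = min(target_w / img_width, target_h / img_height)
    w, h = img_width * scale, img_height * scale

    # Center the scaled image; the leftover space forms the unused stripes.
    x0 = tx0 + (target_w - w) / 2
    y0 = ty0 + (target_h - h) / 2
    return (x0, y0, x0 + w, y0 + h)


# A square image in a 2:1 target leaves a stripe on the left and on the right.
print(fitted_rect((0, 0, 200, 100), 100, 100))
```

On a real page, the area actually covered by an image can be queried with `Page.get_image_rects()` or `Page.get_image_bbox()` instead of being recomputed.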
To demonstrate a bug, please provide me with a simple (!) example where these criteria are violated.
Closing this for lack of response. Please feel free to reopen with a reproducer case.
Description of the bug
I have observed some inconsistencies in aligning image coordinates with the PDF coordinate system. When using an image's bounding box as input, the objective is to convert it into the PDF coordinate system. This conversion is crucial for accurately extracting word-level bounding boxes using PyMuPDF.
The inconsistency shows up in 'page_48.pdf'. For comparison, '5.pdf' is also provided as an example where the output appears consistent.
The corresponding image files are also attached.
Attachments: 'page_48.jpeg', 'page_5.jpeg', pdf_visualization (2).pdf, 5.pdf, page_48.pdf
Image and PDF Folder
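As a rough illustration of the conversion described above, the following sketch maps a pixel bounding box into PDF coordinates relative to the rectangle the image occupies on the page. The helper name `image_bbox_to_pdf` is hypothetical, and the sketch assumes an unrotated page and an image that is neither rotated nor flipped.

```python
def image_bbox_to_pdf(bbox, img_width, img_height, image_rect):
    """Map a pixel bbox (x1, y1, x2, y2) into PDF coordinates.

    `image_rect` is the (x0, y0, x1, y1) rectangle the image covers on the
    page. Assumption: no page rotation and no rotation or flip of the image.
    """
    x1, y1, x2, y2 = bbox
    rx0, ry0, rx1, ry1 = image_rect

    # PDF units per pixel along each axis.
    sx = (rx1 - rx0) / img_width
    sy = (ry1 - ry0) / img_height

    # Scale, then shift by the top-left corner of the image on the page.
    return (rx0 + x1 * sx, ry0 + y1 * sy, rx0 + x2 * sx, ry0 + y2 * sy)


# Hypothetical numbers: a 1000x2000 px image placed at (50, 0, 550, 1000).
print(image_bbox_to_pdf((100, 200, 300, 400), 1000, 2000, (50, 0, 550, 1000)))
```

If the image fills the whole page, `image_rect` is simply the page rectangle and the offset terms vanish.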
How to reproduce the bug
bbox = [127.1285171508789, 236.234619140625, 1547.159912109375, 673.4690551757812]  # For 48.pdf
bbox = [209.13824462890625, 447.0567321777344, 1482.3857421875, 953.8846435546875]  # For 5.pdf

image_path = '/content/page_5.jpeg'
pdf_path = '/content/5.pdf'

import cv2
import fitz  # PyMuPDF; imported here because it is used before the drawing step
import matplotlib.pyplot as plt

# Draw the bounding box on the image for visual reference
image = cv2.imread(image_path)

if image is None:
    print(f"Failed to load image at {image_path}")
else:
    x1, y1, x2, y2 = map(int, bbox)  # Example coordinates
    cv2.rectangle(image, (x1, y1), (x2, y2), (0, 0, 255), 4)
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    plt.imshow(image_rgb)
    plt.axis('off')

def scale_coordinates(matrix, original_width, original_height, target_width, target_height):
    scaled_matrix = []
    for coordinates in matrix:
        x1, y1, x2, y2 = coordinates
        # Scale coordinates

# Read the page size (PDF points) and the image size (pixels)
doc = fitz.open(pdf_path)
page = doc[0]
pdf_width, pdf_height = page.rect.width, page.rect.height
im = cv2.imread(image_path)
img_height, img_width, channels = im.shape

original_width = img_width
original_height = img_height
target_width = pdf_width
target_height = pdf_height

original_width, original_height, target_width, target_height

bbox_list = [bbox]
bbox_apapted = scale_coordinates(bbox_list, original_width, original_height, target_width, target_height)
bbox_apapted

# Draw the scaled rectangle on the PDF page
doc = fitz.open(pdf_path)
page = doc[0]

rect = fitz.Rect(bbox_apapted[0][0], bbox_apapted[0][1], bbox_apapted[0][2], bbox_apapted[0][3])
color = (1, 0, 0)  # Red
page.draw_rect(rect, color=color, width=1.5, overlay=True)

output_pdf_path = "/content/pdf_visualization.pdf"
doc.save(output_pdf_path)
doc.close()
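The body of `scale_coordinates` is cut off in the listing above. A minimal sketch of what such a function presumably does is given below: plain proportional scaling from image pixels to page units. This is an assumed reconstruction, not the reporter's actual code, and the image and page sizes in the usage lines are hypothetical.

```python
def scale_coordinates(matrix, original_width, original_height,
                      target_width, target_height):
    """Scale each [x1, y1, x2, y2] box from image pixels to target units.

    Assumed reconstruction: simple proportional scaling along each axis.
    """
    x_scale = target_width / original_width
    y_scale = target_height / original_height

    scaled_matrix = []
    for coordinates in matrix:
        x1, y1, x2, y2 = coordinates
        scaled_matrix.append(
            [x1 * x_scale, y1 * y_scale, x2 * x_scale, y2 * y_scale]
        )
    return scaled_matrix


# Hypothetical sizes: a 1700x2200 px image and a 612x792 pt page.
bbox = [209.13824462890625, 447.0567321777344, 1482.3857421875, 953.8846435546875]
print(scale_coordinates([bbox], 1700, 2200, 612, 792))
```

Scaling against the full page size only gives the right result when the image covers the entire page and the page is not rotated. Otherwise the rectangle actually occupied by the image has to be used as the target.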
PyMuPDF version
1.24.1
Operating system
MacOS
Python version
3.10