pymupdf / PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
https://pymupdf.readthedocs.io
GNU Affero General Public License v3.0

`'width'` in `Page.get_drawings()` returns a width of 0 #3591

Closed Rodrigodd closed 5 days ago

Rodrigodd commented 1 week ago

Description of the bug

I have a PDF with the following drawings:

q
15 w
0 1 1 0 0 0 cm
100 100 m
500 100 l
S
Q

q
15 w
100 100 m
500 100 l
S
Q

Each one is just a single stroked line. The only difference is that the first has a cm operation (0 1 1 0 0 0 cm), which mirrors the drawing across the diagonal, swapping x and y. Both set a stroke width of 15 (15 w).
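To make the effect of the cm operation concrete, here is a small sketch that applies the matrix to the endpoints of the line. It assumes the usual PDF convention that `a b c d e f cm` maps (x, y) to (a*x + c*y + e, b*x + d*y + f). The helper name `apply_cm` is made up for this illustration and is not part of PyMuPDF.

```python
def apply_cm(matrix, point):
    """Apply a PDF matrix (a, b, c, d, e, f) to a point (x, y)."""
    a, b, c, d, e, f = matrix
    x, y = point
    return (a * x + c * y + e, b * x + d * y + f)


# Operands of the cm operation in the first drawing.
flip = (0, 1, 1, 0, 0, 0)

start = apply_cm(flip, (100, 100))
end = apply_cm(flip, (500, 100))
print(start, end)  # (100, 100) (100, 500)
```

The horizontal line becomes a vertical one of the same length, so nothing in this transformation should change the stroke width.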

When I call Page.get_drawings() and read the 'width' entry (which is the stroke width), I get a line width of 0 for the first drawing and 15 for the second. I expect both to have the same width.

I didn't investigate further, but the problem appears to be related to the transformation having a negative determinant.
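As a quick check of that observation, the sketch below computes the determinant of the cm matrix and one common estimate of its uniform scale, the square root of the absolute determinant. The function names are hypothetical, and this is not a claim about how PyMuPDF computes the width internally.

```python
import math


def determinant(matrix):
    """Determinant of the 2x2 linear part of a PDF matrix (a, b, c, d, e, f)."""
    a, b, c, d, _e, _f = matrix
    return a * d - b * c


def uniform_scale(matrix):
    """Scale estimate that ignores orientation: sqrt(|det|)."""
    return math.sqrt(abs(determinant(matrix)))


flip = (0, 1, 1, 0, 0, 0)  # matrix from the first drawing
det = determinant(flip)
scale = uniform_scale(flip)
print(det, 15 * scale)  # -1 15.0
```

The determinant is negative because the matrix reverses orientation, but its magnitude is 1, so under this estimate the stroke width stays 15.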

How to reproduce the bug

Run the following script:

import fitz  # PyMuPDF

doc = fitz.open("stroke-repro.pdf")
for page in doc:
    # Each drawing is a dict; 'width' holds the stroke width.
    drawings = page.get_drawings()
    for drawing in drawings:
        print('width:', drawing['width'])

With the following file:

stroke-repro.pdf

Which gives the following output:

width: 0.0
width: 15.0
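For anyone without the attachment, a file with the same two drawings can be built with the standard library alone. This is a minimal sketch, not the original stroke-repro.pdf. It assumes both drawings sit in one content stream on a single page, and the page size and the function name `build_repro_pdf` are arbitrary choices made for this illustration.

```python
def build_repro_pdf() -> bytes:
    """Build a one-page PDF containing the two stroked lines from this report."""
    content = (
        b"q\n15 w\n0 1 1 0 0 0 cm\n100 100 m\n500 100 l\nS\nQ\n"
        b"q\n15 w\n100 100 m\n500 100 l\nS\nQ\n"
    )
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Resources << >> /Contents 4 0 R >>",
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"endstream",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object, for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # each xref entry is exactly 20 bytes
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


if __name__ == "__main__":
    with open("stroke-repro.pdf", "wb") as fh:
        fh.write(build_repro_pdf())
```

Running it writes stroke-repro.pdf to the current directory, which the script above can then open.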

PyMuPDF version

1.24.5

Operating system

Windows

Python version

3.11

JorjMcKie commented 1 week ago

Thanks for submitting this!

We incorrectly set the scaling factor to 0 for rotations.
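To illustrate how a scaling factor can collapse to 0 for such matrices, here is a hypothetical sketch. It is not the actual PyMuPDF code, and both function names are invented. An estimate that only reads the diagonal entries of the matrix yields 0 for a 90 degree rotation and for the flip in this report, while an estimate based on the length of the transformed x unit vector does not.

```python
import math


def diagonal_only_factor(matrix):
    """Hypothetical flawed estimate: trusts the diagonal entries a and d only."""
    a, _b, _c, d, _e, _f = matrix
    return abs(a) if abs(a) == abs(d) else 1.0


def column_length_factor(matrix):
    """Length of the image of the x unit vector, which is (a, b)."""
    a, b, _c, _d, _e, _f = matrix
    return math.hypot(a, b)


rot90 = (0, 1, -1, 0, 0, 0)  # 90 degree rotation
flip = (0, 1, 1, 0, 0, 0)    # matrix from this report

for name, matrix in (("rot90", rot90), ("flip", flip)):
    flawed = 15 * diagonal_only_factor(matrix)
    robust = 15 * column_length_factor(matrix)
    print(name, flawed, robust)
```

For both matrices the diagonal-only estimate reports a width of 0, while the column-length estimate keeps it at 15.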

julian-smith-artifex-com commented 5 days ago

Fixed in 1.24.6.