pymupdf / PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
https://pymupdf.readthedocs.io
GNU Affero General Public License v3.0

Converting a PDF file to an image produces a black block #3624

Open Agoin-max opened 5 days ago

Agoin-max commented 5 days ago

Description of the bug

Converting a PDF file to an image produces a black block in the output. Attached file: 89b59dbfae5e4d1d92596418e9585a10.pdf

How to reproduce the bug

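import shutil
import tempfile
from pathlib import Path
from typing import List, Union

import fitz  # PyMuPDF
from PIL import Image

def pdf2png_with_pymupdf(pdf_data: Union[bytes, str], matrix: int = 2) -> List[Image.Image]:
    """Convert each page of a PDF to an image."""
    images: List[Image.Image] = []
    path_ = Path(tempfile.mkdtemp())

    try:
        if isinstance(pdf_data, bytes):
            # Write in-memory PDF bytes to a temporary file.
            pdf_path = str(path_.joinpath("mypdf.pdf"))
            with open(pdf_path, "wb") as fs:
                fs.write(pdf_data)
        else:
            pdf_path = pdf_data

        doc = fitz.open(pdf_path)
        for page_index in range(len(doc)):
            page = doc.load_page(page_index)
            # Scale the rendering by `matrix` in both directions.
            pix = page.get_pixmap(matrix=fitz.Matrix(matrix, matrix))  # type: ignore
            img = Image.frombytes("RGB", (pix.width, pix.height), pix.samples)  # type: ignore
            images.append(img)
        doc.close()
    finally:
        # Stands in for the reporter's delete_temp_directory helper, which is not shown.
        shutil.rmtree(path_, ignore_errors=True)

    return images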

This is my code.

PyMuPDF version

1.24.6

Operating system

Windows

Python version

3.9

JorjMcKie commented 5 days ago

This is an issue in MuPDF. I created an item in its tracker: https://bugs.ghostscript.com/show_bug.cgi?id=707845