emacs-eaf / eaf-pdf-viewer

Fastest PDF Viewer in Emacs
GNU General Public License v3.0

Certain PDFs failing to open with error "ValueError: rect not in mediabox" #63

Closed Jdogzz closed 2 years ago

Jdogzz commented 2 years ago

Describe the bug

I drag and drop a PDF such as this one (I can provide other examples which fail in the same way) into the emacs buffer:

mwb_E_202201.pdf

The buffer goes blank. I do not have this issue with other PDFs, such as this one:

https://arxiv.org/abs/2002.01247

Previously both of these PDFs loaded successfully in eaf-pdf-viewer. However, while trying out the latest pymupdf wheel (related to #62), certain PDFs like the first one stopped loading, while other PDFs like the second one still work as expected. The eaf buffer shows the following error when I drag and drop the first PDF:

Traceback (most recent call last):
  File "/home/mymailuser/emacs/emacs-application-framework/app/pdf-viewer/buffer.py", line 1158, in paintEvent
    qpixmap = self.get_page_pixmap(index, self.scale * hidpi_scale_factor, self.rotation)
  File "/home/mymailuser/emacs/emacs-application-framework/app/pdf-viewer/buffer.py", line 1079, in get_page_pixmap
    page = self.document[index]
  File "/home/mymailuser/emacs/emacs-application-framework/app/pdf-viewer/buffer.py", line 475, in __getitem__
    page = PdfPage(self.document[index], index, self.document.isPDF)
  File "/home/mymailuser/emacs/emacs-application-framework/app/pdf-viewer/buffer.py", line 584, in __init__
    self._page_rawdict = self._init_page_rawdict()
  File "/home/mymailuser/emacs/emacs-application-framework/app/pdf-viewer/buffer.py", line 599, in _init_page_rawdict
    set_page_crop_box(self.page)(self.clip)
  File "/home/mymailuser/.local/lib/python3.7/site-packages/fitz/fitz.py", line 6525, in set_cropbox
    return self._set_pagebox("CropBox", rect)
  File "/home/mymailuser/.local/lib/python3.7/site-packages/fitz/fitz.py", line 6520, in _set_pagebox
    raise ValueError("rect not in mediabox")
ValueError: rect not in mediabox

To Reproduce

  1. Download the first PDF I included above.
  2. Open emacs.
  3. Drag and drop the PDF into the emacs buffer.

Expected behavior

The eaf-pdf-viewer successfully opens the PDF for reading.

manateelazycat commented 2 years ago

@luhuaei Can you help fix this?

luhuaei commented 2 years ago

@luhuaei Can you help fix this ?

:ok_hand: I'll assign this to myself.

luhuaei commented 2 years ago

Previously both of these PDFs successfully loaded in eaf-pdf-viewer. However, in the process of trying out the latest pymupdf wheel (related to https://github.com/emacs-eaf/eaf-pdf-viewer/issues/62)

@Jdogzz So you mean that installing the pre-release pymupdf (the 1.19.6 wheel) caused the above error, while a previous pymupdf version loaded both PDFs successfully?

However, I can reproduce the above problem by opening mwb_E_202201.pdf with pymupdf-1.19.4.

>>> import fitz
>>> fitz.__doc__
'\nPyMuPDF 1.19.4: Python bindings for the MuPDF 1.19.0 library.\nVersion date: 2022-01-01 00:00:01.\nBuilt for Python 3.9 on linux (64-bit).\n'
>>> 

Maybe it's not a version issue.

Jdogzz commented 2 years ago

Thanks for taking a look at this.

That's correct, both PDFs worked in a previous version, and sadly I no longer know which version I had installed before.

With some testing I've been able to include more information. For posterity, repeating your python output I get this for fitz on my end after having installed the preview wheel:

Python 3.7.3 (default, Jan 22 2021, 20:04:44) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import fitz
>>> fitz.__doc__
'\nPyMuPDF 1.19.6: Python bindings for the MuPDF 1.19.0 library.\nVersion date: 2022-03-01 00:00:01.\nBuilt for Python 3.7 on linux (64-bit).\n'

I am able to reproduce the error I originally reported with the first PDF in a clean test environment, an Ubuntu 21.10 VM, with a freshly compiled copy of emacs from git and following the bare bones setup from the eaf README. This has occurred with the stable version of pymupdf. Here's the relevant info:

python3-pyqt5 5.15.4+dfsg-3 python3-dbus 1.2.16-5 python3-pyqt5.qtwebengine 5.15.4-1

GNU Emacs 29.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.30, cairo version 1.16.0) of 2022-02-23

Python 3.9.7 (default, Sep 10 2021, 14:59:43) 
[GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import fitz
>>> fitz.__doc__
'\nPyMuPDF 1.19.5: Python bindings for the MuPDF 1.19.0 library.\nVersion date: 2022-02-01 00:00:01.\nBuilt for Python 3.9 on linux (64-bit).\n'

Here is the debug log from the *eaf* buffer when I attempted to load the first PDF:

Traceback (most recent call last):
  File "/home/etest/emacs-application-framework/app/pdf-viewer/buffer.py", line 1158, in paintEvent
    qpixmap = self.get_page_pixmap(index, self.scale * hidpi_scale_factor, self.rotation)
  File "/home/etest/emacs-application-framework/app/pdf-viewer/buffer.py", line 1079, in get_page_pixmap
    page = self.document[index]
  File "/home/etest/emacs-application-framework/app/pdf-viewer/buffer.py", line 475, in __getitem__
    page = PdfPage(self.document[index], index, self.document.isPDF)
  File "/home/etest/emacs-application-framework/app/pdf-viewer/buffer.py", line 584, in __init__
    self._page_rawdict = self._init_page_rawdict()
  File "/home/etest/emacs-application-framework/app/pdf-viewer/buffer.py", line 599, in _init_page_rawdict
    set_page_crop_box(self.page)(self.clip)
  File "/home/etest/.local/lib/python3.9/site-packages/fitz/fitz.py", line 6414, in set_cropbox
    return self._set_pagebox("CropBox", rect)
  File "/home/etest/.local/lib/python3.9/site-packages/fitz/fitz.py", line 6409, in _set_pagebox
    raise ValueError("rect not in mediabox")
ValueError: rect not in mediabox
QBackingStore::endPaint() called with active painter; did you forget to destroy it or call QPainter::end() on it?
QPainter::end: Painter ended with 2 saved states
Traceback (most recent call last):
  File "/home/etest/emacs-application-framework/app/pdf-viewer/buffer.py", line 2031, in eventFilter
    self.check_annot()
  File "/home/etest/emacs-application-framework/app/pdf-viewer/buffer.py", line 1856, in check_annot
    page = self.document[page_index]
  File "/home/etest/emacs-application-framework/app/pdf-viewer/buffer.py", line 475, in __getitem__
    page = PdfPage(self.document[index], index, self.document.isPDF)
  File "/home/etest/emacs-application-framework/app/pdf-viewer/buffer.py", line 584, in __init__
    self._page_rawdict = self._init_page_rawdict()
  File "/home/etest/emacs-application-framework/app/pdf-viewer/buffer.py", line 599, in _init_page_rawdict
    set_page_crop_box(self.page)(self.clip)
  File "/home/etest/.local/lib/python3.9/site-packages/fitz/fitz.py", line 6414, in set_cropbox
    return self._set_pagebox("CropBox", rect)
  File "/home/etest/.local/lib/python3.9/site-packages/fitz/fitz.py", line 6409, in _set_pagebox
    raise ValueError("rect not in mediabox")
ValueError: rect not in mediabox
Saved session:  /home/etest/emacs-application-framework/app/pdf-viewer/buffer.py /home/etest/Downloads/mwb_E_202201.pdf 2087.6365062761474:1.4781746031746033:fit_to_width:True:0

As before, the first PDF does not render in the emacs buffer.

Jdogzz commented 2 years ago

As a small update, with some further testing I have found there was a regression in going from pymupdf 1.19.3 to 1.19.4. When using pymupdf 1.19.3 I am able to successfully open the first PDF in eaf-pdf-viewer, and when using pymupdf 1.19.4 and later versions I get the above error.
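For anyone who wants to check which side of that boundary their install is on, here is a minimal sketch that pulls the version out of a `fitz.__doc__`-style banner like the ones pasted earlier in this thread. The helper names `pymupdf_version` and `reported_as_failing` are hypothetical, and the 1.19.4 threshold only reflects the versions tested in this thread; the check has no upper bound, so it would still flag later releases that may already contain a fix.

```python
import re

# First version reported as failing in this thread (1.19.3 was reported to work).
FIRST_FAILING = (1, 19, 4)


def pymupdf_version(banner):
    """Extract (major, minor, patch) from a fitz.__doc__-style banner string."""
    match = re.search(r"PyMuPDF (\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("no PyMuPDF version found in banner")
    return tuple(int(part) for part in match.groups())


def reported_as_failing(banner):
    """True if the banner's version is at or above the first failing version."""
    return pymupdf_version(banner) >= FIRST_FAILING


if __name__ == "__main__":
    banner = (
        "\nPyMuPDF 1.19.5: Python bindings for the MuPDF 1.19.0 library.\n"
        "Version date: 2022-02-01 00:00:01.\n"
        "Built for Python 3.9 on linux (64-bit).\n"
    )
    print(pymupdf_version(banner), reported_as_failing(banner))
```

In a live session the banner would come from `fitz.__doc__` after `import fitz`, as in the outputs above.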

Jdogzz commented 2 years ago

With some more testing I was able to pinpoint the bug as an apparent mismatch between the cropbox and the mediabox used by pymupdf. I have opened a bug report here: https://github.com/pymupdf/PyMuPDF/issues/1615

Jdogzz commented 2 years ago

The PyMuPDF author corrected my understanding of the issue: it was related to the handling of non-integer box sizes, and that half of the problem was solved in https://github.com/pymupdf/PyMuPDF/issues/1616. The other half seems to be related to an unnecessary transformation, which I believe I have addressed in #65.
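To illustrate how a non-integer box size could trip a "rect not in mediabox" check, here is a minimal pure-Python sketch. It is not PyMuPDF's implementation and not the change made in #65; the helpers `rect_in_box` and `clamp_to_box` are hypothetical, rectangles are plain `(x0, y0, x1, y1)` tuples, and the numbers are an A4-sized example rather than values taken from the PDF in this issue.

```python
def rect_in_box(rect, box, eps=0.0):
    """Return True if rect lies inside box, allowing a tolerance of eps."""
    x0, y0, x1, y1 = rect
    bx0, by0, bx1, by1 = box
    return (
        x0 >= bx0 - eps
        and y0 >= by0 - eps
        and x1 <= bx1 + eps
        and y1 <= by1 + eps
    )


def clamp_to_box(rect, box):
    """Shrink rect so that no edge extends beyond box."""
    x0, y0, x1, y1 = rect
    bx0, by0, bx1, by1 = box
    return (max(x0, bx0), max(y0, by0), min(x1, bx1), min(y1, by1))


# Illustrative values only: an A4 page measured in points has a
# fractional size, so a clip rounded up to whole points sticks out.
mediabox = (0.0, 0.0, 595.276, 841.89)
clip = (0.0, 0.0, 596.0, 842.0)

if __name__ == "__main__":
    print(rect_in_box(clip, mediabox))                          # strict check fails
    print(rect_in_box(clamp_to_box(clip, mediabox), mediabox))  # clamped rect passes
```

Clamping (or comparing with a small tolerance) before setting a crop box is one way a caller could avoid the exception; whether that is appropriate depends on how the clip rectangle was computed in the first place.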

luhuaei commented 2 years ago

The PyMuPDF author corrected my understanding of the issue, that it was related to handling non-integer sizes of the boxes, and solved that half of the problem in pymupdf/PyMuPDF#1616 The other half seems to be related to an unnecessary transformation which I believe I have addressed in #65.

Thanks!

luhuaei commented 2 years ago

In my testing, this problem is already fixed, so I am closing this issue.