pmaupin / pdfrw

pdfrw is a pure Python library that reads and writes PDFs

Getting RecursionError while trying to compare 2 PDFs parsed with PdfReader #212

Open Lucas-C opened 3 years ago

Lucas-C commented 3 years ago

Minimal reproduction case:

$ wget -q https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf

$ cat repro.py
from pdfrw import PdfReader
pdf1 = PdfReader('dummy.pdf')
pdf2 = PdfReader('dummy.pdf')
assert pdf1 == pdf2

$ python repro.py
Traceback (most recent call last):
  File "repro.py", line 4, in <module>
    assert pdf1 == pdf2
RecursionError: maximum recursion depth exceeded in comparison

Being able to check that two PDFs are identical this way would be awesome! What do you think? Do you see why this RecursionError happens?
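
One plausible explanation, not confirmed against the pdfrw source: a PDF object graph contains cycles (each page refers to its /Parent, and the parent lists the page under /Kids), and Python's built-in `==` on dict-like containers recurses into values without tracking what it has already visited. The sketch below reproduces the same error with plain dicts and lists, then shows a cycle-aware comparison. `build_page_tree` and `cyclic_equal` are hypothetical helpers written for illustration; they are not part of pdfrw, and the sketch assumes the parsed objects behave like nested dicts and lists.

```python
def build_page_tree():
    """Build a tiny cyclic structure shaped like a PDF page tree."""
    pages = {"Type": "Pages", "Kids": []}
    page = {"Type": "Page", "Parent": pages}  # child points back to its parent
    pages["Kids"].append(page)                # parent lists the child: a cycle
    return pages


def cyclic_equal(a, b, _seen=None):
    """Structural equality for nested dicts/lists that may contain cycles.

    Pairs of containers already under comparison are assumed equal, so a
    cycle ends the recursion instead of descending forever.
    """
    if _seen is None:
        _seen = set()
    if a is b:
        return True
    pair = (id(a), id(b))
    if pair in _seen:
        return True
    if isinstance(a, dict) and isinstance(b, dict):
        _seen.add(pair)
        if a.keys() != b.keys():
            return False
        return all(cyclic_equal(a[k], b[k], _seen) for k in a)
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        _seen.add(pair)
        return len(a) == len(b) and all(
            cyclic_equal(x, y, _seen) for x, y in zip(a, b)
        )
    return a == b  # leaves: strings, numbers, and so on


tree1 = build_page_tree()
tree2 = build_page_tree()

try:
    tree1 == tree2  # built-in comparison follows Kids -> Parent -> Kids ...
except RecursionError as exc:
    print("plain == failed:", exc)

print("cyclic_equal:", cyclic_equal(tree1, tree2))
```

Comparing an object with itself (`tree1 == tree1`) does not fail, because Python short-circuits on identity; the error only appears when two separately built graphs are compared, as in the reproduction above with two `PdfReader` instances.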