pmaupin / pdfrw

pdfrw is a pure Python library that reads and writes PDFs

TestOnePdf.test_*_72eb207b8f882618899aa7a65d3cecda.pdf fail on py3.7+ #199

Open mgorny opened 4 years ago

mgorny commented 4 years ago

In addition to #197 and #198, the following tests fail on Python 3.7 and newer. This seems to be the same problem as #145, except that it happens with an included test file rather than with random broken input.

The underlying problem seems to be an uncaught exception from .next(), but I don't understand the code well enough to figure out what should happen when there is no next item to be yielded. Besides, there is one raise StopIteration elsewhere in the code that needs to be replaced with return, but I haven't submitted a PR since I have no clue how to fix this one.
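For context, the `raise StopIteration` to `return` replacement mentioned above follows from PEP 479, which is enabled by default from Python 3.7: a StopIteration that escapes a generator body is converted to RuntimeError. The sketch below is not pdfrw code; `pairs_old`, `pairs_new` and `demo` are hypothetical names used only to illustrate the behaviour.

```python
def pairs_old(items):
    """Pre-PEP 479 style: ends the generator by raising StopIteration."""
    for item in items:
        if item is None:
            raise StopIteration  # becomes RuntimeError on Python 3.7+
        yield item


def pairs_new(items):
    """Same logic, but ends the generator with a plain return."""
    for item in items:
        if item is None:
            return
        yield item


def demo():
    """Run both variants on the same input and report what happens."""
    data = [1, 2, None, 3]
    try:
        old = list(pairs_old(data))
    except RuntimeError as exc:
        old = "RuntimeError: %s" % exc
    return old, list(pairs_new(data))
```

On Python 3.7 and newer, `demo()` reports the same "generator raised StopIteration" message seen in the tracebacks for the first variant, while the second variant simply stops at the sentinel.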

_________________________________________ TestOnePdf.test_repaginate_72eb207b8f882618899aa7a65d3cecda.pdf _________________________________________

self = <tests.test_roundtrip.TestOnePdf testMethod=test_repaginate_72eb207b8f882618899aa7a65d3cecda.pdf>

    def test(self):
>       self.roundtrip(*args, **kw)

../../test_roundtrip.py:110: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../test_roundtrip.py:83: in roundtrip
    writer.write()
../../../pdfrw/pdfwriter.py:358: in write
    self.killobj, user_fmt=user_fmt)
../../../pdfrw/pdfwriter.py:196: in FormatObjects
    format_deferred()
../../../pdfrw/pdfwriter.py:164: in format_deferred
    objlist[index] = format_obj(obj)
../../../pdfrw/pdfwriter.py:145: in format_obj
    myarray.append(add(value))
../../../pdfrw/pdfwriter.py:83: in add
    result = format_obj(obj)
../../../pdfrw/pdfwriter.py:145: in format_obj
    myarray.append(add(value))
../../../pdfrw/pdfwriter.py:83: in add
    result = format_obj(obj)
../../../pdfrw/pdfwriter.py:141: in format_obj
    for (x, y) in obj.iteritems())
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

.0 = <generator object PdfDict.iteritems at 0x7f379e403250>

>   pairs = sorted((getattr(x, 'encoded', None) or x, y)
                   for (x, y) in obj.iteritems())
E   RuntimeError: generator raised StopIteration

../../../pdfrw/pdfwriter.py:140: RuntimeError
-------------------------------------------------------------- Captured stderr call ---------------------------------------------------------------
[ERROR] tokens.py:226 stream /Length attribute (270435) appears to be too big (size 250435) -- adjusting (line=20035, col=1)
---------------------------------------------------------------- Captured log call ----------------------------------------------------------------
ERROR    pdfrw:tokens.py:226 stream /Length attribute (270435) appears to be too big (size 250435) -- adjusting (line=20035, col=1)
___________________________________________ TestOnePdf.test_simple_72eb207b8f882618899aa7a65d3cecda.pdf ___________________________________________

self = <tests.test_roundtrip.TestOnePdf testMethod=test_simple_72eb207b8f882618899aa7a65d3cecda.pdf>

    def test(self):
>       self.roundtrip(*args, **kw)

../../test_roundtrip.py:110: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../test_roundtrip.py:83: in roundtrip
    writer.write()
../../../pdfrw/pdfwriter.py:358: in write
    self.killobj, user_fmt=user_fmt)
../../../pdfrw/pdfwriter.py:196: in FormatObjects
    format_deferred()
../../../pdfrw/pdfwriter.py:164: in format_deferred
    objlist[index] = format_obj(obj)
../../../pdfrw/pdfwriter.py:145: in format_obj
    myarray.append(add(value))
../../../pdfrw/pdfwriter.py:83: in add
    result = format_obj(obj)
../../../pdfrw/pdfwriter.py:145: in format_obj
    myarray.append(add(value))
../../../pdfrw/pdfwriter.py:83: in add
    result = format_obj(obj)
../../../pdfrw/pdfwriter.py:141: in format_obj
    for (x, y) in obj.iteritems())
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

.0 = <generator object PdfDict.iteritems at 0x7f379ecb4350>

>   pairs = sorted((getattr(x, 'encoded', None) or x, y)
                   for (x, y) in obj.iteritems())
E   RuntimeError: generator raised StopIteration

../../../pdfrw/pdfwriter.py:140: RuntimeError
-------------------------------------------------------------- Captured stderr call ---------------------------------------------------------------
[ERROR] tokens.py:226 stream /Length attribute (270435) appears to be too big (size 250435) -- adjusting (line=20035, col=1)
---------------------------------------------------------------- Captured log call ----------------------------------------------------------------
ERROR    pdfrw:tokens.py:226 stream /Length attribute (270435) appears to be too big (size 250435) -- adjusting (line=20035, col=1)
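
Both tracebacks have the same shape: `sorted()` consumes a generator expression over `obj.iteritems()`, and the RuntimeError originates inside that generator. A minimal sketch of how an unguarded `next()` call can produce this, assuming a generator that pulls from a second iterator, is shown below. The functions are hypothetical stand-ins, not the actual `PdfDict.iteritems` implementation, and ending the iteration silently is only one possible policy; what pdfrw should do when there is no next item is exactly the open question in this report.

```python
def iteritems_unguarded(keys, values):
    """Hypothetical generator that pulls each value from a second iterator."""
    values = iter(values)
    for key in keys:
        # next() raises StopIteration once values is exhausted; inside a
        # generator this surfaces as RuntimeError on Python 3.7+.
        yield key, next(values)


def iteritems_guarded(keys, values):
    """Same generator, with the exhausted case handled explicitly."""
    values = iter(values)
    for key in keys:
        try:
            value = next(values)
        except StopIteration:
            return  # assumption: a missing value means "stop iterating"
        yield key, value


def sorted_pairs(iteritems, keys, values):
    """Mimic the failing call site: sorted() over a generator expression."""
    try:
        return sorted((k, v) for (k, v) in iteritems(keys, values))
    except RuntimeError as exc:
        return str(exc)
```

Calling `sorted_pairs(iteritems_unguarded, ["b", "a", "c"], [2, 1])` on Python 3.7+ returns the error message instead of a list, while the guarded variant returns the pairs that were available before the values ran out.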