py-pdf / pypdf

A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
https://pypdf.readthedocs.io/en/latest/

BUG: infinite loop on damaged pdf file #2877

Closed pubpub-zz closed 1 month ago

pubpub-zz commented 1 month ago
(for issue creation/tracking)

Note: with the latest dev build, an infinite loop occurs when getting the number of pages of https://github.com/user-attachments/files/17162634/7f40cb209fb97d1782bffcefc5e7be40.pdf

Originally posted by @pubpub-zz in https://github.com/py-pdf/pypdf/issues/2876#issuecomment-2379649591

pubpub-zz commented 1 month ago

The PDF contains some dictionaries where the << and >> delimiters are not balanced. This creates an infinite loop when parsing the file.
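
To illustrate what "not balanced" means here, a minimal sketch follows. It is not pypdf's parser and not the fix that was applied; `dict_delimiters_balanced` is a hypothetical helper that only counts `<<` and `>>` in a raw byte string and ignores strings, streams and comments, which a real PDF parser has to skip. The point it shows is that the scan index advances on every iteration, so the loop terminates even on damaged input.

```python
def dict_delimiters_balanced(data: bytes) -> bool:
    """Return True if every '<<' in data has a matching '>>'.

    Toy check (hypothetical helper, not part of pypdf): it does not
    understand PDF strings, streams or comments.
    """
    depth = 0
    i = 0
    while i < len(data):
        pair = data[i:i + 2]
        if pair == b"<<":
            depth += 1
            i += 2
        elif pair == b">>":
            depth -= 1
            if depth < 0:
                # A closing delimiter with no opening one.
                return False
            i += 2
        else:
            # Always move forward, so damaged input cannot stall the loop.
            i += 1
    # Any remaining depth means an opening '<<' was never closed.
    return depth == 0


if __name__ == "__main__":
    print(dict_delimiters_balanced(b"<< /Type /Page >>"))
    print(dict_delimiters_balanced(b"<< /Type /Page << /A 1 >>"))
```

A parser that waits for a closing `>>` without such a forward-progress guarantee, or without an end-of-data check, is one plausible way for a file like the one above to hang it.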

stefan6419846 commented 1 month ago

How is this different from #2876?

pubpub-zz commented 1 month ago

From my understanding, #2876 deals with memory. Here, the infinite loop has been generated because of #2872. Maybe in the end the fix will cover both.