Closed JaMe76 closed 1 month ago
Can you please update to the latest version and confirm the results?
Yes, same results after upgrading to pypdf 5.0.0
When getting the number of pages, pypdf needs to load and expand objects (the pages and the objects below them) in memory. Why do you consider this a memory overflow?
(For issue creation/tracking) Note: with the latest dev build, an infinite loop occurs when getting the number of pages: https://github.com/user-attachments/files/17162634/7f40cb209fb97d1782bffcefc5e7be40.pdf
Thanks for now.
This was the termination message of the process after trying to open the file for about three hours.
As this might be related to #2877, I will check if the latest fix will also close this one here.
This sounds like the same issue, and my fix will work. See here for how to test the latest dev version: https://pypdf.readthedocs.io/en/stable/dev/testing.html#evaluate-a-pr-in-progress-version
Fix works: Thx again.
When running the code with the PDF that you can find in the attachment, I observe a consistent memory increase. This increase only stops once the process terminates.
I was able to trace the point where the memory consumption starts; you can find it in the screenshot below. I even observe a further increase in consumption AFTER this step in debug mode.
N.B. The PDF seems to be corrupted after p.27.
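For anyone who wants to check the memory growth without a debugger, here is a rough sketch using the standard-library `tracemalloc` module. The helper name `top_allocations` is made up for illustration and is not part of pypdf. It only reports once the wrapped call returns, so it will not help while a call is stuck in an endless loop.

```python
import tracemalloc


def top_allocations(func, limit=5):
    """Run func() and return (result, peak_bytes, top allocation sites).

    Hypothetical helper for illustration; not part of pypdf.
    """
    tracemalloc.start()
    try:
        result = func()
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
        # Group the live allocations by source line, largest first.
        stats = tracemalloc.take_snapshot().statistics("lineno")[:limit]
    finally:
        tracemalloc.stop()
    return result, peak, stats
```

Passing a callable that opens the PDF and counts its pages would then show which source lines hold the most memory, assuming the call finishes at all.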
Environment
Which environment were you using when you encountered the problem?
Code + PDF
This is a minimal, complete example that shows the issue:
7f40cb209fb97d1782bffcefc5e7be40.pdf
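The original snippet is not reproduced in this thread. As a sketch only, and assuming the problem is triggered simply by asking for the page count as discussed in the comments, a reproduction could look like this. The function name `count_pages` is an assumption for illustration.

```python
def count_pages(path):
    """Open a PDF with pypdf and return its number of pages.

    Illustrative sketch; the reporter's actual code may differ.
    """
    # Imported lazily so the module can be loaded without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(path)
    # Counting pages makes pypdf walk and load the page tree.
    return len(reader.pages)
```

Calling `count_pages("7f40cb209fb97d1782bffcefc5e7be40.pdf")` with the attached file is the step that, according to the reports here, did not terminate before the fix.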
Traceback