ragebear00 closed this issue 1 month ago
Incremental saves are not always possible: a number of situations will prevent this. Instead of using try / except, you can check doc.can_save_incrementally() and only do an incremental save if True is returned.
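The check described above can be sketched as a small helper. This is only an illustration under stated assumptions: the name `save_in_place`, the `.tmp` suffix and the return strings are made up for this example, and `doc` is assumed to be an open PyMuPDF document (only `can_save_incrementally()`, `saveIncr()`, `save()`, `close()` and `name` are used).

```python
import os


def save_in_place(doc, garbage=3, deflate=True):
    """Save `doc` back to its own file, incrementally only if allowed.

    `doc` is assumed to be an open PyMuPDF document. The helper name and
    the ".tmp" suffix are hypothetical, not part of PyMuPDF.
    Returns "incremental" or "full" to show which path was taken.
    """
    path = doc.name

    # Ask first instead of catching the RuntimeError from saveIncr().
    if doc.can_save_incrementally():
        doc.saveIncr()
        doc.close()
        return "incremental"

    # Incremental save is not possible (e.g. the file was repaired):
    # write a full copy next to the original, then swap it in.
    tmp_path = path + ".tmp"
    doc.save(tmp_path, garbage=garbage, deflate=deflate)
    doc.close()
    os.replace(tmp_path, path)
    return "full"
```

Because the check runs immediately before saving, it also covers the case where an earlier call such as page.get_text() has triggered a repair in the meantime.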
The following script works flawlessly:
    >>> import pymupdf as fitz
    >>> doc = fitz.open("1.-.Copy.pdf")
    >>> doc.can_save_incrementally()
    1
    >>> doc.is_repaired
    False
    >>> for page in doc:
    ...     _ = page.get_text()
    ...     print(f"Processed {page.number=}")
    ...
    MuPDF error: format error: object (44 0 R) was not found in its object stream
    MuPDF error: format error: object (35 0 R) was not found in its object stream
    Processed page.number=0
    Processed page.number=1
    Processed page.number=2
    Processed page.number=3
    Processed page.number=4
    Processed page.number=5
    Processed page.number=6
    Processed page.number=7
    >>> doc.is_repaired
    True
    >>> doc.can_save_incrementally()
    0
    >>> doc.ez_save("temp.pdf")
    >>> doc.close()
    >>> import os
    >>> os.remove(doc.name)
After a failing incremental save, an additional reference to the file handle is apparently created and is not released when the document is closed. This happens on Windows only; there is no problem on Linux, at least.
We will look into this. In the meantime, please use doc = None or del doc after closing. This will trigger an additional reference count reduction and the removal will succeed.
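A minimal sketch of that workaround, applied to the try / except pattern from the report, might look like the following. The helper name `save_with_fallback`, the `.tmp` suffix and the `opener` parameter are assumptions made for illustration; `opener` stands for any callable that returns a PyMuPDF document, such as `pymupdf.open`. Whether the extra `doc = None` is still needed depends on the PyMuPDF version and platform.

```python
import os


def save_with_fallback(path, opener):
    """Try an incremental save; on failure do a full save and swap files.

    `opener` is a callable returning a PyMuPDF-like document (for example
    pymupdf.open). The helper name and ".tmp" suffix are hypothetical.
    Returns "incremental" or "full".
    """
    tmp_path = path + ".tmp"
    doc = opener(path)
    try:
        doc.saveIncr()
        result = "incremental"
    except RuntimeError:
        # e.g. "Can't do incremental writes on a repaired file"
        doc.save(tmp_path, deflate=True, garbage=3)
        result = "full"

    doc.close()
    # Workaround from the comment above: drop the last Python reference
    # so the lingering reference on the file handle is released.
    doc = None

    if result == "full":
        os.remove(path)
        os.rename(tmp_path, path)
    return result
```

Doing the remove and rename after the except block, rather than inside it, is a design choice of this sketch: the document is fully closed and dereferenced before the file is touched.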
Many thanks! As you showed at the beginning, doc.can_save_incrementally() returns 1 before get_text(). I just did not realize that get_text() would trigger a repair, after which doc.can_save_incrementally() returns false. I will check it before every saveIncr() in the future.
Description of the bug
The PDF cannot be saved incrementally after get_text(), and the file stays locked until the Python process is closed entirely.
The error occurs in 1.23.26 and 1.24.4.
The errors are:
    Traceback (most recent call last):
      File "C:/_a/test.py", line 11, in <module>
        doc.saveIncr()
      File "C:\Users\x\AppData\Local\Programs\Python\Python310\lib\site-packages\fitz\__init__.py", line 5380, in saveIncr
        return self.save(self.name, incremental=True, encryption=mupdf.PDF_ENCRYPT_KEEP)
      File "C:\Users\x\AppData\Local\Programs\Python\Python310\lib\site-packages\fitz\__init__.py", line 5301, in save
        return extra.Document_save(
      File "C:\Users\x\AppData\Local\Programs\Python\Python310\lib\site-packages\fitz\extra.py", line 120, in Document_save
        return _extra.Document_save(*args)
    RuntimeError: code=2: Can't do incremental writes on a repaired file

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "C:/_a/test.py", line 15, in <module>
        os.remove(path)
    PermissionError: [WinError 32] The process cannot access the file because it is being used by anoth
How to reproduce the bug
Run the code below with the uploaded PDF:
    import os
    import fitz

    path = r"C:\_a\issue\1 - Copy.pdf"
    doc = fitz.open(path)

    for page in doc:
        print(page.get_text())

    try:
        doc.saveIncr()
    except:
        doc.save(path+'temp.pdf', deflate=True, garbage=3)
        doc.close()
        os.remove(path)
        os.rename(path+'temp.pdf', path)

1 - Copy.pdf
PyMuPDF version
1.24.4
Operating system
Windows
Python version
3.10