Closed: dameyerdave closed this 1 week ago
Please provide all mandatory information - in this case, the reproducing file is missing.
I'm sorry for that. These are the files:
Thanks for the examples. Sorry, I cannot find a problem. I made a redaction to remove "David Meyer" and it simply worked!
for r in page.search_for("david meyer"):
    page.add_redact_annot(r)
'Redact' annotation on page 0 of original.pdf
page.apply_redactions(0,0,0)
True
doc.ez_save("x-1.24.2.pdf")
In the meantime, I also redacted other parts of the page (the text "October 19, 2023"), and they also worked without complaints.
Hi @JorjMcKie, I have faced the same issue when applying redactions: they remove images that should not be removed, or change text. test.pdf test2.pdf
I have attached both PDFs to reproduce the issue.
Code:
import fitz
from pathlib import Path

file_path=Path(r"test_pages/test.pdf")
doc=fitz.open(file_path)
page=doc[0]
blocks=page.get_text("rawdict",flags=fitz.TEXTFLAGS_TEXT,sort=True)["blocks"]
# Set colour for output PDF
Red = fitz.pdfcolor["red"]
for b in blocks:
    for l in b["lines"]:
        for s in l["spans"]:
            for c in s["chars"]:
                if s["size"]>15 and s['color']==2236191:
                    if c['c']== "ं":
                        try:
                            font = fitz.Font(fontname=s['font'],fontfile=f"{s['font']}.ttf") # this must be known somehow - or simply try some font else
                        except Exception as e:
                            print(str(e))
                        redact_box = fitz.Rect(c["bbox"])
                        origin_text = fitz.Point(c["origin"])
                        redact_box.y1 = redact_box.y1-s['size']
                        page.add_redact_annot(redact_box)
                        # Apply redactions after all text replacements
                        page.apply_redactions(images=fitz.PDF_REDACT_IMAGE_NONE,graphics=fitz.PDF_REDACT_LINE_ART_NONE)
                        # Create text writer to write on the page with the chosen colour
                        tw = fitz.TextWriter(page.rect,color=Red)
                        # re-insert same text - different color
                        tw.append((origin_text.x,origin_text.y), text=c['c'],fontsize=s['size'],font=font)
                        tw.write_text(page)
# Save backup file for future use
out_fpath="OUT/"+file_path.stem+".pdf"
doc.save(out_fpath,garbage=3, deflate=True)
doc.close()
PyMuPDF version 1.24.2
Operating system Windows
Python version 3.11.4
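For comparison, here is one possible restructuring of the snippet above: collect every matching character first, apply the redactions once, then re-insert the text in one pass. This is only a sketch under assumptions, not a confirmed fix for the reported image loss. The helper names `collect_targets` and `recolor_chars` and the `fontfile` argument are hypothetical; the PyMuPDF calls are the ones already used in the snippet above.

```python
def collect_targets(blocks, char, min_size, color):
    """Return one record per matching character in a "rawdict" block list.

    A character matches when it equals `char` and its span is larger than
    `min_size` and has the integer sRGB colour `color`.
    """
    targets = []
    for block in blocks:
        # Image blocks carry no "lines" key, so skip them safely.
        for line in block.get("lines", []):
            for span in line["spans"]:
                if span["size"] <= min_size or span["color"] != color:
                    continue
                for ch in span["chars"]:
                    if ch["c"] == char:
                        targets.append({
                            "bbox": ch["bbox"],
                            "origin": ch["origin"],
                            "size": span["size"],
                            "font": span["font"],
                        })
    return targets


def recolor_chars(src, dst, char, min_size, color, fontfile):
    """Redact all matching characters on page 0 once, then re-insert them in red."""
    import fitz  # PyMuPDF; imported lazily so collect_targets works without it

    doc = fitz.open(src)
    page = doc[0]
    blocks = page.get_text("rawdict", flags=fitz.TEXTFLAGS_TEXT)["blocks"]
    targets = collect_targets(blocks, char, min_size, color)

    # Step 1: add every redaction annotation before applying anything.
    for t in targets:
        page.add_redact_annot(fitz.Rect(t["bbox"]))
    # Step 2: apply once, asking PyMuPDF to leave images and line art alone.
    page.apply_redactions(images=fitz.PDF_REDACT_IMAGE_NONE,
                          graphics=fitz.PDF_REDACT_LINE_ART_NONE)

    # Step 3: re-insert the characters with a single TextWriter.
    font = fitz.Font(fontfile=fontfile)  # assumption: caller supplies a suitable font file
    tw = fitz.TextWriter(page.rect, color=fitz.pdfcolor["red"])
    for t in targets:
        tw.append(fitz.Point(t["origin"]), char, font=font, fontsize=t["size"])
    tw.write_text(page)

    doc.save(dst, garbage=3, deflate=True)
    doc.close()
    return len(targets)
```

Keeping the search step in a plain function makes it easy to check which characters would be redacted before any page content is touched.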
@aleem75321 please submit this as a different issue - this is too confusing in this context. When you do, please save the PDF when you have inserted all redactions - before applying them. I need to confirm where your code has put them - without the need to understand your code. Then attach this PDF to confirm that bad things happen on applying redactions.
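The request above (save the PDF once all redaction annotations are in place, then apply them) could look roughly like this. It is a minimal sketch, not code from this thread: the function names `mark_redactions` and `save_marked_then_redacted` and all file paths are hypothetical, and the search string is only an example.

```python
def mark_redactions(page, needle):
    """Add a redaction annotation for every hit of `needle`; return the hit rectangles."""
    hits = page.search_for(needle)
    for rect in hits:
        page.add_redact_annot(rect)
    return hits


def save_marked_then_redacted(src, needle, marked_out, redacted_out):
    """Write two files: one with annotations only, one with redactions applied."""
    import fitz  # PyMuPDF; imported lazily so mark_redactions can be used without it

    doc = fitz.open(src)
    total = sum(len(mark_redactions(page, needle)) for page in doc)

    # First save: redaction annotations are present but not yet applied.
    # This is the file that shows where the code has put them.
    doc.save(marked_out)

    # Second save: the same document after the redactions have been applied.
    for page in doc:
        page.apply_redactions()
    doc.save(redacted_out)
    doc.close()
    return total
```

Attaching the first output file to a report lets others confirm the annotation positions without having to read the code that produced them.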
I have submitted a separate issue; please see the link below.
Facing Issues after applying redactions they delete some Images or Icons #3439
I reduced the application to the bare minimum and still encounter the same issue. I tried it on a Mac M3 and on Ubuntu Linux (Intel), as well as in a Docker container with platform: linux/amd64, without success.
import fitz

doc = fitz.open("./original.pdf")
for page in doc:
    for r in page.search_for("David Meyer"):
        page.add_redact_annot(r)
    page.apply_redactions(0, 0)
doc.ez_save("redacted.pdf")
With the following files:
I don't know what to try now... If you have another good idea, please let me know...
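One way to turn "text gets lost" into something others can compare across machines is to diff the extracted page text before and after applying redactions. The following is a rough sketch under assumptions: the names `text_loss_report` and `check_redaction` are hypothetical, the comparison is case-insensitive and word-based, and it only looks at text (not graphics or images).

```python
from collections import Counter


def text_loss_report(before, after, needle):
    """Compare page text before and after redaction (case-insensitive).

    Returns whether `needle` is still present and which words, other than
    those belonging to the needle, disappeared.
    """
    before, after, needle = before.lower(), after.lower(), needle.lower()
    hits = before.count(needle)

    # Words we expect to survive: everything except the needle's own words.
    expected = Counter(before.split())
    expected.subtract({w: n * hits for w, n in Counter(needle.split()).items()})

    # Unary plus drops zero and negative counts before the comparison.
    lost = (+expected) - Counter(after.split())
    return {
        "needle_still_present": needle in after,
        "unexpectedly_lost": sorted(lost.elements()),
    }


def check_redaction(src, needle):
    """Redact `needle` on every page of `src` and report text loss per page."""
    import fitz  # PyMuPDF; imported lazily so text_loss_report works without it

    doc = fitz.open(src)
    reports = []
    for page in doc:
        before = page.get_text()
        for rect in page.search_for(needle):
            page.add_redact_annot(rect)
        page.apply_redactions()
        reports.append(text_loss_report(before, page.get_text(), needle))
    doc.close()
    return reports
```

An empty `unexpectedly_lost` list on every page would suggest that only the searched text was removed; a long list would match the symptom described in this issue. Words that share a bounding box with the needle may show up as lost even when the redaction behaves as designed.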
@dameyerdave we (a colleague of mine and I) have now tried on all 3 platforms (Mac, Linux, Windows) with fitz.version=('1.24.2', '1.24.1', '20240417000001') and are getting the correct, flawless result.
🤷♂️
That is: no black rectangle, and "David Meyer" removed completely.
My only advice is to re-install 1.24.2. There has been a redaction issue previously. I will try with 1 or 2 previous versions.
No such luck: at least on Windows, all versions back to 1.23.26 work correctly. So your best option is probably to re-install the latest version.
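Before re-installing, it may help to confirm which PyMuPDF version the environment actually provides. A small sketch using only the standard library follows; the function names are hypothetical, "PyMuPDF" is assumed to be the installed distribution name, and the parser assumes plain dotted numeric versions such as the ones quoted in this thread.

```python
from importlib import metadata


def installed_pymupdf_version():
    """Return the installed PyMuPDF version string, or None if it is not installed."""
    try:
        return metadata.version("PyMuPDF")
    except metadata.PackageNotFoundError:
        return None


def parse_version(text):
    """Turn a dotted numeric version such as "1.24.2" into a comparable tuple."""
    # Assumption: no pre-release suffixes (for example "rc1") are present.
    return tuple(int(part) for part in text.split("."))


version = installed_pymupdf_version()
if version is None:
    print("PyMuPDF is not installed in this environment")
else:
    # Inside Python, fitz.version reports the same information as a tuple,
    # as quoted earlier in this thread.
    print("PyMuPDF", version)
```

Running this inside the same virtual environment or container that produces the broken output rules out a stale or shadowed installation.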
We are facing exactly the same problem as everybody else reporting the bug in this thread. The version in our environment is PyMuPDF 1.24.0.
I tried removing the images=0 argument from apply_redactions(images=0) and also tried every possible combination for that parameter. I also tried removing the garbage and deflate options when saving.
Exactly the same error as other people:
Original PDF before redaction
After apply_redactions() on the text "Origin"
We would love to know if you are aware of this bug, and if there is a stable version that works properly without this bug. Thanks a lot!
Another example. I have now tested 3 versions: 1.24.0 and 1.24.2 fail, while 1.23.26 works well (redaction works).
Original before redaction:
After text redaction with 1.24.0 and 1.24.2:
After text redaction with 1.23.26 (working!):
@luchux - "A picture is worth a thousand words."
Certainly true. But a thousand pictures are not worth a million words! Please add an example file, and no more pictures, if you want us to confirm that yours is another duplicate of #3376.
Please also note that the problem in this post is not yet reproducible, so it is unclear whether it is a bug at all.
Closing this because the requested information has been missing for a long time.
Description of the bug
As soon as I apply the redactions, all the text and graphics get lost from the PDF.
Source:
Annotated:
After apply_redactions():
How to reproduce the bug
This is the code I wrote to come to this:
PyMuPDF version
1.24.2
Operating system
MacOS
Python version
3.10