JoshData / pdf-redactor

A general purpose PDF text-layer redaction tool for Python 2/3.
Creative Commons Zero v1.0 Universal

Overlapping of Text #20

Open kapilnakra opened 5 years ago

kapilnakra commented 5 years ago

Hi Joshua,

I am facing an issue while using pdf-redactor.

I am replacing the word "GENIUS" with "wonderful" in a PDF, using example.py for this purpose.

Issues:

  1. "wonderful" overlaps the next word.
  2. Where "wonderful" itself does not overlap anything, the following word gets overlapped instead. You can see this in the second paragraph.
  3. If a line becomes too long, the extra words do not wrap to the next line.

Original Text in PDF image

Redacted File image

I would be really grateful if you can help me in this regard. Looking forward to hearing from you.

Thanks, Kapil Nakra

JoshData commented 5 years ago

These sorts of things happen when the PDF was generated by software that pre-computes the text layout within each line, rather than letting the PDF viewer handle layout. This library can't do anything about that, because it isn't aware of the layout of the text it is changing.

For that reason, it's safer to choose replacement text that is likely to be shorter than the original.
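
As a rough illustration of that advice, here is a minimal sketch of a replacement callback that never returns text longer than what it replaces. The helper name `fit_replacement` and the candidate list are hypothetical and not part of pdf-redactor. Character count is only a proxy for rendered width, so this reduces the chance of overlap rather than guaranteeing it. The callback has the usual `re.sub` shape (match object in, string out), which is assumed to be what the filter functions in example.py expect.

```python
import re


def fit_replacement(candidates):
    """Build a re.sub-style callback (hypothetical helper, not a library API).

    The callback returns the first candidate that is no longer than the
    matched text. If none fits, it truncates the shortest candidate.
    """
    def _replace(match):
        limit = len(match.group(0))
        for candidate in candidates:
            if len(candidate) <= limit:
                return candidate
        # Nothing fits: fall back to a truncated form of the shortest option.
        return min(candidates, key=len)[:limit]
    return _replace


pattern = re.compile(r"GENIUS")
replace = fit_replacement(["wonderful", "great"])

# "wonderful" (9 characters) is longer than "GENIUS" (6), so "great" is used.
print(pattern.sub(replace, "A GENIUS idea"))  # prints "A great idea"
```

A shorter replacement can still leave a visible gap before the next word, since the following text stays where the original layout put it.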