jorisschellekens / borb

borb is a library for reading, creating and manipulating PDF files in python.
https://borbpdf.com/

Does Borb have a find and replace text feature #122

Closed akgv04 closed 2 years ago

akgv04 commented 2 years ago

I want to be able to edit a PDF file. Essentially, I want to search and replace text in PDF files. Does borb support this feature?

jorisschellekens commented 2 years ago

Currently, borb does not support this feature and, to the best of my knowledge, neither does any other PDF library.

The problem here is that a PDF document, in its minimal form, does not contain any structure information. Or, to put it simply, the letters in a paragraph don't know they belong to the same paragraph.
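To make that concrete, here is a rough sketch of what the text in a page's content stream can look like. The stream below is a made-up example (the font name, coordinates and the way the text is split are all assumptions), but `BT`, `ET`, `Tf`, `Td` and `Tj` are the standard PDF text operators. Nothing in it marks a word or a paragraph, so a naive search over the individual text fragments can miss a word that is plainly visible on the page.

```python
import re

# Hypothetical content stream: "Hello world" drawn as three separate
# text-showing operations. The split points are arbitrary on purpose.
content_stream = """
BT
/F1 12 Tf
72 700 Td
(Hel) Tj
18 0 Td
(lo wor) Tj
36 0 Td
(ld) Tj
ET
"""

# Collect the string operand of every Tj (show text) operator.
fragments = re.findall(r"\((.*?)\)\s*Tj", content_stream)

# No single fragment contains the word we are looking for...
found_in_fragment = any("Hello" in fragment for fragment in fragments)

# ...even though the rendered line clearly does.
rendered_line = "".join(fragments)
found_in_line = "Hello" in rendered_line
```

So before any replacing can happen, a tool first has to guess which fragments belong together, using only their positions on the page.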

That means when you're trying to do a find/replace, you run into the following issue:

And if you're replacing a short word with a longer one, the problem becomes even worse.

Then you really need extra space, and you might need to move around more than just one paragraph.
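A toy model of the space problem, not borb code: the `Chunk` class, the fixed character width and the coordinates are all assumptions made for illustration. Each chunk of text sits at a fixed x position, so swapping in a longer word pushes its right edge into the chunk that follows it.

```python
from dataclasses import dataclass

CHAR_WIDTH = 6.0  # assumed fixed advance per character, in points


@dataclass
class Chunk:
    """A piece of text placed at an absolute x position on a line."""
    x: float
    text: str

    @property
    def right_edge(self) -> float:
        return self.x + len(self.text) * CHAR_WIDTH


def overlap_after_replace(line: list[Chunk], index: int, new_text: str) -> float:
    """Points by which the replaced chunk would run into its right neighbour."""
    replaced = Chunk(line[index].x, new_text)
    if index + 1 >= len(line):
        return 0.0  # last chunk on the line: nothing to collide with here
    return max(0.0, replaced.right_edge - line[index + 1].x)


line = [Chunk(72.0, "The"), Chunk(96.0, "cat"), Chunk(120.0, "sat")]

overlap_same_length = overlap_after_replace(line, 1, "dog")
overlap_longer = overlap_after_replace(line, 1, "elephant")
```

With these numbers, "dog" fits in the slot left by "cat", while "elephant" overlaps the next chunk by 24 points. Fixing that means shifting every chunk after it, and once the line is full, the lines and paragraphs below it as well.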