Open fletchy95 opened 4 years ago
If I had to guess, it’s because your PDFs are in a different format from the ones I was using this program on. There was an iteration of this code on my machine that used pdftotext, but it didn’t work for my PDFs, so I assumed it would not work for any PDFs.
I will look into it this weekend and see if I can implement a way to toggle between pdftotext and PyPDF2.
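For reference, a toggle like that could be sketched roughly as below. This is only an illustration of the idea, not code from this repo: `extract_pages` and the two `extract_with_*` helpers are hypothetical names, the PyPDF2 calls assume the 2.x API (`PdfReader`, `extract_text`), and it only covers text extraction, not the redaction step itself.

```python
from typing import Callable, List, Sequence

# A backend takes a file path and returns one string of text per page.
Extractor = Callable[[str], List[str]]


def extract_with_pypdf2(path: str) -> List[str]:
    # Imported lazily so this module still loads if PyPDF2 is not installed.
    # Assumes the PyPDF2 2.x naming (PdfReader / extract_text).
    from PyPDF2 import PdfReader

    reader = PdfReader(path)
    return [page.extract_text() or "" for page in reader.pages]


def extract_with_pdftotext(path: str) -> List[str]:
    # Imported lazily; the pdftotext package needs Poppler installed.
    import pdftotext

    with open(path, "rb") as f:
        return list(pdftotext.PDF(f))


def extract_pages(path: str, backends: Sequence[Extractor]) -> List[str]:
    """Try each backend in order and return the first non-empty result."""
    errors = []
    for backend in backends:
        name = getattr(backend, "__name__", repr(backend))
        try:
            pages = backend(path)
        except Exception as exc:  # a backend may fail in many ways on odd PDFs
            errors.append(f"{name}: {exc}")
            continue
        if any(page.strip() for page in pages):
            return pages
        errors.append(f"{name}: no text extracted")
    raise RuntimeError("all backends failed: " + "; ".join(errors))
```

Usage would be something like `extract_pages("input.pdf", [extract_with_pypdf2, extract_with_pdftotext])`, with the order of the list acting as the toggle. Passing the backends in as plain functions keeps either library optional.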
The way it currently works, it superimposes a redaction onto an existing PDF and then flattens out the underlying data. To use pdftotext, you'd have to know exactly which text strings you're redacting from the existing PDF, and then create a new PDF identical to the original with those strings removed.
Can you explain a little more about your use case and how you're using pdftotext?
I sometimes run into the issue of PyPDF2 not working with certain PDFs that pdftotext handles fine. Are there any plans to have this library run pdftotext instead, or to offer it as an option?