LeoFCardoso / pdf2pdfocr

A free tool to OCR a PDF and add a text "layer" in the original file, making a searchable PDF. Use only open source tools. Please tip!
Apache License 2.0

Specify Output Folder using pdf2pdfocr.vbs #14

Closed: der-klabauter closed this issue 4 years ago

der-klabauter commented 4 years ago

Dear Leo,

I love your project, but I would like to push the OCRed files directly to a new directory. Therefore I tried to amend the default options: default_option = "-stp -j 0.9 -o %Userprofile%" But no matter which directory I add, I always get a permission error:

Traceback (most recent call last):
  File "C:\Users\Christoph\pdf2pdfocr-venv\Scripts\pdf2pdfocr.py", line 1249, in <module>
    pdf2ocr.ocr()
  File "C:\Users\Christoph\pdf2pdfocr-venv\Scripts\pdf2pdfocr.py", line 605, in ocr
    self.initial_cleanup()
  File "C:\Users\Christoph\pdf2pdfocr-venv\Scripts\pdf2pdfocr.py", line 952, in initial_cleanup
    Pdf2PdfOcr.best_effort_remove(self.output_file)
  File "C:\Users\Christoph\pdf2pdfocr-venv\Scripts\pdf2pdfocr.py", line 1154, in best_effort_remove
    os.remove(filename)
PermissionError: [WinError 5] Zugriff verweigert: 'C:\Users\Christoph'

Any ideas how to fix it?

Thank you so much.

BR Christoph

LeoFCardoso commented 4 years ago

Hello, thanks for the post. Please note that -o requires a filename, not a destination directory. Please adjust your script to specify a file instead of a folder, for example: -o %Userprofile%\final-file-ocred.pdf
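The traceback in the report shows the tool calling os.remove on the -o value during its initial cleanup, which is why a directory ends in an access-denied error. A minimal sketch of a pre-check that a wrapper script could run before invoking the tool; the function name check_output_file is hypothetical and not part of pdf2pdfocr:

```python
import os


def check_output_file(path):
    """Reject an -o value that points at an existing directory.

    Hypothetical helper, not part of pdf2pdfocr. It only mirrors the rule
    that -o must name a file, so the problem is reported before the tool
    tries to remove the path during cleanup.
    """
    if os.path.isdir(path):
        raise ValueError(
            f"-o expects a file name, but {path!r} is a directory"
        )
    return path
```

Calling it with a folder raises ValueError with a readable message, while a path such as a not-yet-existing PDF file inside that folder passes through unchanged.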

LeoFCardoso commented 4 years ago

Hmm. I think I got your point. You are using the VBS script for multiple files and would just like to specify a different folder for all of them, keeping the original file names, right?

der-klabauter commented 4 years ago

Yep, that was the idea. I wanted to push the files directly to my NAS after OCR. In the current configuration this is not that simple, because I would need to extract the file names in the VBS script. It would be great if you could add a flag in pdf2pdfocr.py that allows modifying the output path while keeping the file names.
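Until such a flag exists, the file name extraction could be done in a small wrapper instead of the VBS script. The following is only a sketch under assumptions: the helper names build_output_path and output_args are hypothetical, and the "-OCR" suffix is taken from the default output naming mentioned in this thread.

```python
from pathlib import Path


def build_output_path(input_pdf, output_dir, suffix="-OCR"):
    """Return <output_dir>/<input name><suffix>.pdf, keeping the original name.

    Hypothetical helper, not part of pdf2pdfocr.
    """
    stem = Path(input_pdf).stem  # file name without folder and extension
    return Path(output_dir) / f"{stem}{suffix}.pdf"


def output_args(input_pdf, output_dir):
    """Build the "-o <file>" argument pair for one input file."""
    return ["-o", str(build_output_path(input_pdf, output_dir))]
```

For an input named invoice.pdf and a target folder, this yields an -o value ending in invoice-OCR.pdf inside that folder, so -o always receives a file rather than a directory.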

LeoFCardoso commented 4 years ago

I will create this use case. For now, you can work in the current directory and copy the files manually to your NAS using the default output file name. Do not use "-o"; let pdf2pdfocr create the result files in the current folder and then issue: "move *-OCR.pdf <>".
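The same move step can be scripted, for example when it should run right after a batch. This is a rough sketch and not part of pdf2pdfocr; the function name move_ocr_results is hypothetical, and the destination is whatever folder your NAS is mounted at.

```python
import shutil
from pathlib import Path


def move_ocr_results(source_dir, dest_dir, pattern="*-OCR.pdf"):
    """Move every file matching pattern from source_dir into dest_dir.

    Hypothetical helper. Returns the list of new paths, sorted by name.
    """
    source_dir = Path(source_dir)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)  # create the target if missing

    moved = []
    for pdf in sorted(source_dir.glob(pattern)):
        target = dest_dir / pdf.name  # keep the original file name
        shutil.move(str(pdf), str(target))
        moved.append(target)
    return moved
```

Only files whose names end in -OCR.pdf are touched, so the original scans stay in the working folder.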

LeoFCardoso commented 4 years ago

@der-klabauter please let me know if it works for you.

der-klabauter commented 4 years ago

works well, thanks :)