shivams opened this issue 9 years ago (status: Open)
I ran into a similar error on Windows when the path of the PDF file contained non-Latin characters, such as Chinese. If I move the PDF file to a path without Chinese characters, it works.
Thanks! That is a very useful comment. The path I had a problem with contained whitespace. I moved the files to another path that doesn't contain whitespace.
When I changed the path, I could also combine my files. Thank you!
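For anyone who wants to script the path workaround from the comments above, here is a rough Ruby sketch. It copies the PDF to a temporary file with a plain name whenever the original path contains non-ASCII characters or whitespace. The helper names `risky_path?` and `with_safe_copy` are hypothetical and not part of pdfocr.rb, and this assumes the system temp directory itself has a plain path, which may not hold on every Windows account.

```ruby
require "fileutils"
require "tmpdir"

# True if the path contains non-ASCII characters or whitespace,
# the two cases reported as problematic in this thread.
def risky_path?(path)
  !path.ascii_only? || path.match?(/\s/)
end

# Yields a path that is safe to hand to pdftk. If the original path
# looks risky, the file is first copied to a temporary directory under
# a plain name; the directory is removed when the block returns.
def with_safe_copy(path)
  return yield(path) unless risky_path?(path)

  Dir.mktmpdir("pdf") do |dir|
    safe = File.join(dir, "input.pdf")
    FileUtils.cp(path, safe)
    yield(safe)
  end
end
```

Usage would look like `with_safe_copy(pdf) { |p| system("pdftk", p, "dump_data") }`. Passing the arguments to `system` separately avoids the shell, so whitespace in other arguments is not split.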
For some PDF files, pdftk throws this error:
This bug has been reported on pdftk launchpad: https://bugs.launchpad.net/ubuntu/+source/pdftk/+bug/774052
The bug does not appear to have been fixed, and because of it pdfocr.rb also fails on many PDFs. I do have a temporary workaround, though:
Sometimes pdftk completely fails to read certain PDFs. However, if we read such a PDF with another tool and write it out again, pdftk reads the newly created file without problems. For example, we can use Ghostscript to recreate the PDF like this:
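A minimal sketch of that workaround in Ruby (the language of pdfocr.rb), assuming Ghostscript is installed and on the PATH as `gs` (on Windows the console binary is usually named `gswin64c` instead). The helper names `gs_rewrite_command` and `repair_pdf` and the file names are hypothetical, not part of pdfocr.rb:

```ruby
# Builds the Ghostscript command that re-writes a PDF through the
# pdfwrite device. "-o" sets the output file and implies batch mode
# with no pause; "-q" suppresses the startup banner.
def gs_rewrite_command(input, output, gs = "gs")
  [gs, "-q", "-o", output, "-sDEVICE=pdfwrite", input]
end

# Runs Ghostscript and returns the path of the recreated PDF.
# Arguments are passed to system separately, so no shell is involved.
def repair_pdf(input, output)
  ok = system(*gs_rewrite_command(input, output))
  raise "Ghostscript failed to rewrite #{input}" unless ok
  output
end
```

For example, `repair_pdf("broken.pdf", "fixed.pdf")` would write `fixed.pdf`, which can then be passed to pdftk in place of the original.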
Now pdftk will read the newly created PDF file just fine.
If someone is willing to apply this workaround, that would be great. Otherwise I will make the changes myself and send a pull request.
PS: A sample file which fails to be read is given here: https://www.jstage.jst.go.jp/article/jsmec/45/3/45_3_730/_pdf