deanmalmgren / textract

extract text from any document. no muss. no fuss.
http://textract.readthedocs.io
MIT License

xyz.pdf c:\users\himansh\appdata\local\temp\tmpkydwyc\conv` failed with exit code 127 ------------- stdout ------------- #255

Closed himanshk96 closed 5 years ago

himanshk96 commented 5 years ago

I hit this error while running a simple textract.process call.

Error:

Traceback (most recent call last):
  File "textractfile.py", line 33, in <module>
    text = textract.process(p, method='tesseract', language='eng')
  File "G:\softwares\textract-master\textract-master\textract\parsers\__init__.py", line 77, in process
    return parser.process(filename, encoding, **kwargs)
  File "G:\softwares\textract-master\textract-master\textract\parsers\utils.py", line 46, in process
    byte_string = self.extract(filename, **kwargs)
  File "G:\softwares\textract-master\textract-master\textract\parsers\pdf_parser.py", line 33, in extract
    return self.extract_tesseract(filename, **kwargs)
  File "G:\softwares\textract-master\textract-master\textract\parsers\pdf_parser.py", line 57, in extract_tesseract
    stdout, _ = self.run(['pdftoppm', filename, base])
  File "G:\softwares\textract-master\textract-master\textract\parsers\utils.py", line 91, in run
    ' '.join(args), 127, '', '',
textract.exceptions.ShellError: The command pdftoppm C:\Users\himansh\Desktop\IBM Day 28-29 Sept.pdf c:\users\himansh\appdata\local\temp\tmpkydwyc\conv failed with exit code 127
------------- stdout -------------
------------- stderr -------------

This is the error I see.

jpweytjens commented 5 years ago

This should be solved with the latest version of textract, 1.6.2. Can you try this version? See also #26
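If the error persists after upgrading, note that exit code 127 with empty stdout and stderr usually means the executable itself could not be started, and the traceback shows the failing command is pdftoppm. A minimal sketch for checking that the external tools are reachable, assuming Python 3.3+ (for shutil.which) and assuming that method='tesseract' on a PDF needs both pdftoppm and tesseract on PATH; the helper name missing_tools is hypothetical and not part of textract:

```python
import shutil


def missing_tools(tools=("pdftoppm", "tesseract")):
    """Return the names in `tools` that cannot be found on PATH.

    The default tool list is an assumption based on the traceback above
    (pdftoppm) and the requested method (tesseract).
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required executables were found.")
```

If pdftoppm shows up as missing, installing Poppler (which ships pdftoppm) and adding its bin directory to PATH is the likely next step before retrying textract.process.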

jpweytjens commented 5 years ago

I'm closing this issue due to inactivity. If you still encounter the issue with the latest version of textract, feel free to leave a comment with additional information and I'll reopen the issue.