Closed himanshk96 closed 5 years ago
This should be solved with the latest version of textract, 1.6.2. Can you try this version? See also #26.
I'm closing this issue due to inactivity. If you still encounter the issue with the latest version of textract, feel free to leave a comment with additional information and I'll reopen the issue.
I hit this error while running a simple textract.process call.
Error:
Traceback (most recent call last):
  File "textractfile.py", line 33, in <module>
    text = textract.process(p, method='tesseract', language='eng')
  File "G:\softwares\textract-master\textract-master\textract\parsers\__init__.py", line 77, in process
    return parser.process(filename, encoding, **kwargs)
  File "G:\softwares\textract-master\textract-master\textract\parsers\utils.py", line 46, in process
    byte_string = self.extract(filename, **kwargs)
  File "G:\softwares\textract-master\textract-master\textract\parsers\pdf_parser.py", line 33, in extract
    return self.extract_tesseract(filename, **kwargs)
  File "G:\softwares\textract-master\textract-master\textract\parsers\pdf_parser.py", line 57, in extract_tesseract
    stdout, _ = self.run(['pdftoppm', filename, base])
  File "G:\softwares\textract-master\textract-master\textract\parsers\utils.py", line 91, in run
    ' '.join(args), 127, '', '',
textract.exceptions.ShellError: The command

  pdftoppm C:\Users\himansh\Desktop\I BM Day 28-29 Sept.pdf c:\users\himansh\appdata\local\temp\tmpkydwyc\conv

failed with exit code 127
------------- stdout -------------
------------- stderr -------------

I see this error
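
A note for anyone landing here with the same traceback: exit code 127 conventionally means "command not found", and both stdout and stderr are empty above, which suggests pdftoppm never started at all. pdftoppm ships with the Poppler utilities, so a likely cause is that Poppler is not installed or not on PATH. The check below is only a minimal sketch, assuming Python 3.3+ (for shutil.which); the helper name missing_tools is made up for illustration and is not part of textract.

```python
import shutil


def missing_tools(tools=("pdftoppm", "tesseract")):
    """Return the names in `tools` that cannot be found on PATH.

    Hypothetical helper, not part of textract. The default tool names are
    the two external programs that the tesseract method for PDFs appears
    to rely on, judging by the traceback above.
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required command-line tools were found.")
```

If pdftoppm shows up as missing, installing the Poppler utilities and adding their bin directory to PATH would be the first thing to try before calling textract.process again.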