Open ankrgyl opened 2 years ago
I am facing an error from the pdf2image library that asks whether Poppler is installed and in PATH. This is my code:
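from docquery import document, pipeline

def doc_type(temp_path):
    # Build the document question-answering pipeline.
    p = pipeline('document-question-answering')
    doc = document.load_document(temp_path)
    # Accessing doc.context renders page images, which is where Poppler is needed.
    response = p("What type of document is this?", **doc.context)
    return response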
The error I receive is:
    response = p("What type of document is this?", **doc.context)
                                                     ^^^^^^^^^^^^
  File "C:\Users\Cirruslabs\AppData\Local\Programs\Python\Python311\Lib\functools.py", line 1001, in __get__
    val = self.func(instance)
          ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Cirruslabs\Documents\GitHub\Document-Processing-BE\venv\Lib\site-packages\docquery\document.py", line 117, in context
    images = self._images
             ^^^^^^^^^^^^
  File "C:\Users\Cirruslabs\AppData\Local\Programs\Python\Python311\Lib\functools.py", line 1001, in __get__
    val = self.func(instance)
          ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Cirruslabs\Documents\GitHub\Document-Processing-BE\venv\Lib\site-packages\docquery\document.py", line 156, in _images
    return [x.convert("RGB") for x in pdf2image.convert_from_bytes(self.b)]
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Cirruslabs\Documents\GitHub\Document-Processing-BE\venv\Lib\site-packages\pdf2image\pdf2image.py", line 358, in convert_from_bytes
    return convert_from_path(
           ^^^^^^^^^^^^^^^^^^
  File "C:\Users\Cirruslabs\Documents\GitHub\Document-Processing-BE\venv\Lib\site-packages\pdf2image\pdf2image.py", line 127, in convert_from_path
    page_count = pdfinfo_from_path(
                 ^^^^^^^^^^^^^^^^^^
  File "C:\Users\Cirruslabs\Documents\GitHub\Document-Processing-BE\venv\Lib\site-packages\pdf2image\pdf2image.py", line 594, in pdfinfo_from_path
    raise PDFInfoNotInstalledError(
pdf2image.exceptions.PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH?
Is there any workaround for this? I've tried installing poppler-utils and pdf2image, but the error persists.
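One possible workaround, sketched under assumptions rather than offered as a confirmed fix: pdf2image shells out to Poppler's command-line tools (such as `pdfinfo`), so installing the Python package alone is not enough; the Poppler binaries themselves have to be discoverable on PATH. The helper name `ensure_poppler_on_path` is hypothetical, and the directory you pass in is whatever folder holds your Poppler executables. It prepends that folder to PATH for the current process only and reports whether `pdfinfo` can then be found.

```python
import os
import shutil


def ensure_poppler_on_path(poppler_bin_dir=None):
    """Return True if Poppler's `pdfinfo` executable can be found on PATH.

    If `poppler_bin_dir` is given, it is prepended to PATH for this
    process only (nothing is changed system-wide).
    """
    if poppler_bin_dir:
        current = os.environ.get("PATH", "")
        # Avoid adding the same directory twice on repeated calls.
        if poppler_bin_dir not in current.split(os.pathsep):
            os.environ["PATH"] = poppler_bin_dir + os.pathsep + current
    # shutil.which looks the executable up using the current PATH.
    return shutil.which("pdfinfo") is not None
```

Calling `ensure_poppler_on_path(r"C:\path\to\poppler\bin")` (a placeholder path) before building the pipeline should let pdf2image locate the tools. If it still returns `False`, the folder most likely does not contain the Poppler executables.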
PDF loading uses Poppler to create image previews; however, for certain models (e.g. LayoutLMv1), these previews are unnecessary if the file has embedded text. We should make sure that the default scenario, in which Poppler is not available, still works.
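A minimal sketch of that fallback, not DocQuery's actual implementation: check whether Poppler's `pdfinfo` is available before rendering, and return `None` for the images when it is not, so that models relying only on embedded text can still run. The function name `render_pages_or_none` and the injectable `which` parameter are assumptions made for illustration; `pdf2image.convert_from_bytes` is the call that appears in the traceback.

```python
import shutil


def render_pages_or_none(pdf_bytes, which=shutil.which):
    """Render PDF pages to RGB images, or return None if Poppler is missing."""
    if which("pdfinfo") is None:
        # Poppler is not on PATH: skip the previews instead of raising.
        return None
    # Imported lazily so this module loads even without pdf2image installed.
    import pdf2image

    return [page.convert("RGB") for page in pdf2image.convert_from_bytes(pdf_bytes)]
```

Callers would then need to treat `None` as "no images available" and fall back to the embedded text, which only suits models that do not require image input.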