techleadhd / chatgpt-retrieval


ImportError: cannot import name 'extract_pages' from 'pdfminer.high_level' #25

Open AbubakrChan opened 1 year ago

AbubakrChan commented 1 year ago

I'm getting this error and I don't know why:

Traceback (most recent call last):
  File "C:\Users\l\streamlit-google-oauth\chatgpt-retrieval\chatgpt.py", line 38, in <module>
    index = VectorstoreIndexCreator().from_loaders([loader])
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\indexes\vectorstore.py", line 72, in from_loaders
    docs.extend(loader.load())
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\directory.py", line 108, in load
    self.load_file(i, p, docs, pbar)
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\directory.py", line 69, in load_file
    raise e
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\directory.py", line 63, in load_file
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\unstructured.py", line 71, in load
    elements = self._get_elements()
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\unstructured.py", line 106, in _get_elements
    from unstructured.partition.auto import partition
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\unstructured\partition\auto.py", line 21, in <module>
    from unstructured.partition.image import partition_image
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\unstructured\partition\image.py", line 5, in <module>
    from unstructured.partition.pdf import partition_pdf_or_image
    from pdfminer.high_level import extract_pages
ImportError: cannot import name 'extract_pages' from 'pdfminer.high_level' (C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\pdfminer\high_level.py)
PS C:\Users\l\streamlit-google-oauth\chatgpt-retrieval> ^C
PS C:\Users\l\streamlit-google-oauth\chatgpt-retrieval> pip install pdfminer.six
Requirement already satisfied: pdfminer.six in c:\users\l\appdata\local\programs\python\python39\lib\site-packages (20191110)
Requirement already satisfied: pycryptodome in c:\users\l\appdata\local\programs\python\python39\lib\site-packages (from pdfminer.six) (3.17)
Requirement already satisfied: sortedcontainers in c:\users\l\appdata\local\programs\python\python39\lib\site-packages (from pdfminer.six) (2.4.0)
Requirement already satisfied: chardet in c:\users\l\appdata\local\programs\python\python39\lib\site-packages (from pdfminer.six) (3.0.4)
Requirement already satisfied: six in c:\users\l\appdata\local\programs\python\python39\lib\site-packages (from pdfminer.six) (1.16.0)

[notice] A new release of pip is available: 23.0.1 -> 23.2.1
[notice] To update, run: python.exe -m pip install --upgrade pip
PS C:\Users\l\streamlit-google-oauth\chatgpt-retrieval> python chatgpt.py "what is my dog's name"
Traceback (most recent call last):
  File "C:\Users\l\streamlit-google-oauth\chatgpt-retrieval\chatgpt.py", line 38, in <module>
    index = VectorstoreIndexCreator().from_loaders([loader])
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\indexes\vectorstore.py", line 72, in from_loaders
    docs.extend(loader.load())
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\directory.py", line 108, in load
    self.load_file(i, p, docs, pbar)
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\directory.py", line 69, in load_file
    raise e
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\directory.py", line 63, in load_file
    sub_docs = self.loader_cls(str(item), **self.loader_kwargs).load()
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\unstructured.py", line 71, in load
    elements = self._get_elements()
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\unstructured.py", line 106, in _get_elements
    from unstructured.partition.auto import partition
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\unstructured\partition\auto.py", line 21, in <module>
    from unstructured.partition.image import partition_image
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\unstructured\partition\image.py", line 5, in <module>
    from unstructured.partition.pdf import partition_pdf_or_image
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\unstructured\partition\pdf.py", line 9, in <module>
    from pdfminer.high_level import extract_pages
ImportError: cannot import name 'extract_pages' from 'pdfminer.high_level' (C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\pdfminer\high_level.py)
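
The output above shows that pdfminer.six 20191110 is installed and that its `pdfminer.high_level` module does not provide `extract_pages`, so a plain `pip install pdfminer.six` changes nothing. A quick way to confirm what the interpreter actually sees is a small check like the one below. This is only a diagnostic sketch: `check_symbol` is a hypothetical helper name, and the suggestion to upgrade is an assumption, not something confirmed in this thread.

```python
import importlib
from importlib import metadata  # standard library since Python 3.8


def check_symbol(module_name: str, symbol: str, dist_name: str) -> dict:
    """Report the installed version of `dist_name` and whether
    `module_name` can be imported and exposes `symbol`."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # distribution is not installed under this name

    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return {"version": version, "importable": False, "has_symbol": False}

    return {
        "version": version,
        "importable": True,
        "has_symbol": hasattr(module, symbol),
    }


if __name__ == "__main__":
    report = check_symbol("pdfminer.high_level", "extract_pages", "pdfminer.six")
    print(report)
    if not report["has_symbol"]:
        # Assumption: a newer pdfminer.six release provides extract_pages.
        print("Possible fix: python -m pip install --upgrade pdfminer.six")
```

If `has_symbol` is still false after an upgrade, it may be worth checking whether a different package that also installs a `pdfminer` directory is shadowing pdfminer.six in `site-packages`.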

3dylson commented 1 year ago

https://github.com/techleadhd/chatgpt-retrieval/issues/32#issuecomment-1666569313