Open decsousa opened 11 months ago
+1
+1
Following
+1
To make it work, I had to:
In the file .../site-packages/unstructured/partition/auto.py,
add the line: from unstructured.partition.pdf import partition_pdf
Then run: pip3 install pdf2image pdfminer.six
Lastly, if you are on macOS, search for 'Install Certificates.command' in Finder and open it.
Then do the following steps in the terminal:
python3
import nltk
nltk.download()
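Before patching auto.py, it may help to confirm which of the packages mentioned above are actually visible to the interpreter you run your script with. This is only a rough sketch: the helper name missing_modules is made up for illustration, and the module names assume that pdf2image installs a top-level module called pdf2image and pdfminer.six installs one called pdfminer.

```python
import importlib.util

# Top-level module names assumed for the packages mentioned above:
# pdf2image -> pdf2image, pdfminer.six -> pdfminer, plus nltk.
REQUIRED_MODULES = ["pdf2image", "pdfminer", "nltk"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All modules found.")
```

Run it with the same python3 (or virtual environment) that runs your loader script, since a package installed for a different interpreter will still show up as missing.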
Downgrading to version 0.7.12 resolved the problem for me. You can do this by running the following command in your virtual environment:
pip install unstructured==0.7.12
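To check that the downgrade took effect in the environment you are actually using, something like the following should work. It relies only on the standard library's importlib.metadata; the helper name installed_version is just an illustrative choice.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version pip installed, or None if the package is not found.
    print("unstructured:", installed_version("unstructured"))
```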
pip install unstructured==0.7.12 works
To make it work, I had to:
In the file .../site-packages/unstructured/partition/auto.py,
add the line: from unstructured.partition.pdf import partition_pdf
Then run: pip3 install pdf2image pdfminer.six
Lastly, if you are on macOS, search for 'Install Certificates.command' in Finder and open it.
Then do the following steps in the terminal:
python3
import nltk
nltk.download()
I tried this but then I got this error:
  File "/Users/wangzhi/anaconda3/envs/chat/lib/python3.12/site-packages/langchain_community/document_loaders/unstructured.py", line 168, in _get_elements
    from unstructured.partition.auto import partition
  File "/Users/wangzhi/anaconda3/envs/chat/lib/python3.12/site-packages/unstructured/partition/auto.py", line 28, in <module>
Any ideas, please? @3dylson
Hello, when I try to run the code the following error is displayed:
Traceback (most recent call last):
  File "C:\Users\Diego Sousa\Desktop\botchatgpt\botchatgpt\chat02.py", line 35, in <module>
    index = VectorstoreIndexCreator().from_loaders([loader])
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Diego Sousa\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\indexes\vectorstore.py", line 72, in from_loaders
    docs.extend(loader.load())
                ^^^^^^^^^^^^^
  File "C:\Users\Diego Sousa\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\document_loaders\directory.py", line 137, in load
    self.load_file(i, p, docs, pbar)
  File "C:\Users\Diego Sousa\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\document_loaders\directory.py", line 94, in load_file
    raise e
  File "C:\Users\Diego Sousa\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\document_loaders\directory.py", line 88, in load_file
    sub_docs = self.loader_cls(str(item), **self.loader_kwargs).load()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Diego Sousa\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\document_loaders\unstructured.py", line 86, in load
    elements = self._get_elements()
               ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Diego Sousa\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\document_loaders\unstructured.py", line 171, in _get_elements
    return partition(filename=self.file_path, **self.unstructured_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Diego Sousa\AppData\Local\Programs\Python\Python311\Lib\site-packages\unstructured\partition\auto.py", line 221, in partition
    elements = partition_pdf(
               ^^^^^^^^^^^^^
NameError: name 'partition_pdf' is not defined. Did you mean: 'partition_xml'?
Has anyone had this same problem?
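For anyone trying to read the traceback: a NameError like this can happen when a module imports a function only if some optional dependency is installed, but still calls that function later. Whether auto.py does exactly this is an assumption, not something confirmed in this thread. The sketch below reproduces the pattern with made-up names (dependency_available, parse_pdf, and surely_missing_pdf_backend are all hypothetical and not part of unstructured).

```python
import importlib.util


def dependency_available(name):
    """Hypothetical helper: True if a top-level module can be found."""
    return importlib.util.find_spec(name) is not None


# Guarded import: the name parse_pdf is only bound when the optional
# dependency exists. "surely_missing_pdf_backend" is a made-up module name
# standing in for a package that is not installed.
if dependency_available("surely_missing_pdf_backend"):
    from surely_missing_pdf_backend import parse_pdf


def parse(filename):
    """Dispatch on file extension, like a generic 'auto' entry point."""
    if filename.endswith(".pdf"):
        # Fails with NameError when the guarded import above was skipped.
        return parse_pdf(filename)
    return "text"


def try_parse(filename):
    """Call parse() and report a NameError as a string instead of raising."""
    try:
        return parse(filename)
    except NameError as exc:
        return f"NameError: {exc}"


if __name__ == "__main__":
    print(try_parse("notes.txt"))
    print(try_parse("report.pdf"))
```

If this is what is going on, installing the missing optional packages would make the guarded import succeed and the error go away without editing any library files.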