run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Bug]: loading docx file with SimpleDirectoryReader : File is not a zip file. Skipping... #11859

Closed osafaimal closed 4 months ago

osafaimal commented 8 months ago

Bug Description

For all my .docx files I get this error: Failed to load file [...].docx with error: File is not a zip file. Skipping... To load them I use: documents = SimpleDirectoryReader("data",recursive=True).load_data() I have docx2txt installed.

Version

0.10.18

Steps to Reproduce

Just run the following (at least, running only this is enough to trigger the error for me):

!pip install -U docx2txt ipywidgets llama_index
from llama_index.core import SimpleDirectoryReader
documents = SimpleDirectoryReader("data",recursive=True).load_data()

Relevant Logs/Tracebacks

Requirement already satisfied: docx2txt in ./.venv/lib/python3.10/site-packages (0.8)
Requirement already satisfied: ipywidgets in ./.venv/lib/python3.10/site-packages (8.1.2)
Requirement already satisfied: llama_index in ./.venv/lib/python3.10/site-packages (0.10.18)
Requirement already satisfied: comm>=0.1.3 in ./.venv/lib/python3.10/site-packages (from ipywidgets) (0.2.1)
Requirement already satisfied: ipython>=6.1.0 in ./.venv/lib/python3.10/site-packages (from ipywidgets) (8.22.1)
Requirement already satisfied: traitlets>=4.3.1 in ./.venv/lib/python3.10/site-packages (from ipywidgets) (5.14.1)
Requirement already satisfied: widgetsnbextension~=4.0.10 in ./.venv/lib/python3.10/site-packages (from ipywidgets) (4.0.10)
Requirement already satisfied: jupyterlab-widgets~=3.0.10 in ./.venv/lib/python3.10/site-packages (from ipywidgets) (3.0.10)
Requirement already satisfied: llama-index-agent-openai<0.2.0,>=0.1.4 in ./.venv/lib/python3.10/site-packages (from llama_index) (0.1.5)
Requirement already satisfied: llama-index-cli<0.2.0,>=0.1.2 in ./.venv/lib/python3.10/site-packages (from llama_index) (0.1.6)
Requirement already satisfied: llama-index-core<0.11.0,>=0.10.18 in ./.venv/lib/python3.10/site-packages (from llama_index) (0.10.18.post1)
Requirement already satisfied: llama-index-embeddings-openai<0.2.0,>=0.1.5 in ./.venv/lib/python3.10/site-packages (from llama_index) (0.1.6)
Requirement already satisfied: llama-index-indices-managed-llama-cloud<0.2.0,>=0.1.2 in ./.venv/lib/python3.10/site-packages (from llama_index) (0.1.3)
Requirement already satisfied: llama-index-legacy<0.10.0,>=0.9.48 in ./.venv/lib/python3.10/site-packages (from llama_index) (0.9.48)
Requirement already satisfied: llama-index-llms-openai<0.2.0,>=0.1.5 in ./.venv/lib/python3.10/site-packages (from llama_index) (0.1.6)
Requirement already satisfied: llama-index-multi-modal-llms-openai<0.2.0,>=0.1.3 in ./.venv/lib/python3.10/site-packages (from llama_index) (0.1.4)
Requirement already satisfied: llama-index-program-openai<0.2.0,>=0.1.3 in ./.venv/lib/python3.10/site-packages (from llama_index) (0.1.4)
Requirement already satisfied: llama-index-question-gen-openai<0.2.0,>=0.1.2 in ./.venv/lib/python3.10/site-packages (from llama_index) (0.1.3)
Requirement already satisfied: llama-index-readers-file<0.2.0,>=0.1.4 in ./.venv/lib/python3.10/site-packages (from llama_index) (0.1.6)
Requirement already satisfied: llama-index-readers-llama-parse<0.2.0,>=0.1.2 in ./.venv/lib/python3.10/site-packages (from llama_index) (0.1.3)
Requirement already satisfied: decorator in ./.venv/lib/python3.10/site-packages (from ipython>=6.1.0->ipywidgets) (5.1.1)
Requirement already satisfied: jedi>=0.16 in ./.venv/lib/python3.10/site-packages (from ipython>=6.1.0->ipywidgets) (0.19.1)
Requirement already satisfied: matplotlib-inline in ./.venv/lib/python3.10/site-packages (from ipython>=6.1.0->ipywidgets) (0.1.6)
Requirement already satisfied: prompt-toolkit<3.1.0,>=3.0.41 in ./.venv/lib/python3.10/site-packages (from ipython>=6.1.0->ipywidgets) (3.0.43)
Requirement already satisfied: pygments>=2.4.0 in ./.venv/lib/python3.10/site-packages (from ipython>=6.1.0->ipywidgets) (2.17.2)
Requirement already satisfied: stack-data in ./.venv/lib/python3.10/site-packages (from ipython>=6.1.0->ipywidgets) (0.6.3)
Requirement already satisfied: exceptiongroup in ./.venv/lib/python3.10/site-packages (from ipython>=6.1.0->ipywidgets) (1.2.0)
Requirement already satisfied: pexpect>4.3 in ./.venv/lib/python3.10/site-packages (from ipython>=6.1.0->ipywidgets) (4.9.0)
Requirement already satisfied: llama-index-vector-stores-chroma<0.2.0,>=0.1.1 in ./.venv/lib/python3.10/site-packages (from llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.1.5)
Requirement already satisfied: PyYAML>=6.0.1 in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (6.0.1)
Requirement already satisfied: SQLAlchemy>=1.4.49 in ./.venv/lib/python3.10/site-packages (from SQLAlchemy[asyncio]>=1.4.49->llama-index-core<0.11.0,>=0.10.18->llama_index) (2.0.27)
Requirement already satisfied: aiohttp<4.0.0,>=3.8.6 in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (3.9.3)
Requirement already satisfied: dataclasses-json in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (0.6.4)
Requirement already satisfied: deprecated>=1.2.9.3 in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (1.2.14)
Requirement already satisfied: dirtyjson<2.0.0,>=1.0.8 in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (1.0.8)
Requirement already satisfied: fsspec>=2023.5.0 in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (2024.2.0)
Requirement already satisfied: httpx in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (0.25.2)
Requirement already satisfied: llamaindex-py-client<0.2.0,>=0.1.13 in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (0.1.13)
Requirement already satisfied: nest-asyncio<2.0.0,>=1.5.8 in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (1.6.0)
Requirement already satisfied: networkx>=3.0 in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (3.2.1)
Requirement already satisfied: nltk<4.0.0,>=3.8.1 in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (3.8.1)
Requirement already satisfied: numpy in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (1.26.4)
Requirement already satisfied: openai>=1.1.0 in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (1.13.3)
Requirement already satisfied: pandas in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (2.2.1)
Requirement already satisfied: pillow>=9.0.0 in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (10.2.0)
Requirement already satisfied: requests>=2.31.0 in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (2.31.0)
Requirement already satisfied: tenacity<9.0.0,>=8.2.0 in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (8.2.3)
Requirement already satisfied: tiktoken>=0.3.3 in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (0.6.0)
Requirement already satisfied: tqdm<5.0.0,>=4.66.1 in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (4.66.2)
Requirement already satisfied: typing-extensions>=4.5.0 in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (4.10.0)
Requirement already satisfied: typing-inspect>=0.8.0 in ./.venv/lib/python3.10/site-packages (from llama-index-core<0.11.0,>=0.10.18->llama_index) (0.9.0)
Requirement already satisfied: beautifulsoup4<5.0.0,>=4.12.3 in ./.venv/lib/python3.10/site-packages (from llama-index-readers-file<0.2.0,>=0.1.4->llama_index) (4.12.3)
Requirement already satisfied: bs4<0.0.3,>=0.0.2 in ./.venv/lib/python3.10/site-packages (from llama-index-readers-file<0.2.0,>=0.1.4->llama_index) (0.0.2)
Requirement already satisfied: pymupdf<2.0.0,>=1.23.21 in ./.venv/lib/python3.10/site-packages (from llama-index-readers-file<0.2.0,>=0.1.4->llama_index) (1.23.25)
Requirement already satisfied: pypdf<5.0.0,>=4.0.1 in ./.venv/lib/python3.10/site-packages (from llama-index-readers-file<0.2.0,>=0.1.4->llama_index) (4.0.2)
Requirement already satisfied: llama-parse<0.4.0,>=0.3.3 in ./.venv/lib/python3.10/site-packages (from llama-index-readers-llama-parse<0.2.0,>=0.1.2->llama_index) (0.3.5)
Requirement already satisfied: aiosignal>=1.1.2 in ./.venv/lib/python3.10/site-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.18->llama_index) (1.3.1)
Requirement already satisfied: attrs>=17.3.0 in ./.venv/lib/python3.10/site-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.18->llama_index) (23.2.0)
Requirement already satisfied: frozenlist>=1.1.1 in ./.venv/lib/python3.10/site-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.18->llama_index) (1.4.1)
Requirement already satisfied: multidict<7.0,>=4.5 in ./.venv/lib/python3.10/site-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.18->llama_index) (6.0.5)
Requirement already satisfied: yarl<2.0,>=1.0 in ./.venv/lib/python3.10/site-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.18->llama_index) (1.9.4)
Requirement already satisfied: async-timeout<5.0,>=4.0 in ./.venv/lib/python3.10/site-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.18->llama_index) (4.0.3)
Requirement already satisfied: soupsieve>1.2 in ./.venv/lib/python3.10/site-packages (from beautifulsoup4<5.0.0,>=4.12.3->llama-index-readers-file<0.2.0,>=0.1.4->llama_index) (2.5)
Requirement already satisfied: wrapt<2,>=1.10 in ./.venv/lib/python3.10/site-packages (from deprecated>=1.2.9.3->llama-index-core<0.11.0,>=0.10.18->llama_index) (1.16.0)
Requirement already satisfied: parso<0.9.0,>=0.8.3 in ./.venv/lib/python3.10/site-packages (from jedi>=0.16->ipython>=6.1.0->ipywidgets) (0.8.3)
Requirement already satisfied: chromadb<0.5.0,>=0.4.22 in ./.venv/lib/python3.10/site-packages (from llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.4.24)
Requirement already satisfied: onnxruntime<2.0.0,>=1.17.0 in ./.venv/lib/python3.10/site-packages (from llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (1.17.1)
Requirement already satisfied: tokenizers<0.16.0,>=0.15.1 in ./.venv/lib/python3.10/site-packages (from llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.15.2)
Requirement already satisfied: pydantic>=1.10 in ./.venv/lib/python3.10/site-packages (from llamaindex-py-client<0.2.0,>=0.1.13->llama-index-core<0.11.0,>=0.10.18->llama_index) (2.6.3)
Requirement already satisfied: anyio in ./.venv/lib/python3.10/site-packages (from httpx->llama-index-core<0.11.0,>=0.10.18->llama_index) (4.3.0)
Requirement already satisfied: certifi in ./.venv/lib/python3.10/site-packages (from httpx->llama-index-core<0.11.0,>=0.10.18->llama_index) (2024.2.2)
Requirement already satisfied: httpcore==1.* in ./.venv/lib/python3.10/site-packages (from httpx->llama-index-core<0.11.0,>=0.10.18->llama_index) (1.0.4)
Requirement already satisfied: idna in ./.venv/lib/python3.10/site-packages (from httpx->llama-index-core<0.11.0,>=0.10.18->llama_index) (3.6)
Requirement already satisfied: sniffio in ./.venv/lib/python3.10/site-packages (from httpx->llama-index-core<0.11.0,>=0.10.18->llama_index) (1.3.1)
Requirement already satisfied: h11<0.15,>=0.13 in ./.venv/lib/python3.10/site-packages (from httpcore==1.*->httpx->llama-index-core<0.11.0,>=0.10.18->llama_index) (0.14.0)
Requirement already satisfied: click in ./.venv/lib/python3.10/site-packages (from nltk<4.0.0,>=3.8.1->llama-index-core<0.11.0,>=0.10.18->llama_index) (8.1.7)
Requirement already satisfied: joblib in ./.venv/lib/python3.10/site-packages (from nltk<4.0.0,>=3.8.1->llama-index-core<0.11.0,>=0.10.18->llama_index) (1.3.2)
Requirement already satisfied: regex>=2021.8.3 in ./.venv/lib/python3.10/site-packages (from nltk<4.0.0,>=3.8.1->llama-index-core<0.11.0,>=0.10.18->llama_index) (2023.12.25)
Requirement already satisfied: distro<2,>=1.7.0 in ./.venv/lib/python3.10/site-packages (from openai>=1.1.0->llama-index-core<0.11.0,>=0.10.18->llama_index) (1.9.0)
Requirement already satisfied: ptyprocess>=0.5 in ./.venv/lib/python3.10/site-packages (from pexpect>4.3->ipython>=6.1.0->ipywidgets) (0.7.0)
Requirement already satisfied: wcwidth in ./.venv/lib/python3.10/site-packages (from prompt-toolkit<3.1.0,>=3.0.41->ipython>=6.1.0->ipywidgets) (0.2.13)
Requirement already satisfied: PyMuPDFb==1.23.22 in ./.venv/lib/python3.10/site-packages (from pymupdf<2.0.0,>=1.23.21->llama-index-readers-file<0.2.0,>=0.1.4->llama_index) (1.23.22)
Requirement already satisfied: charset-normalizer<4,>=2 in ./.venv/lib/python3.10/site-packages (from requests>=2.31.0->llama-index-core<0.11.0,>=0.10.18->llama_index) (3.3.2)
Requirement already satisfied: urllib3<3,>=1.21.1 in ./.venv/lib/python3.10/site-packages (from requests>=2.31.0->llama-index-core<0.11.0,>=0.10.18->llama_index) (2.2.1)
Requirement already satisfied: greenlet!=0.4.17 in ./.venv/lib/python3.10/site-packages (from SQLAlchemy>=1.4.49->SQLAlchemy[asyncio]>=1.4.49->llama-index-core<0.11.0,>=0.10.18->llama_index) (3.0.3)
Requirement already satisfied: mypy-extensions>=0.3.0 in ./.venv/lib/python3.10/site-packages (from typing-inspect>=0.8.0->llama-index-core<0.11.0,>=0.10.18->llama_index) (1.0.0)
Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in ./.venv/lib/python3.10/site-packages (from dataclasses-json->llama-index-core<0.11.0,>=0.10.18->llama_index) (3.21.0)
Requirement already satisfied: python-dateutil>=2.8.2 in ./.venv/lib/python3.10/site-packages (from pandas->llama-index-core<0.11.0,>=0.10.18->llama_index) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in ./.venv/lib/python3.10/site-packages (from pandas->llama-index-core<0.11.0,>=0.10.18->llama_index) (2024.1)
Requirement already satisfied: tzdata>=2022.7 in ./.venv/lib/python3.10/site-packages (from pandas->llama-index-core<0.11.0,>=0.10.18->llama_index) (2024.1)
Requirement already satisfied: executing>=1.2.0 in ./.venv/lib/python3.10/site-packages (from stack-data->ipython>=6.1.0->ipywidgets) (2.0.1)
Requirement already satisfied: asttokens>=2.1.0 in ./.venv/lib/python3.10/site-packages (from stack-data->ipython>=6.1.0->ipywidgets) (2.4.1)
Requirement already satisfied: pure-eval in ./.venv/lib/python3.10/site-packages (from stack-data->ipython>=6.1.0->ipywidgets) (0.2.2)
Requirement already satisfied: six>=1.12.0 in ./.venv/lib/python3.10/site-packages (from asttokens>=2.1.0->stack-data->ipython>=6.1.0->ipywidgets) (1.16.0)
Requirement already satisfied: build>=1.0.3 in ./.venv/lib/python3.10/site-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (1.0.3)
Requirement already satisfied: chroma-hnswlib==0.7.3 in ./.venv/lib/python3.10/site-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.7.3)
Requirement already satisfied: fastapi>=0.95.2 in ./.venv/lib/python3.10/site-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.110.0)
Requirement already satisfied: uvicorn>=0.18.3 in ./.venv/lib/python3.10/site-packages (from uvicorn[standard]>=0.18.3->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.27.1)
Requirement already satisfied: posthog>=2.4.0 in ./.venv/lib/python3.10/site-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (3.4.2)
Requirement already satisfied: pulsar-client>=3.1.0 in ./.venv/lib/python3.10/site-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (3.4.0)
Requirement already satisfied: opentelemetry-api>=1.2.0 in ./.venv/lib/python3.10/site-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (1.23.0)
Requirement already satisfied: opentelemetry-exporter-otlp-proto-grpc>=1.2.0 in ./.venv/lib/python3.10/site-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (1.23.0)
Requirement already satisfied: opentelemetry-instrumentation-fastapi>=0.41b0 in ./.venv/lib/python3.10/site-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.44b0)
Requirement already satisfied: opentelemetry-sdk>=1.2.0 in ./.venv/lib/python3.10/site-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (1.23.0)
Requirement already satisfied: pypika>=0.48.9 in ./.venv/lib/python3.10/site-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.48.9)
Requirement already satisfied: overrides>=7.3.1 in ./.venv/lib/python3.10/site-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (7.7.0)
Requirement already satisfied: importlib-resources in ./.venv/lib/python3.10/site-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (6.1.2)
Requirement already satisfied: grpcio>=1.58.0 in ./.venv/lib/python3.10/site-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (1.60.0)
Requirement already satisfied: bcrypt>=4.0.1 in ./.venv/lib/python3.10/site-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (4.1.2)
Requirement already satisfied: typer>=0.9.0 in ./.venv/lib/python3.10/site-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.9.0)
Requirement already satisfied: kubernetes>=28.1.0 in ./.venv/lib/python3.10/site-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (29.0.0)
Requirement already satisfied: mmh3>=4.0.1 in ./.venv/lib/python3.10/site-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (4.1.0)
Requirement already satisfied: orjson>=3.9.12 in ./.venv/lib/python3.10/site-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (3.9.15)
Requirement already satisfied: packaging>=17.0 in ./.venv/lib/python3.10/site-packages (from marshmallow<4.0.0,>=3.18.0->dataclasses-json->llama-index-core<0.11.0,>=0.10.18->llama_index) (23.2)
Requirement already satisfied: coloredlogs in ./.venv/lib/python3.10/site-packages (from onnxruntime<2.0.0,>=1.17.0->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (15.0.1)
Requirement already satisfied: flatbuffers in ./.venv/lib/python3.10/site-packages (from onnxruntime<2.0.0,>=1.17.0->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (23.5.26)
Requirement already satisfied: protobuf in ./.venv/lib/python3.10/site-packages (from onnxruntime<2.0.0,>=1.17.0->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (4.25.3)
Requirement already satisfied: sympy in ./.venv/lib/python3.10/site-packages (from onnxruntime<2.0.0,>=1.17.0->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (1.12)
Requirement already satisfied: annotated-types>=0.4.0 in ./.venv/lib/python3.10/site-packages (from pydantic>=1.10->llamaindex-py-client<0.2.0,>=0.1.13->llama-index-core<0.11.0,>=0.10.18->llama_index) (0.6.0)
Requirement already satisfied: pydantic-core==2.16.3 in ./.venv/lib/python3.10/site-packages (from pydantic>=1.10->llamaindex-py-client<0.2.0,>=0.1.13->llama-index-core<0.11.0,>=0.10.18->llama_index) (2.16.3)
Requirement already satisfied: huggingface_hub<1.0,>=0.16.4 in ./.venv/lib/python3.10/site-packages (from tokenizers<0.16.0,>=0.15.1->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.21.3)
Requirement already satisfied: pyproject_hooks in ./.venv/lib/python3.10/site-packages (from build>=1.0.3->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (1.0.0)
Requirement already satisfied: tomli>=1.1.0 in ./.venv/lib/python3.10/site-packages (from build>=1.0.3->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (2.0.1)
Requirement already satisfied: starlette<0.37.0,>=0.36.3 in ./.venv/lib/python3.10/site-packages (from fastapi>=0.95.2->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.36.3)
Requirement already satisfied: filelock in ./.venv/lib/python3.10/site-packages (from huggingface_hub<1.0,>=0.16.4->tokenizers<0.16.0,>=0.15.1->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (3.13.1)
Requirement already satisfied: google-auth>=1.0.1 in ./.venv/lib/python3.10/site-packages (from kubernetes>=28.1.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (2.28.1)
Requirement already satisfied: websocket-client!=0.40.0,!=0.41.*,!=0.42.*,>=0.32.0 in ./.venv/lib/python3.10/site-packages (from kubernetes>=28.1.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (1.7.0)
Requirement already satisfied: requests-oauthlib in ./.venv/lib/python3.10/site-packages (from kubernetes>=28.1.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (1.3.1)
Requirement already satisfied: oauthlib>=3.2.2 in ./.venv/lib/python3.10/site-packages (from kubernetes>=28.1.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (3.2.2)
Requirement already satisfied: importlib-metadata<7.0,>=6.0 in ./.venv/lib/python3.10/site-packages (from opentelemetry-api>=1.2.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (6.11.0)
Requirement already satisfied: googleapis-common-protos~=1.52 in ./.venv/lib/python3.10/site-packages (from opentelemetry-exporter-otlp-proto-grpc>=1.2.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (1.62.0)
Requirement already satisfied: opentelemetry-exporter-otlp-proto-common==1.23.0 in ./.venv/lib/python3.10/site-packages (from opentelemetry-exporter-otlp-proto-grpc>=1.2.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (1.23.0)
Requirement already satisfied: opentelemetry-proto==1.23.0 in ./.venv/lib/python3.10/site-packages (from opentelemetry-exporter-otlp-proto-grpc>=1.2.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (1.23.0)
Requirement already satisfied: opentelemetry-instrumentation-asgi==0.44b0 in ./.venv/lib/python3.10/site-packages (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.44b0)
Requirement already satisfied: opentelemetry-instrumentation==0.44b0 in ./.venv/lib/python3.10/site-packages (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.44b0)
Requirement already satisfied: opentelemetry-semantic-conventions==0.44b0 in ./.venv/lib/python3.10/site-packages (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.44b0)
Requirement already satisfied: opentelemetry-util-http==0.44b0 in ./.venv/lib/python3.10/site-packages (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.44b0)
Requirement already satisfied: setuptools>=16.0 in ./.venv/lib/python3.10/site-packages (from opentelemetry-instrumentation==0.44b0->opentelemetry-instrumentation-fastapi>=0.41b0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (69.1.1)
Requirement already satisfied: asgiref~=3.0 in ./.venv/lib/python3.10/site-packages (from opentelemetry-instrumentation-asgi==0.44b0->opentelemetry-instrumentation-fastapi>=0.41b0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (3.7.2)
Requirement already satisfied: monotonic>=1.5 in ./.venv/lib/python3.10/site-packages (from posthog>=2.4.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (1.6)
Requirement already satisfied: backoff>=1.10.0 in ./.venv/lib/python3.10/site-packages (from posthog>=2.4.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (2.2.1)
Requirement already satisfied: httptools>=0.5.0 in ./.venv/lib/python3.10/site-packages (from uvicorn[standard]>=0.18.3->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.6.1)
Requirement already satisfied: python-dotenv>=0.13 in ./.venv/lib/python3.10/site-packages (from uvicorn[standard]>=0.18.3->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (1.0.1)
Requirement already satisfied: uvloop!=0.15.0,!=0.15.1,>=0.14.0 in ./.venv/lib/python3.10/site-packages (from uvicorn[standard]>=0.18.3->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.19.0)
Requirement already satisfied: watchfiles>=0.13 in ./.venv/lib/python3.10/site-packages (from uvicorn[standard]>=0.18.3->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.21.0)
Requirement already satisfied: websockets>=10.4 in ./.venv/lib/python3.10/site-packages (from uvicorn[standard]>=0.18.3->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (12.0)
Requirement already satisfied: humanfriendly>=9.1 in ./.venv/lib/python3.10/site-packages (from coloredlogs->onnxruntime<2.0.0,>=1.17.0->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (10.0)
Requirement already satisfied: mpmath>=0.19 in ./.venv/lib/python3.10/site-packages (from sympy->onnxruntime<2.0.0,>=1.17.0->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (1.3.0)
Requirement already satisfied: cachetools<6.0,>=2.0.0 in ./.venv/lib/python3.10/site-packages (from google-auth>=1.0.1->kubernetes>=28.1.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (5.3.3)
Requirement already satisfied: pyasn1-modules>=0.2.1 in ./.venv/lib/python3.10/site-packages (from google-auth>=1.0.1->kubernetes>=28.1.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.3.0)
Requirement already satisfied: rsa<5,>=3.1.4 in ./.venv/lib/python3.10/site-packages (from google-auth>=1.0.1->kubernetes>=28.1.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (4.9)
Requirement already satisfied: zipp>=0.5 in ./.venv/lib/python3.10/site-packages (from importlib-metadata<7.0,>=6.0->opentelemetry-api>=1.2.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (3.17.0)
Requirement already satisfied: pyasn1<0.6.0,>=0.4.6 in ./.venv/lib/python3.10/site-packages (from pyasn1-modules>=0.2.1->google-auth>=1.0.1->kubernetes>=28.1.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama_index) (0.5.1)
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/01E2A8944A0B4F4C8DA3B396AE60AF2A/EDCC2F3155D148AFBE37E9562855C8661.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/01E2A8944A0B4F4C8DA3B396AE60AF2A/EDCC2F3155D148AFBE37E9562855C8662.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/01E2A8944A0B4F4C8DA3B396AE60AF2A/EDCC2F3155D148AFBE37E9562855C8663.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/01E2A8944A0B4F4C8DA3B396AE60AF2A/EDCC2F3155D148AFBE37E9562855C8664.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/01E2A8944A0B4F4C8DA3B396AE60AF2A/EDCC2F3155D148AFBE37E9562855C8665.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/4A590B7799374FEE8274CF1DAAEDABCA/EDCC2F3155D148AFBE37E9562855C8661.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/4A590B7799374FEE8274CF1DAAEDABCA/EDCC2F3155D148AFBE37E9562855C8662.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/4A590B7799374FEE8274CF1DAAEDABCA/EDCC2F3155D148AFBE37E9562855C8663.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/4A590B7799374FEE8274CF1DAAEDABCA/EDCC2F3155D148AFBE37E9562855C8664.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/4A590B7799374FEE8274CF1DAAEDABCA/EDCC2F3155D148AFBE37E9562855C8665.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/4A957B7500CB4847B3B95AD9EE274191/EDCC2F3155D148AFBE37E9562855C8661.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/4A957B7500CB4847B3B95AD9EE274191/EDCC2F3155D148AFBE37E9562855C8662.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/4A957B7500CB4847B3B95AD9EE274191/EDCC2F3155D148AFBE37E9562855C8663.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/4A957B7500CB4847B3B95AD9EE274191/EDCC2F3155D148AFBE37E9562855C8664.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/4A957B7500CB4847B3B95AD9EE274191/EDCC2F3155D148AFBE37E9562855C8665.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/55D0ED7C45364C62AAE3EDC2B7C6EA26/EDCC2F3155D148AFBE37E9562855C8661.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/55D0ED7C45364C62AAE3EDC2B7C6EA26/EDCC2F3155D148AFBE37E9562855C8662.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/55D0ED7C45364C62AAE3EDC2B7C6EA26/EDCC2F3155D148AFBE37E9562855C8663.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/55D0ED7C45364C62AAE3EDC2B7C6EA26/EDCC2F3155D148AFBE37E9562855C8664.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/55D0ED7C45364C62AAE3EDC2B7C6EA26/EDCC2F3155D148AFBE37E9562855C8665.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/5DCF4EAC8474407CB9E3CE186D8C30E1/EDCC2F3155D148AFBE37E9562855C8661.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/5DCF4EAC8474407CB9E3CE186D8C30E1/EDCC2F3155D148AFBE37E9562855C8662.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/5DCF4EAC8474407CB9E3CE186D8C30E1/EDCC2F3155D148AFBE37E9562855C8663.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/5DCF4EAC8474407CB9E3CE186D8C30E1/EDCC2F3155D148AFBE37E9562855C8664.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/5DCF4EAC8474407CB9E3CE186D8C30E1/EDCC2F3155D148AFBE37E9562855C8665.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/61B715AA44544B4A8B86F68480CB9F25/EDCC2F3155D148AFBE37E9562855C8661.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/61B715AA44544B4A8B86F68480CB9F25/EDCC2F3155D148AFBE37E9562855C8662.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/61B715AA44544B4A8B86F68480CB9F25/EDCC2F3155D148AFBE37E9562855C8663.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/61B715AA44544B4A8B86F68480CB9F25/EDCC2F3155D148AFBE37E9562855C8664.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/61B715AA44544B4A8B86F68480CB9F25/EDCC2F3155D148AFBE37E9562855C8665.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/78C2E183D33343189F4AE19C21016226/EDCC2F3155D148AFBE37E9562855C8661.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/78C2E183D33343189F4AE19C21016226/EDCC2F3155D148AFBE37E9562855C8662.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/78C2E183D33343189F4AE19C21016226/EDCC2F3155D148AFBE37E9562855C8663.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/78C2E183D33343189F4AE19C21016226/EDCC2F3155D148AFBE37E9562855C8664.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/78C2E183D33343189F4AE19C21016226/EDCC2F3155D148AFBE37E9562855C8665.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/B42BEAF50ADE452DB0F0D2FCB33D703E/EDCC2F3155D148AFBE37E9562855C8661.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/B42BEAF50ADE452DB0F0D2FCB33D703E/EDCC2F3155D148AFBE37E9562855C8662.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/B42BEAF50ADE452DB0F0D2FCB33D703E/EDCC2F3155D148AFBE37E9562855C8663.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/B42BEAF50ADE452DB0F0D2FCB33D703E/EDCC2F3155D148AFBE37E9562855C8664.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/B42BEAF50ADE452DB0F0D2FCB33D703E/EDCC2F3155D148AFBE37E9562855C8665.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/B86F8E54C9044715ABABB6D8EDA81A3B/EDCC2F3155D148AFBE37E9562855C8661.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/B86F8E54C9044715ABABB6D8EDA81A3B/EDCC2F3155D148AFBE37E9562855C8662.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/B86F8E54C9044715ABABB6D8EDA81A3B/EDCC2F3155D148AFBE37E9562855C8663.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/B86F8E54C9044715ABABB6D8EDA81A3B/EDCC2F3155D148AFBE37E9562855C8664.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/B86F8E54C9044715ABABB6D8EDA81A3B/EDCC2F3155D148AFBE37E9562855C8665.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/CCCA2015B169432E93D1C0C835F57A14/EDCC2F3155D148AFBE37E9562855C8661.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/CCCA2015B169432E93D1C0C835F57A14/EDCC2F3155D148AFBE37E9562855C8662.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/CCCA2015B169432E93D1C0C835F57A14/EDCC2F3155D148AFBE37E9562855C8663.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/CCCA2015B169432E93D1C0C835F57A14/EDCC2F3155D148AFBE37E9562855C8664.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/CCCA2015B169432E93D1C0C835F57A14/EDCC2F3155D148AFBE37E9562855C8665.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/D615C94EF09F49A8AF19E8C42D59D216/EDCC2F3155D148AFBE37E9562855C8661.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/D615C94EF09F49A8AF19E8C42D59D216/EDCC2F3155D148AFBE37E9562855C8662.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/D615C94EF09F49A8AF19E8C42D59D216/EDCC2F3155D148AFBE37E9562855C8663.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/D615C94EF09F49A8AF19E8C42D59D216/EDCC2F3155D148AFBE37E9562855C8664.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/D615C94EF09F49A8AF19E8C42D59D216/EDCC2F3155D148AFBE37E9562855C8665.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/EDCF1935B646430C89A352036C3A2F64/EDCC2F3155D148AFBE37E9562855C8661.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/EDCF1935B646430C89A352036C3A2F64/EDCC2F3155D148AFBE37E9562855C8662.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/EDCF1935B646430C89A352036C3A2F64/EDCC2F3155D148AFBE37E9562855C8663.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/EDCF1935B646430C89A352036C3A2F64/EDCC2F3155D148AFBE37E9562855C8664.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/EDCF1935B646430C89A352036C3A2F64/EDCC2F3155D148AFBE37E9562855C8665.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/F5A4C9BE4F224F68AB3E241C517A7C7E/EDCC2F3155D148AFBE37E9562855C8661.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/F5A4C9BE4F224F68AB3E241C517A7C7E/EDCC2F3155D148AFBE37E9562855C8662.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/F5A4C9BE4F224F68AB3E241C517A7C7E/EDCC2F3155D148AFBE37E9562855C8663.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/F5A4C9BE4F224F68AB3E241C517A7C7E/EDCC2F3155D148AFBE37E9562855C8664.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/F5A4C9BE4F224F68AB3E241C517A7C7E/EDCC2F3155D148AFBE37E9562855C8665.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/F829B853D1B84C18B35972CC236488AB/EDCC2F3155D148AFBE37E9562855C8661.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/F829B853D1B84C18B35972CC236488AB/EDCC2F3155D148AFBE37E9562855C8662.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/F829B853D1B84C18B35972CC236488AB/EDCC2F3155D148AFBE37E9562855C8663.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/F829B853D1B84C18B35972CC236488AB/EDCC2F3155D148AFBE37E9562855C8664.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/F829B853D1B84C18B35972CC236488AB/EDCC2F3155D148AFBE37E9562855C8665.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/FB32CEF11EE140BB8DC0A5FDC3665753/EDCC2F3155D148AFBE37E9562855C8661.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/FB32CEF11EE140BB8DC0A5FDC3665753/EDCC2F3155D148AFBE37E9562855C8662.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/FB32CEF11EE140BB8DC0A5FDC3665753/EDCC2F3155D148AFBE37E9562855C8663.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/FB32CEF11EE140BB8DC0A5FDC3665753/EDCC2F3155D148AFBE37E9562855C8664.docx with error: File is not a zip file. Skipping...
Failed to load file /home/poclain/Documents/LLM_from_base/data/LLS/FB32CEF11EE140BB8DC0A5FDC3665753/EDCC2F3155D148AFBE37E9562855C8665.docx with error: File is not a zip file. Skipping...
Exception ignored in: <bound method IPythonKernel._clean_thread_parent_frames of <ipykernel.ipkernel.IPythonKernel object at 0x7ada63fe2e00>>
Traceback (most recent call last):
  File "/home/poclain/Documents/LLM_from_base/.venv/lib/python3.10/site-packages/ipykernel/ipkernel.py", line 770, in _clean_thread_parent_frames
    def _clean_thread_parent_frames(
KeyboardInterrupt:

dosubot[bot] commented 8 months ago

Hey @osafaimal, great to see you back! Hope you're doing well.

Based on the error message you're receiving, it seems like the issue might be related to the handling of the .docx file format during the file reading process. .docx files are essentially ZIP archives containing XML and other files that make up the document structure. The error message indicates that the process attempting to open the .docx file does not recognize it as a valid ZIP archive.
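To make the ZIP connection concrete: the wording of the error matches the message Python's standard `zipfile` module raises when a file has no ZIP end-of-archive record. The sketch below is only an illustration under that assumption. It uses throwaway temp files (the names `fake.docx` and `real.docx` are made up) rather than any of the documents from the report, and it does not call LlamaIndex or docx2txt at all.

```python
import os
import tempfile
import zipfile


def try_open_as_zip(path):
    """Return None if `path` opens as a ZIP archive, else the error message."""
    try:
        with zipfile.ZipFile(path):
            return None
    except zipfile.BadZipFile as exc:
        return str(exc)


with tempfile.TemporaryDirectory() as tmp:
    # A file with a .docx name whose bytes are not a ZIP archive.
    fake = os.path.join(tmp, "fake.docx")
    with open(fake, "wb") as fh:
        fh.write(b"this is plain text, not an OOXML package")

    # A genuine ZIP archive with a .docx name (not a complete Word document).
    real = os.path.join(tmp, "real.docx")
    with zipfile.ZipFile(real, "w") as zf:
        zf.writestr("word/document.xml", "<w:document/>")

    fake_error = try_open_as_zip(fake)
    real_error = try_open_as_zip(real)

print(fake_error)  # message raised for the non-ZIP file
print(real_error)  # None: the archive opened without error
```

So the extension alone does not matter here; what counts is whether the bytes on disk form a ZIP container.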

Here are a few steps you can take to troubleshoot this issue:

  1. Verify .docx File Integrity: Ensure that the .docx files you are trying to load are not corrupted and can be opened with standard document viewing software like Microsoft Word or LibreOffice Writer.

  2. Check DocxReader Implementation: The DocxReader class in LlamaIndex uses the docx2txt library to process .docx files. It's possible that there might be an issue with how DocxReader is handling the files. Here's the implementation of the DocxReader class for reference:

class DocxReader(BaseReader):
    """Docx parser."""

    def load_data(
        self, file: Path, extra_info: Optional[Dict] = None
    ) -> List[Document]:
        """Parse file."""
        try:
            import docx2txt
        except ImportError:
            raise ImportError(
                "docx2txt is required to read Microsoft Word files: "
                "`pip install docx2txt`"
            )

        text = docx2txt.process(file)
        metadata = {"file_name": file.name}
        if extra_info is not None:
            metadata.update(extra_info)

        return [Document(text=text, metadata=metadata or {})]

(Source: llama-index-legacy/llama_index/legacy/readers/file/docs_reader.py)

  3. Environment and Dependency Check: Ensure that all dependencies, especially those related to ZIP file handling and docx2txt, are correctly installed and compatible with each other and with your Python version.
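For the dependency check, one low-effort option is to print the installed versions from inside the same kernel that runs the loader, so they can be compared with the pip output in the report. This is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the package list is taken from the pip command in the report.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names as used in the pip command from the report.
report = installed_versions(["docx2txt", "llama-index"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

If a package shows as not installed here but pip reports it as satisfied, the notebook kernel is probably running a different environment than the one pip installed into.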

If you've checked all of these and the issue persists, it would be helpful if you could provide more information about the specific .docx files you're trying to load and any additional details about your environment. This will help in further diagnosing the issue.


#### About Dosu

This response is meant to be useful and save you time. It is **not meant to be a precise solution**, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.

osafaimal commented 8 months ago

So: 1. I can open them in Word. 2. I don't know how to check the DocxReader implementation, but I believe it should be okay; otherwise many other people would have the same problem. 3. I don't use zip, so I don't understand why it mentions zip.

logan-markewich commented 8 months ago

I think it's just failing to open the format. Something about the file format is not compatible with docx2txt.
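One way to narrow down "something about the file format" is to look at the first bytes of a failing file instead of trusting its extension. The sketch below is a rough heuristic, not part of LlamaIndex or docx2txt: the function name `sniff_format` and the sample file names are hypothetical, and the list of signatures is deliberately short. It relies on the fact that legacy `.doc` files and password-protected Office documents are stored as OLE2 compound files, which Word opens happily but which are not ZIP archives.

```python
import os
import tempfile
import zipfile

# Signature of an OLE2 compound file (legacy .doc, or encrypted OOXML).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def sniff_format(path):
    """Guess what a file really is from its content, ignoring the extension."""
    if zipfile.is_zipfile(path):
        return "zip (what a .docx should be)"
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(OLE2_MAGIC):
        return "OLE2 compound file (legacy .doc or password-protected document)"
    if head.startswith(b"{\\rtf"):
        return "RTF"
    if head.lstrip().startswith(b"<"):
        return "XML or HTML text"
    return "unknown: " + head.hex()


# Demonstration on synthetic files that only imitate the leading bytes.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "ole.docx": OLE2_MAGIC + b"\x00" * 64,
        "rtf.docx": b"{\\rtf1\\ansi hello}",
    }
    results = {}
    for name, payload in samples.items():
        path = os.path.join(tmp, name)
        with open(path, "wb") as fh:
            fh.write(payload)
        results[name] = sniff_format(path)

for name, kind in results.items():
    print(f"{name}: {kind}")
```

Running `sniff_format` over the paths listed in the log would show whether the skipped files share one non-ZIP format, which in turn suggests whether they need to be re-saved as `.docx` or read with a different reader.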

osafaimal commented 8 months ago

OK, I will look into it. If you have any ideas about how I can troubleshoot this, let me know.