katanaml / sparrow

Data processing with ML, LLM and Vision LLM
https://katanaml.io
GNU General Public License v3.0

Error: `llama-index-readers-file` package not found despite being installed #64

Closed: albertgilopez closed this issue 3 months ago

albertgilopez commented 3 months ago

I'm encountering an error when trying to use the SimpleDirectoryReader from LlamaIndex in my project. The error message states that the llama-index-readers-file package is not found, even though it's installed.

Environment

Project Structure

    sparrow-ml/
    ├── llm/
    │   ├── rag/
    │   │   └── agents/
    │   │       └── llamaindex/
    │   │           ├── llamaindex.py
    │   │           ├── vllamaindex.py
    │   │           └── vprocessor.py
    │   ├── api.py
    │   ├── ingest.py
    │   └── requirements_llamaindex.txt

Steps to Reproduce

  1. Create the .env_llamaindex environment.

  2. Install LlamaIndex and its dependencies:

    pip install -r sparrow-ml/llm/requirements_llamaindex.txt

  3. Run the FastAPI application:

    python sparrow-ml/llm/api.py

  4. Attempt to perform document ingestion through the API endpoint.

Error Message

    Running ingest with llamaindex
    ⠋ Connecting to Weaviate...
    [Weaviate connection logs omitted for brevity]
    ⠼ Connecting to Weaviate...
    ⠋ Loading documents...
    ERROR:root:Unexpected error occurred: llama-index-readers-file package not found
    ERROR:api:ValueError in ingest: Unexpected error: llama-index-readers-file package not found
    INFO: 127.0.0.1:51045 - "POST /api/v1/sparrow-llm/ingest HTTP/1.1" 400 Bad Request

Additional Context

Questions

  1. Is there a known issue with llama-index-readers-file in version 0.10.23?
  2. Are there any additional steps or configurations needed to properly use SimpleDirectoryReader in this version, especially considering our project structure?
  3. Could this be related to how the package is being imported or used in our ingest.py file?
  4. Are there any specific changes needed in our llamaindex.py or vllamaindex.py files to accommodate the new structure in LlamaIndex v0.10?

Any assistance or guidance on resolving this issue would be greatly appreciated. Thank you for your time and support.
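
A quick way to narrow this down is to check, from the same interpreter that runs api.py, whether the distribution is registered in the environment and whether the module imports cleanly. The sketch below is only a diagnostic aid, not part of Sparrow or LlamaIndex; the helper name `diagnose` is hypothetical, and the module path `llama_index.readers.file` is assumed to be the import name that matches the `llama-index-readers-file` distribution.

```python
import importlib
import importlib.metadata as md
import sys


def diagnose(dist_name, module_name):
    """Return (installed_version_or_None, import_error_or_None).

    Hypothetical helper: separates "distribution not installed in this
    environment" from "installed, but importing it fails".
    """
    try:
        version = md.version(dist_name)
    except md.PackageNotFoundError:
        version = None  # pip does not know this distribution in this env

    try:
        importlib.import_module(module_name)
        error = None
    except Exception as exc:  # keep the real reason instead of hiding it
        error = f"{type(exc).__name__}: {exc}"

    return version, error


if __name__ == "__main__":
    # Shows which Python is actually running (venv vs. system interpreter).
    print("interpreter:", sys.executable)
    # Assumed distribution/module pair; adjust if your setup differs.
    print(diagnose("llama-index-readers-file", "llama_index.readers.file"))
```

If the version is reported but the import error is not `None`, the package is installed and the failure comes from something it imports, which would point toward a version mismatch rather than a missing install.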

abaranovskis-redsamurai commented 3 months ago

Hi, the code is tested on macOS and everything works fine with the currently listed versions. You are running it through the API; try running it directly through the CLI and check whether the same issue appears.

raininja commented 3 months ago

I have the same problem. I was able to hack around it by commenting out the PandasExcelReader references in llama_index/core/readers/file/base.py. Evidently an upstream issue in llama_index is causing this: https://github.com/run-llama/llama_index/discussions/11604
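
Before editing files inside site-packages, it may help to confirm which reader names are actually missing from the installed package. The sketch below assumes the reader in question is spelled `PandasExcelReader` and lives in `llama_index.readers.file`; the helper `missing_names` is hypothetical and only approximates what a `from ... import ...` statement would do.

```python
import importlib


def missing_names(module_name, names):
    """Return the subset of `names` not available as attributes of the module.

    Hypothetical helper. If the module itself cannot be imported, every
    requested name is reported as missing.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return list(names)
    # hasattr is an approximation: it does not import submodules on demand.
    return [name for name in names if not hasattr(module, name)]


if __name__ == "__main__":
    # Assumed name, taken from the workaround described above.
    wanted = ["PandasExcelReader"]
    print(missing_names("llama_index.readers.file", wanted))
```

A non-empty result would suggest that the installed `llama-index-readers-file` version does not export what the core package expects, in which case aligning the two package versions is likely a cleaner fix than commenting out code.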