pathwaycom / pathway

Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
https://pathway.com

Cannot import name 'default_vector_document_index' from 'pathway.stdlib.indexing' (/usr/local/lib/python3.10/dist-packages/pathway/stdlib/indexing/__init__.py) #49

Closed. diractions closed this issue 1 month ago

diractions commented 1 month ago

What is your question or problem? Please describe.
I'm trying to reproduce the example “Private RAG with Connected Data Sources using Mistral, Ollama, and Pathway” from April 23, 2024. I've run the example in Colab and also in my local environment. In both cases, the Data Loading section, where the necessary modules are imported, raises this error: Cannot import name 'default_vector_document_index' from 'pathway.stdlib.indexing' (/usr/local/lib/python3.10/dist-packages/pathway/stdlib/indexing/__init__.py)

Describe what you would like to happen
I would like help resolving this issue.

dxtrous commented 1 month ago

Thanks @diractions, [edit] we are investigating the cause.

berkecanrizai commented 1 month ago

Hey, the notebook was changed to match an update in the (as yet unreleased) indexing package. In the meantime, you can replace the failing import with:

from pathway.stdlib.indexing import VectorDocumentIndex

# later at the index creation
index = VectorDocumentIndex(
    documents.doc, documents, embedder, n_dimensions=embedding_dimension)

That index module will be working in the next release 🚀
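
For anyone unsure which of the two names their installed Pathway exposes, a small check like the one below may help. It is only a sketch: `import_first_available` is a hypothetical helper, not part of Pathway, and it merely reports which name exists in `pathway.stdlib.indexing`. The two entry points are not necessarily called the same way, so the notebook code still has to match whichever one is found.

```python
import importlib


def import_first_available(module_name, candidate_names):
    """Return (name, object) for the first candidate found in the module.

    Hypothetical helper for illustration; raises ImportError if the module
    cannot be imported or none of the candidate names exist in it.
    """
    module = importlib.import_module(module_name)
    for name in candidate_names:
        if hasattr(module, name):
            return name, getattr(module, name)
    raise ImportError(
        f"none of {candidate_names!r} found in {module_name!r}"
    )


if __name__ == "__main__":
    try:
        # Names taken from this thread: the one the notebook imports,
        # then the one suggested as a replacement.
        name, _ = import_first_available(
            "pathway.stdlib.indexing",
            ["default_vector_document_index", "VectorDocumentIndex"],
        )
        print(f"available entry point: {name}")
    except ImportError as exc:
        print(f"indexing entry point unavailable: {exc}")
```

Running this before the notebook's import cell shows whether the installed version already ships the name the notebook expects.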

dxtrous commented 1 month ago

@diractions Would you mind upgrading Pathway (pip install -U pathway, which gives version 0.11.0, released May 10) and confirming that the original notebook works as intended?

If so, we can close this issue.

I will also open an internal related issue to make sure the process of updating notebooks is better synchronized with releases.
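
To confirm that the upgrade took effect before rerunning the notebook, something along these lines could be used. It is a minimal sketch using only the standard library: the helper names are made up for illustration, the version parsing is deliberately simple (numeric dotted parts only), and the 0.11.0 threshold is the version mentioned above.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '0.11.0' into (0, 11, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdecimal():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two dotted version strings, padding the shorter with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("pathway")
    if found is None:
        print("pathway is not installed")
    elif meets_minimum(found, "0.11.0"):
        print(f"pathway {found} is new enough")
    else:
        print(f"pathway {found} is older than 0.11.0; try: pip install -U pathway")
```

In Colab, the runtime usually needs a restart after upgrading so that the new version is the one actually imported.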

diractions commented 1 month ago

Upgraded to 0.11.0. The previously reported import errors are no longer thrown, so you can close this issue. Thanks!