pathwaycom / llm-app

Dynamic RAG for enterprise. Ready to run with Docker, ⚡ in sync with SharePoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.
https://pathway.com/developers/templates/

Table has no column with name doc #52

Boburmirzo opened this issue 10 months ago (status: open)

Boburmirzo commented 10 months ago

Currently, there is no way to send the data to the indexing process without creating a doc column from the input.

Need to fix the indexing error:

AttributeError: Table has no column with name doc.
Occurred here:
    Line: query_context = index.query(embedded_query, k=3).select(
    File: /home/bumurzokov/llm-app/src/prompt.py:14

When no doc column is defined, it always fails at the indexing stage:

# Compute embeddings for each document using the OpenAI Embeddings API
embedded_data = contextful(context=documents, data_to_embed=documents.doc)
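
For reference, the only way it currently works is to derive a doc column from the input first. A minimal sketch, assuming the raw table has a text field (the field name is an assumption about the input schema):

import pathway as pw

# Hypothetical workaround: derive the `doc` column that the pipeline
# expects from an existing text field before embedding/indexing.
documents = raw_documents.select(doc=pw.this.text)

# The contextful(...) call above then finds the `doc` column it expects.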
mdmalhou commented 10 months ago

The index expects a doc column containing the chunks, with the data carrying their corresponding embeddings. Open to any suggestions or alternatives.
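
For context, a minimal sketch of that shape; embed stands in for the OpenAI Embeddings call rather than the actual llm-app helper, and the data column name is an assumption:

import pathway as pw

def embed(text: str) -> list[float]:
    # Placeholder: in the app this would call the OpenAI Embeddings API.
    return [0.0] * 1536

# Each indexed row carries the chunk itself in `doc` plus its embedding.
embedded_data = documents.select(
    doc=pw.this.doc,                    # the text chunk
    data=pw.apply(embed, pw.this.doc),  # its embedding vector
)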

Boburmirzo commented 10 months ago

@mdmalhou Thank you for replying to this.

If always having a doc column is a technical requirement for indexing, my suggestion would be to abstract this step in the library, so that the user can specify which fields to index and the LLM App automatically creates the doc column under the hood from the chosen fields.

If the user does not specify any fields to index, the LLM App creates the doc column from all fields.
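
Something like this hypothetical helper (a sketch only; with_doc_column is not an existing llm-app API, and defaulting to all fields when none are given would follow the same pattern):

import pathway as pw

# Hypothetical sketch of the suggestion above: the caller names the fields
# to index and the helper builds the `doc` column under the hood by
# concatenating their string values.
def with_doc_column(table: pw.Table, fields: list[str]) -> pw.Table:
    return table.select(
        *pw.this,  # keep all original columns
        doc=pw.apply(
            lambda *values: " ".join(str(v) for v in values),
            *[table[name] for name in fields],
        ),
    )

# Usage (field names are assumptions about the input schema):
documents = with_doc_column(raw_documents, fields=["title", "body"])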

We already discussed the same with @janchorowski last week.

What do you think?