pathwaycom / llm-app

Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳 Docker-friendly. ⚡ Always in sync with Sharepoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.
https://pathway.com/developers/templates/
MIT License

Table has no column with name doc #52

Open · Boburmirzo opened this issue 1 year ago

Boburmirzo commented 1 year ago

Currently, there is no way to send data to the indexing process without first creating a `doc` column from the input.

Need to fix the indexing error:

```
AttributeError: Table has no column with name doc.
Occurred here:
    Line: query_context = index.query(embedded_query, k=3).select(
    File: /home/bumurzokov/llm-app/src/prompt.py:14
```

When no `doc` column is defined, it always fails at the indexing stage:

```python
# Compute embeddings for each document using the OpenAI Embeddings API
embedded_data = contextful(context=documents, data_to_embed=documents.doc)
```
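For reference, the manual workaround is to build the `doc` column yourself before embedding. Below is a minimal sketch assuming a recent Pathway API; the input path, the schema, and the `title`/`body` field names are hypothetical, and `contextful` refers to the embedding step from the snippet above:

```python
import pathway as pw

# Hypothetical input schema: the raw rows carry no `doc` column,
# only arbitrary fields such as `title` and `body`.
class InputSchema(pw.Schema):
    title: str
    body: str

documents = pw.io.jsonlines.read("./data/", schema=InputSchema, mode="streaming")

# Derive the `doc` column the index expects by concatenating the fields
# that should be searchable.
documents = documents.select(
    doc=pw.apply(lambda title, body: f"{title}\n{body}", pw.this.title, pw.this.body)
)

# The embedding step from the snippet above now finds the column it needs:
# embedded_data = contextful(context=documents, data_to_embed=documents.doc)
```
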
mdmalhou commented 1 year ago

The index expects a `doc` column containing the text chunks, together with the data holding their corresponding embeddings. Open to any suggestions or alternatives.
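A schematic illustration of that layout (plain Python dicts, only to show the expected column names; calling the embedding column `data` follows the examples in this repo and is an assumption here):

```python
# One row of the chunk table: the text lives in the `doc` column.
chunk_row = {"doc": "Pathway is a framework for processing live data streams."}

# The same row after embedding: `doc` plus a `data` column holding its
# embedding vector (truncated here for illustration).
embedded_row = {
    "doc": "Pathway is a framework for processing live data streams.",
    "data": [0.013, -0.072, 0.044],
}
```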

Boburmirzo commented 1 year ago

@mdmalhou Thank you for replying to this.

If always having a `doc` column is a technical requirement for indexing, my suggestion would be to abstract this step away in the library, so that the user can specify which fields to index and the LLM App automatically creates the `doc` column under the hood from the chosen fields.

If the user does not specify any fields to index, the LLM App creates the `doc` column from all fields.

We already discussed the same idea with @janchorowski last week.

What do you think?
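To make the proposal concrete, here is a rough sketch of what such a helper could look like; the name `with_doc_column`, its signature, and the schema-introspection call are hypothetical and not part of the current LLM App API:

```python
import pathway as pw


def with_doc_column(table: pw.Table, fields: list[str] | None = None) -> pw.Table:
    """Return a table with a `doc` column built from the chosen fields.

    If `fields` is None, every column of the input is used, matching the
    proposed default behaviour.
    """
    # Assumption: the table's schema exposes its column names; adjust to the
    # actual Pathway API if this call differs.
    names = fields if fields is not None else list(table.schema.column_names())
    return table.select(
        doc=pw.apply(
            lambda *values: "\n".join(str(v) for v in values),
            *[table[name] for name in names],
        )
    )


# Usage (field names hypothetical): index with_doc_column(documents, ["title", "body"]),
# or with_doc_column(documents) to make every field searchable.
```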