astronomer / ask-astro

An end-to-end LLM reference implementation providing a Q&A interface for Airflow and Astronomer
https://ask.astronomer.io/
Apache License 2.0

Upsert logic needs to be redesigned #125

Open mpgreg opened 12 months ago

mpgreg commented 12 months ago

Describe the bug The current upsert code is not atomic and not batch-safe, and it has no roll-back.

If a document's chunks span a batch boundary, the upsert removes the document's chunks and reinserts them in one batch, then does the same again in the next batch. Chunks inserted in the first batch are deleted and not reinserted.
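A minimal sketch of the failure mode, using a plain dict in place of the vector store. The names `naive_batched_upsert`, `doc_id`, and `text` are hypothetical and are not taken from the provider code; the function only models the per-batch delete-then-insert behavior described above.

```python
def naive_batched_upsert(store, chunks, batch_size):
    """Hypothetical model of the bug: delete-then-insert is repeated per batch."""
    for start in range(0, len(chunks), batch_size):
        batch = chunks[start:start + batch_size]
        # Delete every stored chunk of any document that appears in this batch...
        for doc_id in {chunk["doc_id"] for chunk in batch}:
            store.pop(doc_id, None)
        # ...then insert only the chunks that happen to be in this batch.
        for chunk in batch:
            store.setdefault(chunk["doc_id"], []).append(chunk["text"])


# Document "A" has three chunks, so with batch_size=2 it spans two batches.
chunks = [
    {"doc_id": "A", "text": "a1"},
    {"doc_id": "A", "text": "a2"},
    {"doc_id": "A", "text": "a3"},
    {"doc_id": "B", "text": "b1"},
]
store = {}
naive_batched_upsert(store, chunks, batch_size=2)
print(store)
```

In this model, the second batch deletes the `a1` and `a2` chunks written by the first batch and reinserts only `a3`, so document "A" ends up incomplete.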

Version

To Reproduce Steps to reproduce the behavior:

Expected behavior Upsert should be atomic, batch-safe and allow roll-back if any of the batches fail.
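One possible shape for that behavior, sketched against an in-memory dict rather than the real provider. Everything here is an assumption for illustration: `batch_safe_upsert`, `write_batch`, and `make_failing_write` are hypothetical names, and the deep-copy snapshot stands in for whatever roll-back mechanism the actual store offers. The idea is to delete each affected document once, before any batch is written, so batch boundaries no longer matter, and to restore the prior state if any batch fails.

```python
import copy


def write_batch(store, batch):
    """Append (doc_id, text) rows to the in-memory store."""
    for doc_id, text in batch:
        store.setdefault(doc_id, []).append(text)


def batch_safe_upsert(store, chunks, batch_size, write=write_batch):
    """Hypothetical sketch: delete once up front, insert in batches, roll back on error."""
    by_doc = {}
    for chunk in chunks:
        by_doc.setdefault(chunk["doc_id"], []).append(chunk["text"])

    snapshot = copy.deepcopy(store)  # stand-in for a real roll-back mechanism
    try:
        # Remove old versions of every affected document exactly once.
        for doc_id in by_doc:
            store.pop(doc_id, None)
        rows = [(doc_id, text) for doc_id, texts in by_doc.items() for text in texts]
        for start in range(0, len(rows), batch_size):
            write(store, rows[start:start + batch_size])
    except Exception:
        # Any failed batch restores the state from before the upsert began.
        store.clear()
        store.update(snapshot)
        raise


def make_failing_write(fail_on_call):
    """Build a writer that raises on the given call number (for demonstration)."""
    calls = {"n": 0}

    def write(store, batch):
        calls["n"] += 1
        if calls["n"] == fail_on_call:
            raise RuntimeError("simulated batch failure")
        write_batch(store, batch)

    return write


chunks = [
    {"doc_id": "A", "text": "a1"},
    {"doc_id": "A", "text": "a2"},
    {"doc_id": "A", "text": "a3"},
    {"doc_id": "B", "text": "b1"},
]

# Success: document "A" spans two batches but keeps all three chunks.
ok_store = {"A": ["old"], "C": ["c1"]}
batch_safe_upsert(ok_store, chunks, batch_size=2)

# Failure in the second batch: the store is restored to its prior contents.
failed_store = {"A": ["old"]}
before = copy.deepcopy(failed_store)
try:
    batch_safe_upsert(failed_store, chunks, batch_size=2, write=make_failing_write(2))
except RuntimeError:
    pass
```

This is only atomic in the single-process, in-memory sense. A real implementation would need an equivalent of the snapshot that the backing store can actually provide, and would need to consider readers that see the store between the delete and the final batch.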

Screenshots

Additional context

mpgreg commented 12 months ago

This is really a bug in the pre-release Weaviate provider, but I'm documenting it here since it impacts Ask Astro.