TogetherCrew / airflow-dags


BUG: sqlalchemy conflict between airflow and llama-index #151

Closed amindadgar closed 5 months ago

amindadgar commented 5 months ago

As we're updating the llama-index library version to use its newest features (pipelines, docstore, etc.), we're hitting the following error:

Error!!!: Too old airflow version.

This error is raised because Docker cannot run the gosu command to get the Airflow version. Looking at the logs, the cause appears to be a sqlalchemy version conflict: Apache Airflow requires sqlalchemy <=1.4.49, while the newest llama-index needs a version greater than 2.0. With the newer sqlalchemy installed, the Airflow service cannot come up and raises this error.
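To make the conflict concrete, here is a minimal sketch that checks a few candidate sqlalchemy versions against both bounds quoted above. The helper names (`parse_version`, `satisfies_both`) and the candidate list are hypothetical, and treating the llama-index bound as inclusive (2.0 or newer) is an assumption; the issue only says "greater than 2.0".

```python
# Bounds as quoted in this issue; the helpers below are illustrative only.
AIRFLOW_MAX = "1.4.49"      # Airflow: sqlalchemy <= 1.4.49
LLAMA_INDEX_MIN = "2.0"     # newest llama-index: sqlalchemy 2.0 or newer (assumed inclusive)


def parse_version(version: str) -> tuple:
    """Turn 'X.Y.Z' into a 3-tuple of ints, padding missing parts with 0."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def satisfies_both(version: str) -> bool:
    """True only if the version fits both the Airflow and llama-index bounds."""
    v = parse_version(version)
    return parse_version(LLAMA_INDEX_MIN) <= v <= parse_version(AIRFLOW_MAX)


# Hypothetical candidate versions, just to exercise the check.
candidates = ["1.4.0", "1.4.49", "2.0", "2.0.25"]
compatible = [c for c in candidates if satisfies_both(c)]
print(compatible)
```

Because the lower bound is above the upper bound, no version can pass both checks, which is why the two libraries cannot share one environment as things stand.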

To resolve this error, we need to migrate to another vector database that does not depend so heavily on the sqlalchemy version.

Researching this, we found that our best alternative is the Qdrant database, which supports async operations and metadata filtering (ref: Qdrant features).

To update our systems to use the Qdrant database, we have the following tasks:

Note *: IDs should be the same across multiple runs, so that the docstore can check for duplicated or updated nodes.
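One possible way to get IDs that stay the same across runs is to derive them from the content rather than generating them randomly. The sketch below uses name-based UUIDs (`uuid.uuid5`) from the Python standard library; the namespace string, the `stable_node_id` helper, and the choice of fields to hash are all assumptions for illustration, not the project's actual scheme.

```python
import uuid

# Hypothetical namespace for this project; any fixed UUID would work.
NODE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/airflow-dags/nodes")


def stable_node_id(source_id: str, text: str) -> str:
    """Return the same UUID string for the same (source_id, text) on every run.

    The unit separator keeps ('ab', 'c') and ('a', 'bc') from colliding.
    """
    return str(uuid.uuid5(NODE_NAMESPACE, f"{source_id}\x1f{text}"))


first_run = stable_node_id("message-123", "hello world")
second_run = stable_node_id("message-123", "hello world")
edited = stable_node_id("message-123", "hello world, edited")

print(first_run == second_run)  # identical input gives an identical ID
print(first_run == edited)      # changed text gives a different ID
```

If updated nodes should keep their ID so the docstore can detect the change by content hash instead, the ID could be derived from `source_id` alone; which variant fits depends on how the docstore is configured.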

amindadgar commented 5 months ago

For now, we'll keep the old code that uses pgvector and gradually migrate from pgvector to Qdrant.