awslabs / project-lakechain

:zap: Cloud-native, AI-powered, document processing pipelines on AWS.
https://awslabs.github.io/project-lakechain/
Apache License 2.0

Feature request: Add connector for FAISS #2

Closed HQarroum closed 3 weeks ago

HQarroum commented 5 months ago

Use case

We want to add support for a FAISS index for a very low-cost, non-production setup. This would be a new middleware acting as a storage connector that takes embeddings from other middlewares in a pipeline and stores them in a FAISS index in a given storage.

Solution/User Experience

It would be possible to use an S3 bucket as a means of low-cost storage. The FAISS storage connector would be based on a Lambda compute with a reserved concurrency of 1, loading the index from the S3 bucket and writing it back.
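
As a rough illustration of the load/update/write-back cycle described above, here is a minimal Python sketch. It is not the Lakechain implementation: `BlobStore`, `InMemoryStore`, `S3Store`, `read_modify_write` and `add_embeddings` are hypothetical names, the flat L2 index type is an assumption, and `faiss`, `numpy` and `boto3` are assumed to be packaged with the Lambda. The cycle is not atomic, which is why a single writer (reserved concurrency of 1) matters.

```python
from typing import Callable, Optional, Protocol


class BlobStore(Protocol):
    """Hypothetical storage interface: returns None when the key is absent."""
    def get(self, key: str) -> Optional[bytes]: ...
    def put(self, key: str, data: bytes) -> None: ...


class InMemoryStore:
    """Stand-in for S3, useful for local experimentation."""
    def __init__(self) -> None:
        self.blobs: dict = {}

    def get(self, key: str) -> Optional[bytes]:
        return self.blobs.get(key)

    def put(self, key: str, data: bytes) -> None:
        self.blobs[key] = data


class S3Store:
    """S3-backed store (assumes boto3 is available in the runtime)."""
    def __init__(self, bucket: str) -> None:
        import boto3  # imported lazily so the rest of the sketch runs without it
        self.client = boto3.client("s3")
        self.bucket = bucket

    def get(self, key: str) -> Optional[bytes]:
        try:
            obj = self.client.get_object(Bucket=self.bucket, Key=key)
        except self.client.exceptions.NoSuchKey:
            return None  # first run: no index has been written yet
        return obj["Body"].read()

    def put(self, key: str, data: bytes) -> None:
        self.client.put_object(Bucket=self.bucket, Key=key, Body=data)


def read_modify_write(store: BlobStore, key: str,
                      mutate: Callable[[Optional[bytes]], bytes]) -> bytes:
    """Load the object, apply `mutate`, write the result back.

    Not atomic: two concurrent callers could overwrite each other's update,
    hence the single-writer assumption.
    """
    updated = mutate(store.get(key))
    store.put(key, updated)
    return updated


def add_embeddings(serialized: Optional[bytes], vectors, dim: int) -> bytes:
    """Deserialize a FAISS index (or create one), add vectors, re-serialize."""
    import faiss
    import numpy as np
    if serialized is None:
        index = faiss.IndexFlatL2(dim)  # assumption: exact L2 search is enough
    else:
        raw = np.frombuffer(serialized, dtype=np.uint8).copy()
        index = faiss.deserialize_index(raw)
    index.add(np.asarray(vectors, dtype=np.float32).reshape(-1, dim))
    return faiss.serialize_index(index).tobytes()
```

A Lambda handler could then call something like `read_modify_write(S3Store(bucket), "index.faiss", lambda blob: add_embeddings(blob, vectors, dim))`, where the bucket name, object key, and the way `vectors` is extracted from the incoming event are all placeholders.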

Alternative solutions

No response

HQarroum commented 5 months ago

Would it be possible to list the following items to be able to proceed with a prototype?