Airstrip-AI / airstrip

Open-source enterprise generative AI platform
https://airstrip.pro
Apache License 2.0

Feature Request: Vector DB Integration #3

Open mjtechguy opened 3 months ago

mjtechguy commented 3 months ago

Thanks for building this. The interface and functionality are very well done!

Do you have plans to integrate vector DBs into each "app", such as the ability to connect to PGVector, Chroma, Pinecone, etc. for each app?

This would allow each app to pull in a large amount of context and give answers grounded in the source material, with references back to that material in the vector DBs.

Thanks!

MJ

hendychua commented 3 months ago

@mjtechguy thanks for your kind words. Yes, we have plans to integrate with data sources so that users do not have to put everything in the text prompt.

May I know

  1. What are the databases that you would like to integrate with the most?
  2. What is your ideal UX for the feature? To elaborate, there are 2 parts:

a) Document indexing into the vector databases - Will you be doing this outside of Airstrip, i.e. you supply a database that already has content, or do you want Airstrip to provide an interface for you to upload and index data? The latter will mean less flexibility and control over how chunking is done. Note that with either option, you would still need to have a database ready and supply its details (host, port, credentials, etc.).

b) Using a vector database in an app - If an app is connected to a vector DB, then should every user message be sent for matching with the DB?
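As an illustration only, the two parts above could be modeled roughly as below. This is a minimal sketch, not Airstrip's actual design: `VectorDBConnection`, `AppVectorSettings`, `should_query_vector_db`, and the `match_every_message` flag are all hypothetical names, and the fields simply mirror the "host, port, credentials" details mentioned in a).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VectorDBConnection:
    """Connection details supplied by the user (hypothetical shape)."""
    host: str
    port: int
    username: str
    password: str
    database: str


@dataclass
class AppVectorSettings:
    """Per-app settings (hypothetical): which DB to use and when to query it."""
    connection: VectorDBConnection
    # Assumption: a single flag answers question b), whether every
    # user message is sent to the vector DB for matching.
    match_every_message: bool = True


def should_query_vector_db(settings: Optional[AppVectorSettings], user_message: str) -> bool:
    """Return True if this user message should be matched against the vector DB."""
    if settings is None:
        # The app has no vector DB connected.
        return False
    if not user_message.strip():
        # Nothing meaningful to match.
        return False
    return settings.match_every_message
```

With a shape like this, an app without a connected database never queries one, and the per-app flag decides whether matching happens on every message.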

Cheers, Hendy

dmvieira commented 3 months ago

Why not connect to an external app that already focuses on data integration, like mindsdb?

mjtechguy commented 3 months ago

@hendychua awesome to hear that was already on the roadmap!

To answer your questions:

  1. I tend to run everything locally (even my LLMs, using Ollama and vLLM), so being able to connect to PGVector or Vespa locally would be ideal. Then I could just add bringing up the DB to the Docker Compose file.

2a. I think the ideal would be to populate the DB from inside the app, connecting to Ollama or OpenAI embedding endpoints to embed the data, possibly with chunk size, overlap, and similar tuners in the UI. Or, if a DB that already has data is connected, the app can use that.

2b. Yes, I think all chats from a specific user would then use the vector DB. Maybe a toggle to use or not use the DB would make sense?
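To make the "chunk size and overlap" tuners in 2a concrete, here is a minimal sketch assuming simple character-based chunking. A real indexer might split on tokens or sentences instead, and `chunk_text` with its default values is a hypothetical helper, not something Airstrip provides. Each chunk would then be sent to an embedding endpoint, which is not shown here.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into chunks of up to chunk_size characters.

    Consecutive chunks share `overlap` characters, so content that
    straddles a boundary appears in both neighbouring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            # This chunk already reaches the end of the text.
            break
    return chunks
```

For example, `chunk_text("abcdefghij", chunk_size=4, overlap=2)` returns `["abcd", "cdef", "efgh", "ghij"]`. A larger overlap preserves more context across boundaries at the cost of more chunks to embed and store.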

I am not sure if you have seen Dialoqbase. It has some of the right ideas, but this UI is much cleaner, and I think this feature could be built in a cleaner way here.

Thanks!

hendychua commented 2 months ago

> Why not connect to an external app that already focuses on data integration, like mindsdb?

@dmvieira, I will check out mindsdb. Thank you!

hendychua commented 2 months ago

@mjtechguy I understand the use case much better now. Thank you!