zmedelis / bosquet

Tooling to build LLM applications: prompt templating and composition, agents, LLM memory, and other instruments for builders of AI applications.
https://zmedelis.github.io/bosquet/
Eclipse Public License 1.0
280 stars 19 forks

Support for vector stores? #6

Closed (behrica closed 1 year ago)

behrica commented 1 year ago

I was reflecting on the minimal tooling we need to work on larger texts with LLMs. In my view we need three things:

  1. an LLM as such, able to do two operations: "completion" and "create embedding"
  2. splitting of texts
  3. a vector database with at least two operations (storing vectors, finding the closest vectors to a given vector)

We have 1) and 2) in bosquet, at least in minimal form.

I am not sure whether 3) exists in the Clojure world. I believe some vector databases have Java bindings; at least Milvus does: https://github.com/milvus-io/milvus-sdk-java/blob/master/examples/main/java/io/milvus/GeneralExample.java

Ideally bosquet would support various vector databases, maybe via an abstraction. Any thoughts?
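To make the idea of an abstraction concrete, here is a minimal sketch in Python. It is not bosquet's API: the names `VectorStore`, `InMemoryStore`, `store`, `closest`, and `index_and_query` are all hypothetical, chosen only to mirror the two operations listed above. A hosted database such as Milvus or Pinecone would be one more backend behind the same two methods.

```python
import math
from typing import Dict, List, Protocol, Sequence, Tuple


class VectorStore(Protocol):
    """Hypothetical minimal interface: store a vector, find the closest ones."""

    def store(self, key: str, vector: Sequence[float]) -> None: ...

    def closest(self, vector: Sequence[float], k: int = 1) -> List[str]: ...


class InMemoryStore:
    """Toy backend for testing; not tied to any real database client."""

    def __init__(self) -> None:
        self._vectors: Dict[str, Tuple[float, ...]] = {}

    def store(self, key: str, vector: Sequence[float]) -> None:
        self._vectors[key] = tuple(vector)

    def closest(self, vector: Sequence[float], k: int = 1) -> List[str]:
        # Rank stored keys by Euclidean distance to the query vector.
        ranked = sorted(
            self._vectors,
            key=lambda key: math.dist(self._vectors[key], vector),
        )
        return ranked[:k]


def index_and_query(db: VectorStore, items, query, k=1):
    """Uses only the interface, so any backend with these methods fits."""
    for key, vector in items.items():
        db.store(key, vector)
    return db.closest(query, k)


result = index_and_query(
    InMemoryStore(),
    {"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [5.0, 5.0]},
    query=[0.9, 1.2],
    k=2,
)
print(result)
```

Because `index_and_query` only relies on the two methods, swapping the toy backend for a real client would not change the calling code, assuming the client can be wrapped to match.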

behrica commented 1 year ago

This one has a Clojure client: https://github.com/vdaas/vald-client-clj

zmedelis commented 1 year ago

I would probably go with Pinecone as the first simple implementation. It has a REST API: https://docs.pinecone.io/reference/list_indexes/

behrica commented 1 year ago

Ok, it has a free hosted edition, which is good enough for testing.

I will probably try to implement a small use case I have, as described above.

I am not sure whether this needs any addition or change to bosquet.

zmedelis commented 1 year ago

Thanks for suggesting this and let's see what changes it will require.

But there is a more fundamental question. Does this project try to be LangChain for Clojure (replicating the whole plethora of functionality: vector stores and whatnot), or does it find a specific and at least slightly different take on LLM use? Hence the pause in Bosquet development.

behrica commented 1 year ago

It is of course a good question, to which I have no answer. I think, in general, the functional approach of Clojure calls for "combining tools" rather than re-inventing or duplicating complete solutions.

So far I think my use case will not need changes in bosquet.

I personally think we should not replicate an existing Python library, but rather use it via libpython-clj.

behrica commented 1 year ago

I think we can close this for now. I tried some things combining text splitting, vector databases, and the LLMs, and it composed nicely, all just being data.
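As a rough illustration of what "all just being data" can look like, here is a small Python sketch, not the code actually tried in this thread. The functions `split_text`, `embed`, `cosine`, and `closest` are hypothetical, and `embed` is a stand-in bag-of-words counter where a real setup would call an LLM embedding endpoint. The "vector database" is just a list of plain records.

```python
import math
import re
from collections import Counter


def split_text(text, max_words=8):
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Stand-in for an LLM embedding call: word counts as a sparse vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def closest(records, query, k=1):
    """Return the k records whose vectors are most similar to the query."""
    q = embed(query)
    return sorted(records, key=lambda r: cosine(r["vector"], q), reverse=True)[:k]


text = (
    "Bosquet offers prompt templating and composition. "
    "Vector databases store embeddings and find nearest neighbours. "
    "Clojure favours small tools that combine over plain data."
)

# Each step takes and returns ordinary values: strings, dicts, lists.
records = [{"text": chunk, "vector": embed(chunk)} for chunk in split_text(text, max_words=6)]
best = closest(records, "which databases store embeddings?")[0]["text"]
print(best)
```

Each stage is an ordinary function over ordinary values, so any one of them (the splitter, the embedding call, the store) can be replaced without touching the others.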