0xPlaygrounds / rig

⚙️🦀 Build portable, modular & lightweight Fullstack Agents
https://rig.rs
MIT License

feat: Add utility methods to simplify InMemoryVectorStore creation #32

Closed. cvauclair closed this 1 month ago

cvauclair commented 1 month ago

Previously:

let model = openai_client.embedding_model("text-embedding-ada-002");

let mut vector_store = InMemoryVectorStore::default();

let embeddings = EmbeddingsBuilder::new(model.clone())
    .simple_document("doc0", "Definition of a *flurbo*: A flurbo is a green alien that lives on cold planets")
    .simple_document("doc1", "Definition of a *glarb-glarb*: A glarb-glarb is an ancient tool used by the ancestors of the inhabitants of planet Jiro to farm the land.")
    .simple_document("doc2", "Definition of a *linglingdong*: A term used by inhabitants of the far side of the moon to describe humans.")
    .build()
    .await?;

vector_store.add_documents(embeddings).await?;

let index = vector_store.index(model);

Can now be written as:

let model = openai_client.embedding_model("text-embedding-ada-002");

let embeddings = EmbeddingsBuilder::new(model.clone())
    .simple_document("doc0", "Definition of a *flurbo*: A flurbo is a green alien that lives on cold planets")
    .simple_document("doc1", "Definition of a *glarb-glarb*: A glarb-glarb is an ancient tool used by the ancestors of the inhabitants of planet Jiro to farm the land.")
    .simple_document("doc2", "Definition of a *linglingdong*: A term used by inhabitants of the far side of the moon to describe humans.")
    .build()
    .await?;

let index = InMemoryVectorIndex::from_embeddings(model, embeddings).await?;

Or even shorter when using InMemoryVectorIndex::from_documents, which builds the index directly from the documents!
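
For readers who want to see the shape of these helpers, here is a rough, std-only sketch of the pattern. It is not rig's actual code: `ToyModel`, `ToyVectorStore`, `ToyVectorIndex` and `Embedding` are hypothetical stand-ins, the toy embedding is made up, and everything is synchronous, whereas the real methods are async and return a `Result`. The point is only that `from_embeddings` folds "create store, add documents, build index" into one call, and `from_documents` additionally does the embedding step.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an embedded document (not a rig type).
struct Embedding {
    id: String,
    vector: Vec<f32>,
}

/// Toy embedding model: maps text to (character count, vowel count).
/// Purely illustrative; a real model would call an embedding API.
struct ToyModel;

impl ToyModel {
    fn embed(&self, text: &str) -> Vec<f32> {
        let vowels = text.chars().filter(|c| "aeiouAEIOU".contains(*c)).count();
        vec![text.chars().count() as f32, vowels as f32]
    }
}

/// Toy in-memory store keyed by document id.
#[derive(Default)]
struct ToyVectorStore {
    docs: HashMap<String, Vec<f32>>,
}

impl ToyVectorStore {
    fn add_documents(&mut self, embeddings: Vec<Embedding>) {
        for e in embeddings {
            self.docs.insert(e.id, e.vector);
        }
    }

    /// Consumes the store and pairs it with the model used for queries.
    fn index(self, model: ToyModel) -> ToyVectorIndex {
        ToyVectorIndex { model, store: self }
    }
}

struct ToyVectorIndex {
    model: ToyModel,
    store: ToyVectorStore,
}

impl ToyVectorIndex {
    /// Convenience constructor: create the store, add the embeddings, build the index.
    fn from_embeddings(model: ToyModel, embeddings: Vec<Embedding>) -> Self {
        let mut store = ToyVectorStore::default();
        store.add_documents(embeddings);
        store.index(model)
    }

    /// Shorter still: embed raw (id, text) pairs, then delegate to `from_embeddings`.
    fn from_documents(model: ToyModel, documents: Vec<(String, String)>) -> Self {
        let embeddings: Vec<Embedding> = documents
            .into_iter()
            .map(|(id, text)| Embedding { vector: model.embed(&text), id })
            .collect();
        Self::from_embeddings(model, embeddings)
    }

    fn len(&self) -> usize {
        self.store.docs.len()
    }

    /// Returns the id of the stored document closest to the query (squared Euclidean distance).
    fn nearest(&self, query: &str) -> Option<&str> {
        let q = self.model.embed(query);
        self.store
            .docs
            .iter()
            .map(|(id, v)| {
                let dist: f32 = v.iter().zip(&q).map(|(a, b)| (a - b) * (a - b)).sum();
                (id, dist)
            })
            .min_by(|a, b| a.1.total_cmp(&b.1))
            .map(|(id, _)| id.as_str())
    }
}

fn main() {
    let docs = vec![
        ("doc0".to_string(), "aaa".to_string()),
        ("doc1".to_string(), "bbbbbbbbbb".to_string()),
    ];
    // One call replaces the store/add/index sequence.
    let index = ToyVectorIndex::from_documents(ToyModel, docs);
    println!("{} documents indexed", index.len());
    println!("nearest to \"aaaa\": {:?}", index.nearest("aaaa"));
}
```

Having `from_documents` delegate to `from_embeddings` keeps the store-building logic in one place, so the two helpers cannot drift apart.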

cvauclair commented 1 month ago

Looks good. I also see an opportunity for a fluent builder design on openai_client.embedding_model(...) that could make this a bit cleaner, though it would understandably make the requirements on other provider clients a bit hazy.

I think we might end up with traits for Clients to simplify this integration, but that will be tackled in another issue!
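
To make the idea a bit more concrete, here is one possible shape for such a trait. This is only a sketch of the direction, not a proposal for the actual API: `EmbeddingClient`, `NamedModel`, `ToyOpenAiClient`, `ToyOpenAiModel` and `describe_model` are all hypothetical names invented for illustration.

```rust
/// Hypothetical trait for anything that behaves like an embedding model.
trait NamedModel {
    fn name(&self) -> &str;
}

/// Hypothetical trait a provider client could implement so that
/// generic helpers do not need to know which provider they talk to.
trait EmbeddingClient {
    type Model: NamedModel;

    fn embedding_model(&self, name: &str) -> Self::Model;
}

/// Toy provider client and model, standing in for a real provider.
struct ToyOpenAiClient;

struct ToyOpenAiModel {
    name: String,
}

impl NamedModel for ToyOpenAiModel {
    fn name(&self) -> &str {
        &self.name
    }
}

impl EmbeddingClient for ToyOpenAiClient {
    type Model = ToyOpenAiModel;

    fn embedding_model(&self, name: &str) -> ToyOpenAiModel {
        ToyOpenAiModel { name: name.to_string() }
    }
}

/// Written once against the trait, usable with any client that implements it.
fn describe_model<C: EmbeddingClient>(client: &C, name: &str) -> String {
    let model = client.embedding_model(name);
    format!("using embedding model `{}`", model.name())
}

fn main() {
    println!("{}", describe_model(&ToyOpenAiClient, "text-embedding-ada-002"));
}
```

An associated type keeps each provider free to return its own model type, which may help with the "hazy requirements" concern, at the cost of making the trait harder to use as a trait object.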