OpenFn / apollo

GNU Lesser General Public License v2.1

Add embedding service for issue 111 #120

Open hanna-paasivirta opened 2 days ago

hanna-paasivirta commented 2 days ago

Short Description

Create an embedding service to create or access an embedding database and search it with an input text.

Fixes #111

Implementation Details

This module is primarily built for the Vocabulary Mapping project #109, but it can be used in any future projects that require embeddings. It leverages the LangChain library to allow for easy access to different types of embeddings and embedding storage services.

The main functions are:

Currently the module supports only Zilliz as the vector store and OpenAI Embeddings as the embedding model (both require credentials). Other options can be enabled easily.
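
For readers unfamiliar with the pattern, here is a rough, self-contained sketch of the create-then-search flow described above. It is not the code in this PR and does not use LangChain, Zilliz or OpenAI: `toy_embed`, `cosine` and `InMemoryVectorStore` are hypothetical names invented for illustration, and the bag-of-words "embedding" is only a stand-in so the example runs without credentials.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Hypothetical stand-in for a real embedding model: word counts."""
    return {word: float(n) for word, n in Counter(text.lower().split()).items()}


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for a hosted vector store."""

    def __init__(self, embed: Callable[[str], Vector]) -> None:
        # The embedding function is injected so another model could be swapped in.
        self.embed = embed
        self.rows: List[Tuple[str, Vector]] = []

    def add_texts(self, texts: List[str]) -> None:
        # Embed each text once and keep it alongside its vector.
        for text in texts:
            self.rows.append((text, self.embed(text)))

    def search(self, query: str, k: int = 1) -> List[str]:
        # Embed the query, rank stored texts by similarity, return the top k.
        query_vec = self.embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(query_vec, row[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = InMemoryVectorStore(toy_embed)
store.add_texts([
    "blood pressure measurement",
    "body weight in kilograms",
    "patient date of birth",
])
results = store.search("weight of the patient in kilograms", k=1)
print(results)
```

Passing the embedding function into the store is one way to keep the model and the storage backend independently replaceable, which is the property the description above relies on when it says new options can be enabled.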

AI Usage

Please disclose how you've used AI in this work (it's cool, we just want to know!):

You can read more details in our Responsible AI Policy

josephjclark commented 1 day ago

Thank you @hanna-paasivirta! This is exciting. I'll take a close look and see if I have any suggestions, but I'm keen for this to be a rough first cut which we'll iterate on over the coming weeks.

We'll need to break up #111 into some smaller issues to represent next steps (e.g. port the docs importer into this embedding service, investigate and set up pgvector, etc.).