babenkoivan / elastic-migrations

Elasticsearch migrations for Laravel
MIT License

mappings for embeddings #55

Closed dermatzeimnetz closed 4 months ago

dermatzeimnetz commented 5 months ago

I want to create embeddings with Python SentenceTransformer, store them in MariaDB as text or strings, and then migrate them into Elasticsearch as a denseVector field. I want to keep the embeddings in MariaDB for later use and for backups.

For my local Elasticsearch instance (free version 8.13.2, running on localhost:9200) I use this migration:

public function up(): void
{
    Index::create('cn_se', function (Mapping $mapping, Settings $settings) {
        $mapping->text('title');
        $mapping->text('description');
        $mapping->keyword('lang');
        $mapping->keyword('year');

        $mapping->denseVector('description_embedding', [
            'dims' => 768,
            'similarity' => 'cosine'
        ]);

        $settings->index([
            'number_of_replicas' => 0
        ]);
    });
}

What column type do I need in MariaDB to store the embeddings? Will the stored values convert correctly into a denseVector when they are indexed? In the end I want to run a kNN search against this field.

babenkoivan commented 4 months ago

Hey @dermatzeimnetz, unfortunately, I can't answer this question. Closing this as it's not a bug or a feature request.
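
Note for readers with the same question (this is not part of the maintainer's reply): one possible approach, sketched under assumptions, is to store each embedding as a JSON array string in a text column and decode it back into a list of floats before indexing. Elasticsearch expects a dense_vector value to be an array of numbers, so a raw string would most likely not be accepted as-is. The helper names below are hypothetical, and the 768 is taken from the 'dims' value in the mapping above. A 768-float JSON string is roughly 15 to 20 KB, which should fit in a MariaDB TEXT column (limit 65,535 bytes).

```python
import json
import math

# Assumption: must match 'dims' in the Elasticsearch mapping from the question.
EXPECTED_DIMS = 768


def embedding_to_text(vector):
    """Serialize an embedding into a JSON array string for a text column.

    `vector` can be any iterable of numbers, for example the result of
    calling .tolist() on the array returned by SentenceTransformer.encode.
    """
    values = [float(v) for v in vector]
    if len(values) != EXPECTED_DIMS:
        raise ValueError(f"expected {EXPECTED_DIMS} dims, got {len(values)}")
    if not all(math.isfinite(v) for v in values):
        raise ValueError("embedding contains NaN or infinite values")
    return json.dumps(values)


def text_to_embedding(text):
    """Decode the stored JSON string back into a list of floats,
    ready to be sent as the 'description_embedding' field value."""
    values = json.loads(text)
    if not isinstance(values, list) or len(values) != EXPECTED_DIMS:
        raise ValueError("stored value is not a valid embedding")
    return [float(v) for v in values]
```

On the Laravel side, the equivalent decoding step would be a json_decode of the column value before the document is indexed. Validating the length on both write and read is a design choice that catches a model or mapping mismatch early, before Elasticsearch rejects the document.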