tursodatabase / libsql

libSQL is a fork of SQLite that is both Open Source, and Open Contributions.
https://turso.tech/libsql
MIT License
11.26k stars 286 forks

Vector aux functions #1557

Closed sivukhin closed 4 months ago

sivukhin commented 4 months ago

Context

Second branch in the series for the DiskANN implementation. This PR introduces a few utility functions and classes that aim to simplify interop between SQLite and the DiskANN implementation.

Table of supported index parameters for now:

static struct VectorParamName VECTOR_PARAM_NAMES[] = { 
  { "type",     VECTOR_INDEX_TYPE_PARAM_ID,    0 /* string */, "diskann", VECTOR_INDEX_TYPE_DISKANN },
  { "metric",   VECTOR_METRIC_TYPE_PARAM_ID,   0 /* string */, "cosine", VECTOR_METRIC_TYPE_COS },
  { "alpha",    VECTOR_PRUNING_ALPHA_PARAM_ID, 2 /* float */,   0, 0 },
  { "search_l", VECTOR_SEARCH_L_PARAM_ID,      1 /* integer */, 0, 0 },
  { "insert_l", VECTOR_INSERT_L_PARAM_ID,      1 /* integer */, 0, 0 },
};
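To illustrate how a table like this can drive parsing, here is a minimal sketch that resolves a single 'key=value' string against it. It is not the code from this PR: the struct field names, the numeric values of the enum constants, struct ParsedParam and parseVectorParam are all hypothetical, and the field order is only inferred from the initializer above. The sketch treats insert_l as an integer, as its comment states.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins: the real constants are defined in the libSQL sources. */
enum {
  VECTOR_INDEX_TYPE_PARAM_ID = 1,
  VECTOR_METRIC_TYPE_PARAM_ID,
  VECTOR_PRUNING_ALPHA_PARAM_ID,
  VECTOR_SEARCH_L_PARAM_ID,
  VECTOR_INSERT_L_PARAM_ID
};
enum { VECTOR_INDEX_TYPE_DISKANN = 1, VECTOR_METRIC_TYPE_COS = 1 };

/* Assumed layout, inferred from the initializer: name, parameter id,
   value kind (0 = string, 1 = integer, 2 = float), accepted string, encoded value. */
struct VectorParamName {
  const char *zName;
  int id;
  int type;
  const char *zValueStr;
  int value;
};

static const struct VectorParamName VECTOR_PARAM_NAMES[] = {
  { "type",     VECTOR_INDEX_TYPE_PARAM_ID,    0, "diskann", VECTOR_INDEX_TYPE_DISKANN },
  { "metric",   VECTOR_METRIC_TYPE_PARAM_ID,   0, "cosine",  VECTOR_METRIC_TYPE_COS },
  { "alpha",    VECTOR_PRUNING_ALPHA_PARAM_ID, 2, 0, 0 },
  { "search_l", VECTOR_SEARCH_L_PARAM_ID,      1, 0, 0 },
  { "insert_l", VECTOR_INSERT_L_PARAM_ID,      1, 0, 0 },
};

/* Hypothetical result of parsing one parameter. */
struct ParsedParam {
  int id;            /* which parameter was recognized */
  int type;          /* 0 = string, 1 = integer, 2 = float */
  long long iValue;  /* encoded string value or integer value */
  double fValue;     /* float value */
};

/* Parse one "key=value" string. Returns 0 on success, -1 if the key is
   unknown, the string value is not in the table, or the number is malformed. */
int parseVectorParam(const char *zParam, struct ParsedParam *pOut) {
  const char *zEq = strchr(zParam, '=');
  const char *zValue;
  size_t nKey, i;
  if (zEq == NULL) return -1;
  nKey = (size_t)(zEq - zParam);
  zValue = zEq + 1;
  for (i = 0; i < sizeof(VECTOR_PARAM_NAMES) / sizeof(VECTOR_PARAM_NAMES[0]); i++) {
    const struct VectorParamName *p = &VECTOR_PARAM_NAMES[i];
    char *zEnd = NULL;
    if (strlen(p->zName) != nKey || strncmp(p->zName, zParam, nKey) != 0) continue;
    pOut->iValue = 0;
    pOut->fValue = 0.0;
    if (p->type == 0) {
      /* String parameters: keep scanning, another row may list this value. */
      if (strcmp(p->zValueStr, zValue) != 0) continue;
      pOut->iValue = p->value;
    } else if (p->type == 1) {
      pOut->iValue = strtoll(zValue, &zEnd, 10);
      if (zEnd == zValue || *zEnd != '\0') return -1;
    } else {
      pOut->fValue = strtod(zValue, &zEnd);
      if (zEnd == zValue || *zEnd != '\0') return -1;
    }
    pOut->id = p->id;
    pOut->type = p->type;
    return 0;
  }
  return -1;
}
```

With this sketch, parseVectorParam("search_l=70", &p) succeeds and stores 70 in p.iValue, while "metric=bogus" is rejected because no row pairs that key with that string. Keeping one row per accepted string value means a new metric can be supported by adding a row rather than a new branch.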

For example, the following will be a valid index creation statement (once the follow-up branches land):

CREATE INDEX t_idx ON t ( 
    libsql_vector_idx(emb, 'type=diskann', 'metric=cosine', 'alpha=1.2', 'search_l=70', 'insert_l=120') 
);

We will also provide reasonable defaults, so users can still simply write:

CREATE INDEX t_idx ON t ( libsql_vector_idx(emb) );
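One way to get this behavior is to start from a defaults struct and let each 'key=value' argument override a single field. The sketch below is an illustration only, not the code from this PR: struct VectorIdxSettings and resolveVectorIdxSettings are hypothetical names, and the placeholder defaults are simply the numbers from the example statement above, not the defaults libSQL will actually ship.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical settings holder for the numeric index parameters. */
struct VectorIdxSettings {
  double alpha;
  long searchL;
  long insertL;
};

/* Placeholder values borrowed from the example statement;
   these are NOT the actual libSQL defaults. */
static const struct VectorIdxSettings PLACEHOLDER_DEFAULTS = { 1.2, 70, 120 };

/* Start from the defaults, then apply each "key=value" override in order.
   Returns 0 on success, -1 on an unknown key or malformed number
   (in which case *pOut may be partially updated). */
int resolveVectorIdxSettings(const char *const *azParams, int nParams,
                             struct VectorIdxSettings *pOut) {
  int i;
  *pOut = PLACEHOLDER_DEFAULTS;
  for (i = 0; i < nParams; i++) {
    const char *zParam = azParams[i];
    const char *zEq = strchr(zParam, '=');
    char *zEnd = NULL;
    size_t nKey;
    if (zEq == NULL) return -1;
    /* The only string values in the table; nothing to store in this sketch. */
    if (strcmp(zParam, "type=diskann") == 0 || strcmp(zParam, "metric=cosine") == 0) {
      continue;
    }
    nKey = (size_t)(zEq - zParam);
    if (nKey == 5 && strncmp(zParam, "alpha", nKey) == 0) {
      pOut->alpha = strtod(zEq + 1, &zEnd);
    } else if (nKey == 8 && strncmp(zParam, "search_l", nKey) == 0) {
      pOut->searchL = strtol(zEq + 1, &zEnd, 10);
    } else if (nKey == 8 && strncmp(zParam, "insert_l", nKey) == 0) {
      pOut->insertL = strtol(zEq + 1, &zEnd, 10);
    } else {
      return -1;
    }
    /* Reject empty values and trailing garbage such as "70x". */
    if (zEnd == zEq + 1 || *zEnd != '\0') return -1;
  }
  return 0;
}
```

Called with no parameters, the function returns the placeholder defaults unchanged, which mirrors the short CREATE INDEX form. Passing only "search_l=200" changes that one field and leaves the others at their defaults.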

Testing