Second branch in the series for the DiskANN implementation. This PR introduces a few utility functions and classes that aim to simplify interop between SQLite and the DiskANN implementation
VectorIdxParams - binary container of vector index params. Instead of storing parameters in a rigid schema, we decided to use a binary format over which we have full control (so we can add parameters dynamically without complicated schema changes and control default values easily)
The format itself is very simple: every value in the container spans 9 bytes, where the first byte is interpreted as a tag and the next 8 bytes are interpreted as a u64 integer or an f64 float
VectorInRow - container which stores the input row for INSERT/DELETE operations
VectorOutRows - container which stores the output rows for the SEARCH operation
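To make the 9-byte layout of VectorIdxParams concrete, here is a minimal sketch of how such a container could be written and read. It is not the code from this PR: the names (`Params`, `params_put_u64`, `params_get_f64`, the `TAG_*` values), the fixed capacity, the little-endian byte order, and the "last written value wins" lookup are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PARAM_SLOT_SIZE 9   /* 1 tag byte + 8 value bytes, as described above */
#define PARAMS_MAX_SIZE 128 /* hypothetical fixed capacity */

/* Hypothetical tag values, chosen only for this sketch. */
enum { TAG_SEARCH_L = 1, TAG_INSERT_L = 2, TAG_ALPHA = 3 };

typedef struct {
  uint8_t data[PARAMS_MAX_SIZE];
  int size; /* number of bytes currently used */
} Params;

static void params_init(Params *p) { p->size = 0; }

/* Append one (tag, u64) slot. Returns 0 on success, -1 if the container is full. */
static int params_put_u64(Params *p, uint8_t tag, uint64_t value) {
  if (p->size + PARAM_SLOT_SIZE > PARAMS_MAX_SIZE) return -1;
  uint8_t *slot = p->data + p->size;
  slot[0] = tag;
  /* Assumption: the 8 value bytes are stored little-endian. */
  for (int i = 0; i < 8; i++) slot[1 + i] = (uint8_t)(value >> (8 * i));
  p->size += PARAM_SLOT_SIZE;
  return 0;
}

/* Append one (tag, f64) slot by storing the raw bits of the double. */
static int params_put_f64(Params *p, uint8_t tag, double value) {
  uint64_t bits;
  assert(sizeof(double) == sizeof(uint64_t)); /* sketch assumes a 64-bit double */
  memcpy(&bits, &value, sizeof bits);
  return params_put_u64(p, tag, bits);
}

/* Return the value of the last slot with this tag, or dflt if the tag is absent. */
static uint64_t params_get_u64(const Params *p, uint8_t tag, uint64_t dflt) {
  uint64_t result = dflt;
  for (int off = 0; off + PARAM_SLOT_SIZE <= p->size; off += PARAM_SLOT_SIZE) {
    if (p->data[off] != tag) continue;
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) v |= (uint64_t)p->data[off + 1 + i] << (8 * i);
    result = v;
  }
  return result;
}

/* Same lookup as above, reinterpreting the stored bits as a double. */
static double params_get_f64(const Params *p, uint8_t tag, double dflt) {
  uint64_t dflt_bits, bits;
  double out;
  memcpy(&dflt_bits, &dflt, sizeof dflt_bits);
  bits = params_get_u64(p, tag, dflt_bits);
  memcpy(&out, &bits, sizeof out);
  return out;
}
```

Because a missing tag simply falls back to the caller's default, a new parameter can be introduced without rewriting containers that were stored earlier, which is the flexibility the binary format is meant to give.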
For example, this is a valid index creation statement (supported in follow-up branches):
CREATE INDEX t_idx ON t (
libsql_vector_idx(emb, 'type=diskann', 'metric=cosine', 'alpha=1.2', 'search_l=70', 'insert_l=120')
);
We will also provide reasonable defaults, so the user can still simply write:
CREATE INDEX t_idx ON t ( libsql_vector_idx(emb) );
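Each optional argument in the statements above is a 'key=value' string such as 'alpha=1.2'. As a rough illustration of how one such argument could be split before its value is stored, here is a small sketch. The helper `split_param` and its return codes are hypothetical and are not taken from this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split "key=value" at the first '=' into two NUL-terminated buffers.
 * Returns 0 on success and -1 if the input has no '=', has an empty
 * key or value, or does not fit into the provided buffers.
 */
static int split_param(const char *param, char *key, size_t key_cap,
                       char *value, size_t value_cap) {
  assert(param != NULL && key != NULL && value != NULL);
  const char *eq = strchr(param, '=');
  if (eq == NULL) return -1;
  size_t key_len = (size_t)(eq - param);
  size_t value_len = strlen(eq + 1);
  if (key_len == 0 || value_len == 0) return -1;
  if (key_len >= key_cap || value_len >= value_cap) return -1;
  memcpy(key, param, key_len);
  key[key_len] = '\0';
  memcpy(value, eq + 1, value_len + 1); /* copies the trailing NUL as well */
  return 0;
}
```

A parameter the user does not pass never reaches this helper, so it would keep whatever default the index implementation chooses.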
Testing
Simple test cases for VectorIdxParams are added in the test_libsql_diskann.c file
The other classes are pretty hard to test in isolation, so I decided to leave them without unit tests (integration tests are added in the following branches)
Table of supported index parameters for now: