Vectors are represented as binary blobs, with vector components encoded in little-endian format according to the IEEE-754 standard.
LibSQL now supports f32/f64 vectors of any dimension up to 65536 (see MAX_VECTOR_SZ).
Zero-length vectors are supported, and they are always represented as a zero-size BLOB (not NULL).
LibSQL can extend the on-disk vector encoding with 1 trailing byte that encodes the vector type (for now: 1 = F32, 2 = F64).
[F3 76 86 C4] [BC 9D F1 3F] [01] - this is an f32 vector because the trailing type byte is 0x01 = 1 (value = [-1075.72,1.88763])
[F3 76 86 C4] [BC 9D F1 3F] - this is an f32 vector because there is no type byte and the default type is 1 (value = [-1075.72,1.88763])
[F3 76 86 C4 BC 9D F1 3F] [02] - this is an f64 vector because the trailing type byte is 0x02 = 2 (value = [1.10101])
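The layout above can be reproduced outside the database with a short Python sketch. It is not the LibSQL implementation: `encode_vector` and `decode_vector` are hypothetical helper names, and the rule used to detect the trailing byte (an odd blob length, since f32/f64 payloads always have an even length) is an assumption inferred from the examples rather than something stated in this PR.

```python
import struct

F32, F64 = 1, 2          # type codes listed in the description
MAX_VECTOR_SZ = 65536    # dimension limit quoted in the description

_FORMATS = {F32: ("f", 4), F64: ("d", 8)}  # struct code, bytes per component


def encode_vector(values, vtype=F32, with_type_byte=False):
    """Pack components little-endian and append the type byte when needed.

    Hypothetical helper. F64 always gets the trailing byte, because a bare
    blob is read as the default type F32.
    """
    if len(values) > MAX_VECTOR_SZ:
        raise ValueError("too many dimensions")
    if not values:
        return b""  # zero-length vectors are a zero-size BLOB
    code, _ = _FORMATS[vtype]
    payload = struct.pack("<%d%s" % (len(values), code), *values)
    if vtype == F64 or with_type_byte:
        payload += bytes([vtype])
    return payload


def decode_vector(blob):
    """Return (type code, list of components) for an encoded blob.

    Assumption: an odd length means the last byte is the type byte.
    """
    if len(blob) % 2 == 1:
        vtype, payload = blob[-1], blob[:-1]
    else:
        vtype, payload = F32, blob
    if vtype not in _FORMATS:
        raise ValueError("unknown vector type %d" % vtype)
    code, size = _FORMATS[vtype]
    if len(payload) % size != 0:
        raise ValueError("payload length does not match the component size")
    count = len(payload) // size
    return vtype, list(struct.unpack("<%d%s" % (count, code), payload))
```

Decoding `bytes.fromhex("F37686C4BC9DF13F") + b"\x02"` with this sketch yields a single f64 component, while the same eight bytes without the suffix yield two f32 components, matching the examples above.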
Changes
5 new builtin functions are added:
vector/vector32/vector64 - convert TEXT or BLOB to a binary vector BLOB. If TEXT is provided, vector/vector32 produce an F32 vector, while vector64 produces an F64 vector
vector_extract - converts a binary vector BLOB to a human-readable TEXT string (like [1,2,3])
vector_distance_cos - calculates the cosine distance (not similarity) between two vectors of the same dimension and the same type
SQLITE_OMIT_VECTOR preprocessor directive and autoconf parameter --disable-vector, which remove all traces of the vector functions from the final build
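To make the "distance, not similarity" distinction concrete, here is a minimal Python sketch under the common definition of cosine distance as 1 minus cosine similarity. The function name `cosine_distance` is hypothetical, and raising an error for a zero-norm vector is a choice made for this sketch; the PR description does not say how vector_distance_cos treats that case.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cos(angle between a and b) for equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption of this sketch: the distance is undefined for zero vectors.
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)
```

Under this definition, vectors pointing the same way have a distance close to 0, orthogonal vectors have a distance of 1, and opposite vectors have a distance of 2, whereas cosine similarity would give 1, 0, and -1 respectively.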
Context
This PR adds native support for vector functions to LibSQL.
This PR was extracted from the branch https://github.com/tursodatabase/libsql/pull/1402/ in order to make the review simpler and more granular.
libsql_vector.test - TCL based test suite