This PR implements the full set of operations for the 1bit and f64 vector data types. This includes:
The ability to create an ANN index: see the vector-f64-index and vector-1bit-index tests
A set of vector functions with basic operation support: vector_extract / vector64 / vector1bit
Also, this PR changes the behaviour of the vectorTT functions. Previously they returned the input binary blob as is, which is surprising: for example, vector64(vector32('[1]')) returned the binary blob corresponding to vector32. This PR implements proper conversion between vector types, so vectorTT now always returns a vector of the specified type
Changes
Introduce FLOAT1BIT / F1BIT_BLOB types for 1bit bit-vectors
Allow ANN index creation for any valid column type (f32 / f64 / 1bit so far)
Implement the vector1bit conversion function and change the behaviour of all other conversion functions
Change the compress_neighbors setting parameter from 1bit to float1bit to be consistent with the type names
This is a potentially breaking change, but since we introduced support for compress_neighbors only recently and haven't published an sqld version with its support, I think we are safe to change this now
Implement a self-sufficient on-disk format for 1bit vectors. It looks like this:
[data[0] as u8] [data[1] as u8] ... [data[(dims + 7) / 8 - 1] as u8] [_ as u8; padding]? [leftover as u8] [3 as u8]
every data byte (except for the last) represents exactly 8 components of the vector
the last data byte represents [1..8] components of the vector
the optional padding byte ensures that the leftover byte is written at an odd blob position (0-based)
the leftover byte specifies the number of trailing bits in the blob (not counting the last 'type' byte) which must be omitted
(so, the vector dimensions are equal to 8 * (blob_size - 1) - leftover)
the last 'type' byte is mandatory for float1bit vectors
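To make the layout above concrete, here is a minimal Python sketch of packing and unpacking such a blob. It is an illustration only, not the code from this PR: the names encode_1bit / decode_1bit are hypothetical, and the bit order inside a byte (least significant bit first) and the zero value of the padding byte and unused bits are assumptions that the description above does not specify.

```python
F1BIT_TYPE_BYTE = 3  # trailing 'type' byte for float1bit blobs, as in the layout above


def encode_1bit(bits):
    """Pack a non-empty list of 0/1 components into the blob layout described above."""
    dims = len(bits)
    if dims == 0:
        raise ValueError("this sketch only handles vectors with at least one component")
    data = bytearray((dims + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            # Assumption: component i is stored in bit (i % 8) of byte (i // 8).
            data[i // 8] |= 1 << (i % 8)
    blob = bytearray(data)
    if len(blob) % 2 == 0:
        # Padding byte (assumed zero) so the leftover byte lands at an odd 0-based position.
        blob.append(0)
    # Bits before the type byte that are not components: unused bits of the last
    # data byte, the padding byte (if any) and the leftover byte itself.
    leftover = 8 * (len(blob) + 1) - dims
    blob.append(leftover)
    blob.append(F1BIT_TYPE_BYTE)
    return bytes(blob)


def decode_1bit(blob):
    """Recover the 0/1 components using dims = 8 * (blob_size - 1) - leftover."""
    if len(blob) < 3 or blob[-1] != F1BIT_TYPE_BYTE:
        raise ValueError("not a float1bit blob")
    leftover = blob[-2]
    dims = 8 * (len(blob) - 1) - leftover
    return [(blob[i // 8] >> (i % 8)) & 1 for i in range(dims)]
```

For a 3-component vector this produces one data byte, no padding, a leftover of 13 and the type byte, and 8 * (3 - 1) - 13 gives back 3 dimensions. Because the leftover byte always sits at an odd position and the type byte follows it, the total blob length in this sketch is always odd.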