cockroachdb / cockroach

quantize: add vector quantization package #134112

Open andy-kimball opened 11 hours ago

andy-kimball commented 11 hours ago

Quantization compresses a set of full-size vectors into a set of the same number of representative quantized vectors. Each quantized vector is a fraction of the size of the original full-size vector that it represents. While quantization loses information about the original vector, the quantized form can still be used to estimate the distance between the original vector and a user-provided query vector.
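To illustrate how a quantized form can still yield a distance estimate, here is a minimal sketch using simple 8-bit scalar quantization. This is not the algorithm planned for the package (the PR does not name one); the type `scalarQuantized` and the functions `quantizeVector` and `estimateSquaredL2` are hypothetical names chosen for this example only.

```go
package main

import "math"

// scalarQuantized is a hypothetical 8-bit encoding of a single vector.
// Each dimension is stored as a code in [0, 255] relative to min and step,
// so it takes 1 byte per dimension instead of the 4 bytes of a float32.
type scalarQuantized struct {
	min   float32
	step  float32
	codes []uint8
}

// quantizeVector maps each dimension of v onto one of 256 evenly spaced
// levels between the smallest and largest value in v.
func quantizeVector(v []float32) scalarQuantized {
	if len(v) == 0 {
		return scalarQuantized{}
	}
	lo, hi := v[0], v[0]
	for _, x := range v {
		if x < lo {
			lo = x
		}
		if x > hi {
			hi = x
		}
	}
	step := (hi - lo) / 255
	if step == 0 {
		step = 1 // All dimensions are equal, so every code is 0.
	}
	codes := make([]uint8, len(v))
	for i, x := range v {
		level := math.Round(float64((x - lo) / step))
		// Clamp to guard against floating-point overshoot.
		level = math.Max(0, math.Min(255, level))
		codes[i] = uint8(level)
	}
	return scalarQuantized{min: lo, step: step, codes: codes}
}

// estimateSquaredL2 estimates the squared Euclidean distance between the
// original vector and query, using only the quantized form of the original.
// It assumes query has the same number of dimensions as the original.
func estimateSquaredL2(q scalarQuantized, query []float32) float32 {
	var sum float32
	for i, c := range q.codes {
		approx := q.min + float32(c)*q.step
		d := approx - query[i]
		sum += d * d
	}
	return sum
}
```

The estimate differs from the exact distance by an amount bounded by the rounding error per dimension (at most half of `step`), which is the information lost in exchange for the smaller representation.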

This PR defines the Quantizer and QuantizedVectorSet interfaces, which abstract quantization so that we can swap different algorithms in and out. A future PR will include a "real" quantization algorithm; this PR implements only the "UnQuantizer", which trivially implements the Quantizer interface by storing the vectors as-is.
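As a rough sketch of the shape this abstraction could take: the interface names below come from the description, but every method name and signature (`Count`, `EstimateSquaredDistances`, `Quantize`), the `Vector` type, and the `unQuantizedSet` struct are assumptions made for illustration and may not match the actual code in this PR.

```go
package main

// Vector is a hypothetical alias for a full-size vector.
type Vector []float32

// QuantizedVectorSet is a simplified stand-in for the interface named in
// the PR description. The methods here are assumed, not taken from the PR.
type QuantizedVectorSet interface {
	// Count returns the number of quantized vectors in the set.
	Count() int
	// EstimateSquaredDistances returns, for each vector in the set, an
	// estimate of its squared Euclidean distance to the query vector.
	EstimateSquaredDistances(query Vector) []float32
}

// Quantizer is a simplified stand-in for the interface named in the PR
// description. Implementations can be swapped without changing callers.
type Quantizer interface {
	Quantize(vectors []Vector) QuantizedVectorSet
}

// unQuantizedSet stores the original vectors unchanged.
type unQuantizedSet struct {
	vectors []Vector
}

func (s *unQuantizedSet) Count() int { return len(s.vectors) }

// EstimateSquaredDistances is exact here because no information was lost.
// It assumes query has the same number of dimensions as the stored vectors.
func (s *unQuantizedSet) EstimateSquaredDistances(query Vector) []float32 {
	dists := make([]float32, len(s.vectors))
	for i, v := range s.vectors {
		var sum float32
		for j := range v {
			d := v[j] - query[j]
			sum += d * d
		}
		dists[i] = sum
	}
	return dists
}

// UnQuantizer trivially satisfies Quantizer by storing vectors as-is.
type UnQuantizer struct{}

func (UnQuantizer) Quantize(vectors []Vector) QuantizedVectorSet {
	// Copy the input so later changes by the caller do not affect the set.
	copied := make([]Vector, len(vectors))
	for i, v := range vectors {
		copied[i] = append(Vector(nil), v...)
	}
	return &unQuantizedSet{vectors: copied}
}
```

A caller that holds only a `Quantizer` value would not need to change when a lossy implementation is substituted later; with the trivial implementation, the "estimates" it gets back are simply the exact distances.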

Epic: CRDB-42943

Release note: None
