tursodatabase / libsql

libSQL is a fork of SQLite that is both Open Source, and Open Contributions.
https://turso.tech/libsql
MIT License
11.22k stars 286 forks source link

vector search: implement and integrate bfloat16 #1688

Closed sivukhin closed 2 months ago

sivukhin commented 2 months ago

Context

Final feature to finalize vector search for now - BFLOAT16 support

It's very similar to FLOAT16 with the difference that BFLOAT16 are easier to convert from/to FLOAT32 so operations are faster. But the speed comes in exchange to the precision as BFLOAT has only 8 bits for mantissa - which is pretty low (while FLOAT16 has 13 bits for mantissa)

I chose to use B16 suffix for new vector type in order to make it consistent with all other types (maybe BFLOAT16 were a bit more natural in isolation, but I think it's pretty nice to have all types follow same structure: FLOAT{suffix}(N))

Changes

Simple bruteforce benchmarks (will do more benchmarks later):

============ testing type 1bit ==================
inserts (bruteforce_1bit.sql):
  insert: 152.74 micros (avg.), 1000 (count)
  size  : 0.2148 MB
  reads : 1.00 (avg.), 1000 (total)
  writes: 1.00 (avg.), 1000 (total)
search (bruteforce_1bit.sql):
  select: 391.55 micros (avg.), 1000 (count)
  size  : 0.2148 MB
  reads : 2000.00 (avg.), 2000000 (total)
============ testing type 8 ==================
inserts (bruteforce_8.sql):
  insert: 153.15 micros (avg.), 1000 (count)
  size  : 1.9609 MB
  reads : 1.00 (avg.), 1000 (total)
  writes: 1.00 (avg.), 1000 (total)
search (bruteforce_8.sql):
  select: 2034.66 micros (avg.), 1000 (count)
  size  : 1.9609 MB
  reads : 2000.00 (avg.), 2000000 (total)
============ testing type 16 ==================
inserts (bruteforce_16.sql):
  insert: 162.73 micros (avg.), 1000 (count)
  size  : 3.9219 MB
  reads : 1.00 (avg.), 1000 (total)
  writes: 1.00 (avg.), 1000 (total)
search (bruteforce_16.sql):
  select: 7974.04 micros (avg.), 1000 (count)
  size  : 3.9219 MB
  reads : 2000.00 (avg.), 2000000 (total)
============ testing type b16 ==================
inserts (bruteforce_b16.sql):
  insert: 155.47 micros (avg.), 1000 (count)
  size  : 3.9219 MB
  reads : 1.00 (avg.), 1000 (total)
  writes: 1.00 (avg.), 1000 (total)
search (bruteforce_b16.sql):
  select: 6818.85 micros (avg.), 1000 (count)
  size  : 3.9219 MB
  reads : 2000.00 (avg.), 2000000 (total)
============ testing type 32 ==================
inserts (bruteforce_32.sql):
  insert: 169.72 micros (avg.), 1000 (count)
  size  : 7.8281 MB
  reads : 1.00 (avg.), 1000 (total)
  writes: 1.00 (avg.), 1000 (total)
search (bruteforce_32.sql):
  select: 4052.73 micros (avg.), 1000 (count)
  size  : 7.8281 MB
  reads : 2000.00 (avg.), 2000000 (total)
============ testing type 64 ==================
inserts (bruteforce_64.sql):
  insert: 182.28 micros (avg.), 1000 (count)
  size  : 12.2148 MB
  reads : 1.00 (avg.), 1000 (total)
  writes: 1.00 (avg.), 1000 (total)
search (bruteforce_64.sql):
  select: 4607.66 micros (avg.), 1000 (count)
  size  : 12.2148 MB
  reads : 2000.00 (avg.), 2000000 (total)