asg017 / sqlite-vec

A vector search SQLite extension that runs anywhere!
Apache License 2.0

Auxiliary Columns in `vec0` virtual tables #121

asg017 closed this issue 4 days ago

asg017 commented 1 month ago

Similar to auxiliary columns in the R*Tree extension, aux columns in vec0 virtual tables could be declared like so:

create virtual table vec_articles using vec0(
  headline_embeddings float[1024],
  +headline text
);

The + prefix denotes that the column is an "auxiliary column", allowing one to attach additional data to a row in a vec0 table that is not vector data, a primary key, or a partition key.
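
As a rough sketch of how this could look end to end, assuming the released syntax matches the proposal above. The `vec_demo` table, its 4-dimensional vectors, and the row values are made up for illustration:

```sql
-- Hypothetical table: one vector column plus one auxiliary text column.
create virtual table vec_demo using vec0(
  embedding float[4],
  +label text
);

-- The aux column is written like any other column (no "+" in the column list).
-- The vector is passed as JSON text here.
insert into vec_demo(rowid, embedding, label)
values (1, '[0.1, 0.2, 0.3, 0.4]', 'first row');

-- Point lookup by rowid, reading the aux value back.
select rowid, label
from vec_demo
where rowid = 1;
```

The `+` only appears in the table declaration; everywhere else the column is referred to by its plain name.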

One important note: auxiliary columns cannot be used as filters in KNN queries. In other words, they cannot appear in the WHERE clause of a KNN query like so:

create virtual table vec_articles using vec0(
  headline_embeddings float[1024],
  +headline text
);

select rowid, distance
from vec_articles
where headline_embeddings match ?
  and k = 10
  -- illegal use of auxiliary column in a KNN WHERE clause!!
  and headline like 'Breaking News: %';

This is because auxiliary columns are just meant as "side-car" columns to avoid an additional table and JOIN. Values in aux columns can be arbitrarily large: they are stored in a separate table, so performance wouldn't degrade.

For proper "metadata filtering", see #26.
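
For contrast, a minimal sketch of the intended side-car usage, assuming aux columns are allowed in the SELECT list of a KNN query. The headline comes back with each match, with no second table or JOIN:

```sql
-- Hypothetical query against the vec_articles table declared above.
-- The aux column is only read back, never used as a filter.
select rowid, headline, distance
from vec_articles
where headline_embeddings match ?  -- bind the query vector here
  and k = 10;
```

Any filtering on `headline` would then have to happen outside the KNN query itself, for example in application code over the returned rows.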

asg017 commented 4 days ago

Now supported: https://alexgarcia.xyz/sqlite-vec/features/vec0.html#aux