duckdb / duckdb_vss


HNSW index keys must be of type FLOAT[N] - but they are #24

Closed marquesafonso closed 3 months ago

marquesafonso commented 3 months ago

Creating an HNSW index using the syntax from the vss extension documentation

 CREATE INDEX hnsw_index ON documents USING HNSW (embedding) WITH (metric = 'cosine')

is throwing an error:

duckdb.duckdb.BinderException: Binder Error: HNSW index keys must be of type FLOAT[N]

I have checked my table and embedding is of column_type FLOAT[]. Plus, I have checked for NULL values and there are none.

For further info, the table has around 240k rows and I am setting hnsw_enable_experimental_persistence = True.

DuckDB version: 1.0.0

Could you please help me out on this? I wasn't able to find a similar issue before but in case it exists please point me to it.

Thanks in advance for your help.

Maxxen commented 3 months ago

Hello! Thanks for opening this issue!

FLOAT[] is not FLOAT[N]: there's a difference between normal (variable-length) lists and "fixed size" lists. I'm not sure what dimension your embeddings have, but you should be able to solve your issue by running:

ALTER TABLE documents ALTER embedding TYPE FLOAT[<the vector dimension>];
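
For readers landing here later, the fix above can be sketched end to end as follows. This is a minimal illustration, not the reporter's actual setup: the id column, the two sample rows, and the dimension of 3 are assumptions made only to keep the example small. Substitute the real dimension of your embeddings.

```sql
-- Hypothetical miniature of the table from the issue (the real one has ~240k rows).
CREATE TABLE documents (id INTEGER, embedding FLOAT[]);
INSERT INTO documents VALUES (1, [0.1, 0.2, 0.3]), (2, [0.4, 0.5, 0.6]);

-- Check that every list has the same length; the conversion below
-- requires a single dimension across all rows.
SELECT len(embedding) AS dim, count(*) AS n
FROM documents
GROUP BY len(embedding);

-- Convert the variable-length list column to a fixed-size array column.
-- 3 is the assumed dimension for this example.
ALTER TABLE documents ALTER embedding TYPE FLOAT[3];

-- The index statement from the issue should now bind.
INSTALL vss;
LOAD vss;
CREATE INDEX hnsw_index ON documents USING HNSW (embedding) WITH (metric = 'cosine');
```

If the length check returns more than one row, the embeddings do not share a dimension, and the ALTER TABLE cast would be expected to fail until those rows are fixed. For a database stored on disk, hnsw_enable_experimental_persistence also has to be set before creating the index, as mentioned in the original report.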

marquesafonso commented 3 months ago

Thank you very much for the quick reply. Indeed that makes sense and it fixed my issue.

Best!