asg017 / sqlite-vss

A SQLite extension for efficient vector search, based on Faiss!
MIT License
1.59k stars 58 forks source link

can't add more than a single row #91

Closed radekosmulski closed 10 months ago

radekosmulski commented 10 months ago

I am running into a very peculiar problem -- I cannot add more than a single row to a table... it always ends in a OperationalError: SQL logic error.

For instance, the following code:

import sqlite3
import sqlite_vss

db = sqlite3.connect(':memory:')
db.enable_load_extension(True)
sqlite_vss.load(db)
db.enable_load_extension(False)

version, = db.execute('select vss_version()').fetchone()
print(version)

cur = db.cursor()

cur.execute("""
create virtual table vss_demo using vss0(
  a(3) factory="Flat,IDMap2" metrict_type=INNER_PRODUCT
);
""")

import json

embedding = [1.2, 0.3, 100]
cur.execute("insert into vss_demo(a) values (?)", [json.dumps(embedding)])

embedding = [1.2, 0.3, 120]
cur.execute("insert into vss_demo(a) values (?)", [json.dumps(embedding)])

will throw a OperationalError: SQL logic error.

Thank you very much for your help 🙏

radekosmulski commented 10 months ago

Interestingly, when I try reading from the table, I get the following error (and this is despite AFAICT I have followed the docs exactly)

image

asg017 commented 10 months ago

Can you try also inserting a rowid into the vss table? Something like

cur.execute("insert into vss_demo(rowid, a) values (?, ?)", [1, json.dumps(embedding)])

I think there's a bug when you insert a new row without a rowid. Will look into it

asg017 commented 10 months ago

(will also update the docs, which incorrectly omits the rowid)

radekosmulski commented 10 months ago

Thank you very much for your prompt reply, really appreciate it!!! 🙂

This does indeed fix the problem 🙏🥳

Unfortunately, I cannot overwrite a row, but that is not a big issue for my use case: image

asg017 commented 10 months ago

To overwrite a row, you'll currently have to DELETE FROM vss_demo WHERE rowid = 4, then insert a new row. I'll be adding UPDATE support soon.

radekosmulski commented 10 months ago

Thank you so much for your help, @asg017! 🙏 Really appreciate your prompt replies and guidance!

FrancoisThoraval commented 6 months ago

Hi, I have the same issue. What is the current best practice to insert a new row when it's not possible to know the next rowid to insert ?

I could work around something like querying the max rowid before each insertion, but doing this i'm unsure it will be safe with concurrency if another session is inserting while i'm getting the max rowid. Locking the table does not sound very practical either.