asg017 / sqlite-vss

A SQLite extension for efficient vector search, based on Faiss!
MIT License
1.59k stars 58 forks source link

How do we load vss_search from python? #79

Open teowave opened 11 months ago

teowave commented 11 months ago

I have this code with the search query in python:

vss_search_query = f""" SELECT rowid, filenames.filepath, Page_nr, Paragraph_nr, Sliding_Paras FROM policy_docs JOIN filenames ON policy_docs.filename_id = filenames.id WHERE vss_search(Sliding_Embedding, '{search_term_vector_sql}') LIMIT 20; """

I get the error

in search cur.execute(vss_search_query) sqlite3.OperationalError: no such function: vss_search

How do we load this vss_search function from python? I looked at the python examples and I could not find anything, the examples stop at inserting vectors, they do not go to searching afterwards.

teowave commented 11 months ago

I am now trying to run the test query from the documentation:

select rowid, distance from vss_articles where vss_search( headline_embedding, (select headline_embedding from articles where rowid = 123) ) limit 10;

I have adapted it to my table and to python as below:

import sqlite3 import sqlite_vss

debug = True

Connect to your SQLite database

db = sqlite3.connect('ch_policies_v3.db')

Enable loading of SQLite extensions

db.enable_load_extension(True)

Load the sqlite_vss module

sqlite_vss.load(db)

Disable loading of SQLite extensions

db.enable_load_extension(False)

Check the version of the sqlite_vss module

version, = db.execute('select vss_version()').fetchone() if debug: print(version)

Define a test rowid

test_rowid = 123

Define the SQLite-VSS search query

vss_search_query = f""" SELECT rowid, distance FROM vss_policy_docs WHERE vss_search( Sliding_Embedding, (SELECT Sliding_Embedding FROM policy_docs WHERE rowid = {test_rowid}) ) LIMIT 10; """

Execute the query

rows = db.execute(vss_search_query)

Print the results

for row in rows: print(row)

Close the connection

db.close()

Here I get this error:


OperationalError Traceback (most recent call last) Cell In[1], line 38 27 vss_search_query = f""" 28 SELECT rowid, distance 29 FROM vss_policy_docs (...) 34 LIMIT 10; 35 """ 37 # Execute the query ---> 38 rows = db.execute(vss_search_query) 40 # Print the results 41 for row in rows:

OperationalError: vss_search() only support vss_search_params() as a 2nd parameter for SQLite versions below 3.41.0

What does this mean? How can I workaround it?

teowave commented 11 months ago

I saw issue #25 was dealing with the same issue, and the inclusion of the LIMIT parameter fixed it. However I have the LIMIT parameter already, I still get the error.

Shall I downgrade sqlite? or how can I specify only one parameter? I am not sure what these parameters are and how they relate to my code, where are they specified?

GibreelAbdullah commented 11 months ago

Same issue.

GibreelAbdullah commented 11 months ago

From docs

Try something like this instead.

select rowid, distance
from vss_xyz
where vss_search(
  headline_embedding,
  vss_search_params(
    (select headline_embedding from xyz where rowid = 123),
    20
  )
)
GibreelAbdullah commented 11 months ago

I started getting another error now.

query = f"""
SELECT rowid,
       distance
FROM   table_faiss
WHERE  vss_search(text, vss_search_params(('Search Text'), 20))  
"""

sqlite3.OperationalError: 1st argument is not a vector.

table_faiss definition below

CREATE VIRTUAL TABLE IF NOT EXISTS table_faiss USING vss0(text(384));
GibreelAbdullah commented 11 months ago

Moved a bit ahead, but still getting error.

from this blog i needed to use the st_encode function.

So the query now became.

with matches as (SELECT rowid, distance
from
hadith_faiss
WHERE  vss_search(text, Vss_search_params({searchQuery}, 2))
)
SELECT hadith.text, matches.distance
from matches
left join hadith on hadith.rowid = matches.rowid

where searchQuery is

searchQuery = st_encode('Search Text')

and the definition of st_encode I found here

from sentence_transformers import SentenceTransformer
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

def st_encode(x):
  result = model.encode([x])
  return json.dumps(result.tolist()[0])

Now I am getting error while executing the query.

Traceback (most recent call last):
  File "/home/alfi/Projects/VSSTest/hadithSearch.py", line 56, in <module>
    cursor = conn.execute(query)
sqlite3.OperationalError: no such column: -0.03610825166106224, 0.10422224551439285, -0.020350800827145576, 0.06060267984867096, -0.02063363417983055, -0.0003368522156961262, 0.08490633964538574, -0.0629068985581398, 0.11395560950040817, -0.03944713622331619, 0.011840143240988255, -0.019973117858171463, ...........(hundreds of other numbers)

Entire code is below.

import json
import sqlite3
import sqlite_vss

from sentence_transformers import SentenceTransformer
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

def st_encode(x):
  result = model.encode([x])
  return json.dumps(result.tolist()[0])

conn = sqlite3.connect("hadithVSS.db")
conn.enable_load_extension(True)
sqlite_vss.load(conn)

cursor = conn.cursor()
searchQuery = st_encode('Search Text')

query = f"""
SELECT rowid
from
hadith_faiss
WHERE  vss_search(text, Vss_search_params({searchQuery}, 2))
"""

cursor = conn.execute(query)
data = cursor.fetchall()
print(data)
conn.commit()
conn.close()

I believe vss_search is returning a bunch of text which sqlite is not able to interpret.

7flash commented 5 months ago

@GibreelAbdullah

WHERE vss_search(text, Vss_search_params({searchQuery}, 2))

try to apply json function

WHERE vss_search(text, Vss_search_params(json({searchQuery}), 2))

xcodebuild commented 2 months ago
WHERE vss_search(text, vss_search_params(json("{searchQuery}"), 2))

works