Open jdb8 opened 11 months ago
Thank you for the detailed report @jdb8 !
Can you update the INSERTs to also include a unique rowid, like this:
const stmt = db.prepare('INSERT INTO vss_demo(rowid, a) VALUES (?, ?)');
const selectStmt = db.prepare('SELECT * from vss_demo;');
const embedding1 = [0.1, 0.2, 0.3];
const embedding2 = [0.2, 0.3, 0.4];
stmt.run(1, JSON.stringify(embedding1));
stmt.run(2, JSON.stringify(embedding2));
If that works, then I think that indicates a bug within sqlite-vss
. When you insert into a vss0
table without providing a rowid, it should come up with a new valid rowid but that part may not be working correctly
Thanks for the quick response!
Just tried that out, and it does seem to fix the problem! I'm no longer getting the SQL logic error
when I manually specify a rowid. Funnily enough, I had tried something like this before, but I think I incorrectly tried to update the create virtual table
call to pass in the rowid there rather than treating the virtual table as having an implicit generated rowid column.
So that's great. In terms of the output though, now that I'm re-reading https://alexgarcia.xyz/sqlite-vss/getting-started.html#introduction, is it actually expected that select *
would not return the JSON embeddings themselves? I think I was assuming that { a: null }
implied the rows hadn't been inserted, but now that I make a real query with a where-clause, I'm at least able to extract distance
:
const selectStmt = db.prepare('SELECT rowid, distance from vss_demo where vss_search(a, ?) limit 5;');
console.log(selectStmt.all(JSON.stringify([0.1, 0.2, 0.4])));
->
❯ node index.mjs
v0.1.1-alpha.20
[
{ rowid: 1, distance: 0.009999998845160007 },
{ rowid: 2, distance: 0.020000003278255463 }
]
Is there any way to return the original embeddings for each of those rows? Even though in reality I'd likely just use the rowid to select from a non-virtual table anyway, but curious if this is possible for debugging (I'm a bit unfamiliar with how virtual tables work under the hood, if you couldn't tell 😅)
I'm still working through the getting started page (and all of the alexgarcia.xyz/sqlite-vss site), so apologies if you find issues there!
I'm pushing 0.1.1-alpha.21
right now, which will allow you to reference vss0
column directly and get the raw embedding value, in the "raw bytes" format. You can then decode that directly in Node with Float32Array
use vector_to_json()
to convert it to JSON.
Though keep in mind, Faiss might encode/decode your vectors and cause it to slightly change, especially if you use a custom factory string. So something like 0.09431782043791359
might return as 0.09431782958475
, but you likely notice much of a difference
ok 0.1.1-alpha.21
is out, including an npm package, please try it out and lmk if it works for you!
@asg017 awesome, works great! Thanks for releasing this so quickly.
Might be a silly question, but when I try to use Float32Array
to convert the returned embedding, I get quite different results - e.g. for the input 3-dimensional vector of [0.1, 0.2, 0.3]
, I seem to be getting back something like:
Float32Array(12) [
205, 204, 204, 61, 205,
204, 76, 62, 154, 153,
153, 62
]
This is via
const results = selectStmt.all();
for (const { embedding } of results) {
const floatArray = new Float32Array(embedding);
console.log({ floatArray });
}
I also tried setting the last argument of Float32Array
to 3 (as the length?) but that gave me the same result. I'm definitely doing something wrong here, but not sure what :(
Also, I'm not sure what the syntax for using vector_to_json()
would be - I tried a naive select vector_to_json(*) from vss_demo
but that gave me SqliteError: wrong number of arguments to function vector_to_json()
.
EDIT: ah, right after posting this, I realised I was missing .buffer
on the embedding
value - works now!
It would be select vector_to_json(a) from vss_demo
, which would return the JSON array of the a
embedding, as a string so you'll have to JSON.parse
it
Hi there! I'm trying to get something simple running using
better-sqlite3
by following the instructions here: https://alexgarcia.xyz/sqlite-vss/nodejs.htmlMy code looks as follows:
The
select vss_version()
call works, printingv0.1.1-alpha.20
, which tells me that the extension is successfully installed and thatbetter-sqlite3
is working.However, when trying to insert two embeddings, on the second I get the following:
If I remove the second insertion call, and insert only one entry, I get no error but the
select
query seems to return nothing (which implies that the first insertion didn't work):I'm not sure if this is me doing something wrong - unfortunately the node docs only contain examples for insertions and not table creation or querying - but I've tried to also read https://alexgarcia.xyz/sqlite-vss/getting-started.html#introduction too and those examples give me a similar inscrutible "SQL logic error". I also can't spot any node testcases in this repo from which to copy from.
I've created a repro of this at https://github.com/jdb8/sqlite-vss-testing if that's helpful. Please let me know if you're unable to repro or have more questions!