asg017 / sqlite-vss

A SQLite extension for efficient vector search, based on Faiss!
MIT License
1.59k stars 58 forks

Using Node & OpenAI Embedding Question #85

Closed metaskills closed 11 months ago

metaskills commented 11 months ago

openai.createEmbedding returns an Array object for the embedding. How do I insert that into my database so I can use it later with vss0? Convert it to bytes? I also asked on their project, but maybe the question is more specific to this one.

metaskills commented 11 months ago

Thinking JSON.stringify is the answer based on this?

asg017 commented 11 months ago

Yes, JSON.stringify() will probably be the easiest way.

I have a WIP docs site with a section about working with vectors in Node, but it's still incomplete: https://alexgarcia.xyz/sqlite-vss/nodejs.html

metaskills commented 11 months ago

Thanks! I'll close this out. For me it was JSON.stringify(response.data.data[0].embedding)
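
A minimal sketch of that step, for anyone landing here later. It assumes the response shape quoted above (response.data.data[0].embedding); the helper name embeddingToJson and the fake response are made up for illustration and are not part of the OpenAI client or sqlite-vss.

```javascript
// Hypothetical helper: pull the embedding out of a createEmbedding-style
// response and serialize it as JSON text that can be stored or bound.
function embeddingToJson(response) {
  const embedding = response.data.data[0].embedding;
  if (!Array.isArray(embedding) || !embedding.every(Number.isFinite)) {
    throw new TypeError("expected an array of finite numbers");
  }
  return JSON.stringify(embedding);
}

// Stand-in for a real API response (real embeddings are much longer).
const fakeResponse = { data: { data: [{ embedding: [0.25, -0.5, 1] }] } };
const embeddingJson = embeddingToJson(fakeResponse);
console.log(embeddingJson); // [0.25,-0.5,1]
```

The resulting string is what you would bind as the parameter of your insert statement.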

Do you have any advice on how to query? I'm there now and working thru it...

asg017 commented 11 months ago

The SQL should look something like this:

select rowid, distance
from vss_articles
where vss_search(headline_embedding, ?)
limit 100;

Where ? is a parameter you bind that's either a JSON string of a vector, or the "raw bytes" representation of a vector (using Float32Array like in those docs).
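
To make the two parameter forms concrete, here is a rough sketch that builds both from the same plain array. The helper name embeddingToBytes is hypothetical, and whether your SQLite driver accepts a Buffer as a blob parameter is an assumption to check against its docs.

```javascript
// Hypothetical helper: pack a plain number[] into float32 "raw bytes".
function embeddingToBytes(embedding) {
  const floats = new Float32Array(embedding);
  // Wrap the same memory in a Node Buffer (no copy is made).
  return Buffer.from(floats.buffer, floats.byteOffset, floats.byteLength);
}

const queryVector = [0.25, -0.5, 1];

// Option 1: JSON string of the vector.
const asJson = JSON.stringify(queryVector);

// Option 2: raw bytes, 4 bytes per float32 element.
const asBytes = embeddingToBytes(queryVector);

console.log(asJson);         // [0.25,-0.5,1]
console.log(asBytes.length); // 12
```

Either value would be bound to the ? in the query above.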

That'll just get you the rowids and distances of the 100 nearest vectors to your query vector, so you'll probably want to JOIN that query against other tables in your DB. If you want the raw values of the original vectors, you can also return them; they should come back as vectors in "raw bytes" as of v0.1.1.
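
A sketch of what that JOIN might look like from Node. The articles table and its headline column are assumptions for illustration (only vss_articles and headline_embedding appear in this thread), and db is assumed to be a better-sqlite3-style connection with sqlite-vss already loaded.

```javascript
// Assumed schema: an `articles` table whose rowid matches vss_articles.rowid.
const NEAREST_ARTICLES_SQL = `
  with matches as (
    select rowid, distance
    from vss_articles
    where vss_search(headline_embedding, ?)
    limit 100
  )
  select articles.rowid, articles.headline, matches.distance
  from matches
  left join articles on articles.rowid = matches.rowid
  order by matches.distance
`;

// `db` is assumed to expose prepare(sql).all(param), as better-sqlite3 does.
// The query vector is bound as a JSON string.
function nearestArticles(db, queryEmbedding) {
  return db.prepare(NEAREST_ARTICLES_SQL).all(JSON.stringify(queryEmbedding));
}
```

Calling nearestArticles(db, embedding) would then return one row per match with the headline alongside its distance.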