Open klavinski opened 1 year ago
Would like to see it too, but it will be hard. Any SQLite extension in WASM is complicated, as you saw in sqlite-lines. Additionally, since sqlite-vss relies on Faiss, I'd imagine there are even more hurdles we'd have to jump through, in addition to this being written in C++ (which probably isn't too big of an issue, but is foreign to me, as I've only successfully compiled plain C SQLite extensions to WASM before).
Also, there are a bunch of different SQLite WASM targets now, each slightly incompatible with the others. There's the official SQLite WASM build, sql.js, and probably a few more I don't know about.
I'm open to contributions, but if anyone reading this would like to give it a shot, please comment here with your approach before sending a PR. Additionally, if anyone wants to sponsor this work, I'd be more than happy to talk about it if you have a clear goal in mind!
The official WASM build makes it easier to implement an extension. In my case, I settled on adding this one. I copied the code of the extension into the file ext/wasm/extra_init.c, then followed the official steps:
./configure --enable-all
make sqlite3.c
cd ext/wasm
make
This produces the .wasm and .js files with the extension enabled.
An option would be to have a version that does not depend on Faiss (separate branch?)
The HNSW algo is relatively simple, and there are some libraries like hnswlib
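As a point of comparison for a Faiss-free build, here is a minimal sketch of exact (brute-force) nearest-neighbor search done inside SQLite with a scalar function. It is not HNSW and not how sqlite-vss works. The table name `items`, the function name `l2_distance`, and the float32 BLOB encoding are assumptions made for illustration. An HNSW index such as hnswlib would replace the full table scan with an approximate graph search.

```python
import math
import sqlite3
import struct


def pack(vec):
    """Encode a list of floats as a little-endian float32 BLOB."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack(blob):
    """Decode a float32 BLOB back into a tuple of floats."""
    return struct.unpack(f"<{len(blob) // 4}f", blob)


def l2_distance(a, b):
    """Euclidean distance between two float32 BLOBs (hypothetical SQL function)."""
    return math.dist(unpack(a), unpack(b))


def open_db():
    conn = sqlite3.connect(":memory:")
    # Expose the Python function to SQL as l2_distance(a, b).
    conn.create_function("l2_distance", 2, l2_distance)
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, embedding BLOB)")
    return conn


def nearest(conn, query, k):
    """Return the ids of the k rows closest to `query`, scanning every row."""
    rows = conn.execute(
        "SELECT id FROM items ORDER BY l2_distance(embedding, ?) LIMIT ?",
        (pack(query), k),
    )
    return [row[0] for row in rows]


conn = open_db()
conn.executemany(
    "INSERT INTO items (id, embedding) VALUES (?, ?)",
    [(1, pack([0.0, 0.0])), (2, pack([1.0, 1.0])), (3, pack([5.0, 5.0]))],
)
print(nearest(conn, [0.9, 1.2], 2))
```

Every query computes a distance for every row, so this only makes sense for small tables or as a correctness baseline to compare an approximate index against.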
I copied the code of the extension into the file ext/wasm/extra_init.c, then followed the official steps:
Thanks for pointing out ext/wasm/extra_init.c! Seems like building for SQLite's WASM build is much easier than sql.js, at least since the last time I tried.
It'd still be difficult for sqlite-vss, however, since Faiss is such a heavy and tricky-to-compile dependency. I haven't found any examples of Faiss being compiled to WASM. But @kroggen, that hnswlib library may be a solution: I originally looked at that lib when building sqlite-vss, but chose Faiss since it had way more indexing options and more flexible storage.
I don't think adding hnswlib to sqlite-vss would be easy to do, and I'd rather sqlite-vss stay with Faiss for now. However, I can totally see a new sqlite-hnsw project that uses hnswlib instead and has a similar API to sqlite-vss, but without a few bells and whistles. Plus, since it's header-only, it'll probably be very easy to compile to WASM.
I don't have the capacity now to start a new sqlite-hnsw project, but if anyone reading this wants to give it a try, I would be more than happy to help!
I also looked at hora when building sqlite-vss, which would've worked with sqlite-loadable-rs, but it seemed inactive and I couldn't find any nice APIs to serialize an index to a buffer. Also, sqlite-loadable-rs is great for simple table functions and virtual tables, but isn't great at shadow tables yet, so it would've been a lot of work to implement. On top of that, building a SQLite extension in Rust and compiling it to WASM is incredibly difficult (maybe impossible?).
Just found https://github.com/jiggy-ai/hnsqlite/, but they don’t have a wasm build
I did not update this issue, but for those still looking for a solution, I successfully used a combination of hnswlib, which stores the embeddings in IndexedDB, and SQLite for the rest.
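The split described above can be sketched roughly as follows. This is an illustration only, not the commenter's code: a plain dictionary with an exhaustive cosine-similarity scan stands in for both hnswlib and IndexedDB, and the `VectorStore` class, the `documents` table, and the sample rows are hypothetical. The point is the shape: the vector store returns ids, and SQLite resolves those ids to the rest of the data.

```python
import math
import sqlite3


class VectorStore:
    """Stand-in for an hnswlib index persisted outside SQLite (hypothetical)."""

    def __init__(self):
        self._vectors = {}  # id -> list of floats

    def add(self, item_id, vector):
        self._vectors[item_id] = vector

    def search(self, query, k):
        """Return the k ids with the highest cosine similarity (exhaustive scan)."""
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.hypot(*a) * math.hypot(*b)
            return dot / norm if norm else 0.0

        ranked = sorted(
            self._vectors,
            key=lambda item_id: cosine(query, self._vectors[item_id]),
            reverse=True,
        )
        return ranked[:k]


def fetch_documents(conn, ids):
    """Look up the non-vector columns in SQLite, preserving the ranking order."""
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    rows = conn.execute(
        f"SELECT id, title FROM documents WHERE id IN ({placeholders})", ids
    ).fetchall()
    by_id = {row[0]: row[1] for row in rows}
    return [(i, by_id[i]) for i in ids if i in by_id]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO documents VALUES (?, ?)",
    [(1, "cats"), (2, "dogs"), (3, "cars")],
)

store = VectorStore()
store.add(1, [1.0, 0.0])
store.add(2, [0.8, 0.6])
store.add(3, [0.0, 1.0])

hits = store.search([1.0, 0.1], k=2)
print(fetch_documents(conn, hits))
```

Keeping the vectors out of SQLite means no extension has to be compiled into the WASM build, at the cost of keeping the two stores in sync by id.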
I'd appreciate it if you could share the solution. Any public URL?
Thanks.
I found another one that looks promising and timely: SurrealDB, which just had its 1.0.0 release a few days ago.
It supports the following features according to the docs:
After losing some hair over the past 2 days :), I finally got surrealdb.wasm working with indxdb today, with a simple test of a vector function.
This extension would be great to enable vector search in the browser. Is there a guide to add it to the WASM build? I tried studying sqlite-lines, unsuccessfully.