asg017 / sqlite-vss

A SQLite extension for efficient vector search, based on Faiss!
MIT License

Using compiled version on Windows #98

Open thewh1teagle opened 10 months ago

thewh1teagle commented 10 months ago

Hi, I built sqlite-vss on Windows 11 x64 using cygwin64, following the instructions in #building-sqlite-vss-yourself.

$ make loadable
cmake -B build; make -C build
-- Could NOT find MKL (missing: MKL_LIBRARIES)
-- Using the multi-header code from /cygdrive/c/Users/User/Documents/projects/vss/sqlite-vss/vendor/json/include/
-- Configuring done
-- Generating done
-- Build files have been written to: /cygdrive/c/Users/User/Documents/projects/vss/sqlite-vss/build
make[1]: Entering directory '/cygdrive/c/Users/User/Documents/projects/vss/sqlite-vss/build'
[  1%] Linking CXX shared library vector0.dll
...
[ 98%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss.dir/invlists/OnDiskInvertedLists.cpp.o
[100%] Linking CXX static library libfaiss.a
[100%] Built target faiss
make[1]: Leaving directory '/cygdrive/c/Users/User/Documents/projects/vss/sqlite-vss/build'
cp build/vector0.dll dist/debug/vector0.dll
cp build/vss0.dll dist/debug/vss0.dll

As shown, I got vector0.dll and vss0.dll.

Then I placed the DLL files in sqlite-vss/vendor/sqlite/.libs, the folder that contains the compiled sqlite3.exe, and tried to load the extensions into SQLite:

C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs>sqlite3.exe
SQLite version 3.40.1 2022-12-28 14:03:47
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> .load vector0.dll
sqlite> .load vss0.dll
Error: The specified module could not be found.

sqlite>

It looks like vector0.dll loads successfully, but vss0.dll fails to load.

ldd output

```text
C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs>ldd vector0.dll
        ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x7ffd2ad10000)
        KERNEL32.DLL => /cygdrive/c/Windows/System32/KERNEL32.DLL (0x7ffd2a160000)
        KERNELBASE.dll => /cygdrive/c/Windows/System32/KERNELBASE.dll (0x7ffd283c0000)
        msvcrt.dll => /cygdrive/c/Windows/System32/msvcrt.dll (0x7ffd2a0b0000)
        cygwin1.dll => /usr/bin/cygwin1.dll (0x7ffceece0000)
        cygstdc++-6.dll => /usr/bin/cygstdc++-6.dll (0x3fec80000)
        cyggcc_s-seh-1.dll => /usr/bin/cyggcc_s-seh-1.dll (0x3ff870000)
        advapi32.dll => /cygdrive/c/Windows/System32/advapi32.dll (0x7ffd2ab00000)
        sechost.dll => /cygdrive/c/Windows/System32/sechost.dll (0x7ffd293e0000)
        RPCRT4.dll => /cygdrive/c/Windows/System32/RPCRT4.dll (0x7ffd29920000)
        CRYPTBASE.DLL => /cygdrive/c/Windows/SYSTEM32/CRYPTBASE.DLL (0x7ffd27860000)
        bcryptPrimitives.dll => /cygdrive/c/Windows/System32/bcryptPrimitives.dll (0x7ffd288c0000)

C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs>ldd vss0.dll
        ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x7ffd2ad10000)
        KERNEL32.DLL => /cygdrive/c/Windows/System32/KERNEL32.DLL (0x7ffd2a160000)
        KERNELBASE.dll => /cygdrive/c/Windows/System32/KERNELBASE.dll (0x7ffd283c0000)
        msvcrt.dll => /cygdrive/c/Windows/System32/msvcrt.dll (0x7ffd2a0b0000)
        cygwin1.dll => /usr/bin/cygwin1.dll (0x7ffceece0000)
        cygstdc++-6.dll => /usr/bin/cygstdc++-6.dll (0x3fec80000)
        cyggcc_s-seh-1.dll => /usr/bin/cyggcc_s-seh-1.dll (0x3ff870000)
        cygblas-0.dll => /cygdrive/c/Users/User/Documents/projects/vss/sqlite-vss/vendor/sqlite/.libs/cygblas-0.dll (0x3fe3d0000)
        cyggomp-1.dll => /usr/bin/cyggomp-1.dll (0x3fe310000)
        cyglapack-0.dll => /cygdrive/c/Users/User/Documents/projects/vss/sqlite-vss/vendor/sqlite/.libs/cyglapack-0.dll (0x3f8160000)
        cyggfortran-5.dll => /usr/bin/cyggfortran-5.dll (0x3f8890000)
        cygquadmath-0.dll => /usr/bin/cygquadmath-0.dll (0x3fc150000)
        advapi32.dll => /cygdrive/c/Windows/System32/advapi32.dll (0x7ffd2ab00000)
        sechost.dll => /cygdrive/c/Windows/System32/sechost.dll (0x7ffd293e0000)
        RPCRT4.dll => /cygdrive/c/Windows/System32/RPCRT4.dll (0x7ffd29920000)
        CRYPTBASE.DLL => /cygdrive/c/Windows/SYSTEM32/CRYPTBASE.DLL (0x7ffd27860000)
        bcryptPrimitives.dll => /cygdrive/c/Windows/System32/bcryptPrimitives.dll (0x7ffd288c0000)
```
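A small sketch for reading listings like the two above: it parses ldd-style lines and keeps only the dependencies that resolve outside the Windows system directory, together with the folder each one was found in. The function name and the `/cygdrive/c/Windows/` prefix are assumptions made for illustration; the sample lines are copied from the vss0.dll listing.

```python
import re
from pathlib import PurePosixPath

# One ldd line looks like: "name => /some/path/name (0xADDRESS)"
LDD_LINE = re.compile(
    r"^\s*(?P<name>\S+)\s+=>\s+(?P<path>.+?)\s+\(0x[0-9a-fA-F]+\)\s*$"
)

# A few lines copied from the vss0.dll listing in this thread.
SAMPLE = """\
ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x7ffd2ad10000)
cygwin1.dll => /usr/bin/cygwin1.dll (0x7ffceece0000)
cygblas-0.dll => /cygdrive/c/Users/User/Documents/projects/vss/sqlite-vss/vendor/sqlite/.libs/cygblas-0.dll (0x3fe3d0000)
cyggomp-1.dll => /usr/bin/cyggomp-1.dll (0x3fe310000)
"""


def non_system_dependencies(ldd_output, system_prefix="/cygdrive/c/Windows/"):
    """Map each non-system DLL name to the directory ldd resolved it from.

    `system_prefix` is an assumption about where Windows system DLLs live
    when seen from inside Cygwin.
    """
    deps = {}
    for line in ldd_output.splitlines():
        match = LDD_LINE.match(line)
        if match is None:
            continue  # not a dependency line (prompt, blank line, ...)
        path = match.group("path")
        if path.lower().startswith(system_prefix.lower()):
            continue  # shipped with Windows, always available
        deps[match.group("name")] = str(PurePosixPath(path).parent)
    return deps


if __name__ == "__main__":
    for name, folder in non_system_dependencies(SAMPLE).items():
        print(f"{name:20s} {folder}")
```

For vss0.dll this leaves the Cygwin runtime and the BLAS/LAPACK libraries, some resolved from /usr/bin and some from the .libs folder. A host process started outside Cygwin would need to be able to find every one of them.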
thewh1teagle commented 10 months ago

Update

I managed to load the extension into the sqlite3 CLI when it was compiled using make loadable-release. I also had to add cyglapack-0.dll to the same folder. I successfully loaded the extensions in Python as well, but unfortunately, when executing

db.execute("""
    CREATE VIRTUAL TABLE IF NOT EXISTS vss_post USING vss0(embeddings(3));
""")

it crashes the program without any error.

asg017 commented 10 months ago

Thanks for the detailed report and updates! You're the first person to report being able to compile sqlite-vss, so I'm very interested in getting this to work.

A few questions:

  1. After loading vector0.dll, does select vector_version() return a string?
  2. After loading vss0.dll, does select vss_version() return a string?
  3. Could you run select vss_distance_l1('[0.1, 0.1]', '[0.2, 0.2]') and see if that works?
thewh1teagle commented 10 months ago
  1. yes
  2. yes
  3. works

See it in action:

C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs>sqlite3.exe
SQLite version 3.40.1 2022-12-28 14:03:47
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> .open random.db
sqlite> .load vector0.dll
sqlite> select vector_version();
v0.1.2
sqlite> .load vss0.dll
sqlite> select vector_version();
v0.1.2
sqlite> select vss_distance_l1('[0.1, 0.1]', '[0.2, 0.2]');
0.200000002980232
sqlite>

When running the same from Python, the program crashes without any error at

select vss_distance_l1('[0.1, 0.1]', '[0.2, 0.2]');
asg017 commented 10 months ago

Does it throw an OperationalError, or just completely crash? Is there a segmentation fault, or any other messaging that gets logged out?

thewh1teagle commented 10 months ago

No error, it just exits from Python with status 1.

Code here

```py
import sqlite3

conn = sqlite3.connect('random.db')
conn.enable_load_extension(True)
conn.load_extension('vector0.dll')
conn.load_extension('vss0.dll')
cur = conn.cursor()
cur.execute('select vector_version();')
version = cur.fetchone()
print(version)  # Working
cur.execute("select vss_distance_l1('[0.1, 0.1]', '[0.2, 0.2]');")  # < ---- Crash
res = cur.fetchone()
print(res)
cur.close()
conn.close()
```
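Because the interpreter dies before Python can raise anything, one way to narrow this down is to run each statement in a child process and look at its exit status. The sketch below is only an illustration: `run_isolated`, `describe_exit` and the small status table are hypothetical helpers, not part of sqlite-vss, and the DLL names are the ones used in the script above.

```python
import subprocess
import sys

# Script executed by the child interpreter: argv[1] is the SQL statement,
# any further arguments are extension files to load first.
CHILD_SCRIPT = """
import sqlite3, sys
conn = sqlite3.connect(":memory:")
extensions = sys.argv[2:]
if extensions:
    conn.enable_load_extension(True)
    for ext in extensions:
        conn.load_extension(ext)
print(conn.execute(sys.argv[1]).fetchone())
"""

# A few well-known Windows NTSTATUS values (not an exhaustive list).
KNOWN_STATUS = {
    0xC0000005: "access violation",
    0xC0000135: "DLL not found",
    0xC00000FD: "stack overflow",
}


def describe_exit(returncode):
    """Turn a child's return code into a short human-readable string."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:  # POSIX only: the child was killed by a signal
        return f"killed by signal {-returncode}"
    code = returncode & 0xFFFFFFFF
    label = KNOWN_STATUS.get(code)
    suffix = f": {label}" if label else ""
    return f"exited with status {code} (0x{code:08X}){suffix}"


def run_isolated(statement, extensions=()):
    """Run one SQL statement in a fresh interpreter; a crash stays in the child."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT, statement, *extensions],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout.strip(), proc.stderr.strip()


if __name__ == "__main__":
    checks = [
        "select vector_version();",
        "select vss_version();",
        "select vss_distance_l1('[0.1, 0.1]', '[0.2, 0.2]');",
    ]
    for sql in checks:
        code, out, err = run_isolated(sql, ["vector0.dll", "vss0.dll"])
        print(f"{sql!r}: {describe_exit(code)} {out or err}")
```

This does not fix anything; it only shows which statement takes the process down and with what status, while the parent script keeps running.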
asg017 commented 10 months ago

The fact that it only happens when executing functions that use Faiss's vector computations (i.e. it fails on vss_distance_l1 and not vss_version) makes me think that it's a dynamic library error. I'm guessing that Windows holds off on resolving + executing the cygblas-0.dll / cyggomp-1.dll / cyglapack-0.dll libraries until they're actually needed. It also might explain the spectacular no-error-message failures - if it's a deep underlying DLL error, then Python may not have a chance to catch it.

That's my guess at least - my knowledge of Windows is very limited. I'd say double check that those DLLs exist and work correctly (probably with ldd?). I'd be curious to see if there's a sample Faiss C++ project you could compile + execute on your Windows machine, to see if it's a Faiss compilation error or a sqlite-vss specific error.

thewh1teagle commented 10 months ago

I used the same faiss submodule in your repo. Just installed the necessary tools and libraries in cygwin setup and followed your instructions to build the library

thewh1teagle commented 10 months ago

I added strace output when running the python script

Strace output

```text
User@DESKTOP-HPEE9O3 /cygdrive/c/Users/User/Documents/projects/vss/sqlite-vss/vendor/sqlite/.libs
$ strace /cygdrive/c/Users/User/AppData/Local/Programs/Python/Python311/python.exe main.py
--- Process 17912 created
--- Process 17912 loaded C:\Windows\System32\ntdll.dll at 00007ff9bb4f0000
--- Process 17912 loaded C:\Windows\System32\kernel32.dll at 00007ff9ba020000
--- Process 17912 loaded C:\Windows\System32\KernelBase.dll at 00007ff9b8cf0000
--- Process 17912 thread 19168 created
--- Process 17912 thread 9276 created
--- Process 17912 loaded C:\Windows\System32\ucrtbase.dll at 00007ff9b8b10000
--- Process 17912 thread 3052 created
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\vcruntime140.dll at 00007ff9206e0000
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\python311.dll at 00007ff915b90000
--- Process 17912 loaded C:\Windows\System32\version.dll at 00007ff9af240000
--- Process 17912 loaded C:\Windows\System32\ws2_32.dll at 00007ff9b9fa0000
--- Process 17912 loaded C:\Windows\System32\msvcrt.dll at 00007ff9b92b0000
--- Process 17912 loaded C:\Windows\System32\rpcrt4.dll at 00007ff9ba740000
--- Process 17912 loaded C:\Windows\System32\advapi32.dll at 00007ff9ba9f0000
--- Process 17912 loaded C:\Windows\System32\sechost.dll at 00007ff9b94f0000
--- Process 17912 loaded C:\Windows\System32\bcrypt.dll at 00007ff9b8210000
--- Process 17912 loaded C:\Windows\System32\bcryptprimitives.dll at 00007ff9b8a90000
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\python3.dll at 000002db01780000
--- Process 17912 unloaded DLL at 000002db01780000
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\python3.dll at 000002db01780000
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\DLLs\_sqlite3.pyd at 00007ff9b31c0000
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\DLLs\sqlite3.dll at 00007ff923b90000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\vector0.dll at 000000055d4d0000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cyggcc_s-seh-1.dll at 00000003ff870000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cygwin1.dll at 00007ff921030000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cygstdc++-6.dll at 00000003fec80000
0 0 [main] python (17912) **********************************************
206 206 [main] python (17912) Program name: c:\Users\User\AppData\Local\Programs\Python\Python311\python.exe (windows pid 17912)
165 371 [main] python (17912) OS version: Windows NT-10.0
132 503 [main] python (17912) **********************************************
--- Process 17912 loaded C:\Windows\System32\cryptbase.dll at 00007ff9b8040000
3365 3868 [main] python (17912) sigprocmask: 0 = sigprocmask (0, 0x0, 0x7FF9213093B0)
533 4401 [main] python (17912) open_shared: name shared.5, shared 0x1A4000000 (wanted 0x1A4000000), h 0x1A0, m 0, created 1
216 4617 [main] python (17912) shared_info::initialize: Installation root: <┬ג┬ה> key:
167 4784 [main] python (17912) user_heap_info::init: heap base 0xA00000000, heap top 0xA00000000, heap size 0x20000000 (536870912)
169 4953 [main] python (17912) open_shared: name S-1-5-21-567552140-2017299312-2275771347-1001.1, shared 0x1A4010000 (wanted 0x1A4010000), h 0x1A4, m 1, created 1
158 5111 [main] python (17912) user_info::create: opening user shared for 'S-1-5-21-567552140-2017299312-2275771347-1001' at 0x1A4010000
171 5282 [main] python (17912) user_info::create: user shared version 0
175 5457 [main] python (17912) dll_crt0_0: finished dll_crt0_0 initialization
202 5659 [main] python (17912) time: 1693222268 = time(0x0)
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\vss0.dll at 00000005ca2f0000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cygblas-0.dll at 00000003f8ca0000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cyggomp-1.dll at 00000003fe310000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cyglapack-0.dll at 00000003f8160000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cyggfortran-5.dll at 00000003f8890000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cygquadmath-0.dll at 00000003fc150000
('v0.1.2',)
28871 34530 [main] python (17912) mmap: addr 0x0, len 34319826944, prot 0x3, flags 0x22, fd -1, off 0x0
233972 268502 [main] python (17912) mmap: 0x6FF802610000 = mmap()
--- Process 17912, exception c0000005 at 00007ff921031026
--- Process 17912 thread 19168 exited with status 0xc0000005
--- Process 17912 thread 9276 exited with status 0xc0000005
--- Process 17912 thread 3052 exited with status 0xc0000005
--- Process 17912 exited with status 0xc0000005
Segmentation fault
```

Also, when I run the script using the cygwin64 build of Python, it works:

cygwin64 python output

```text
$ file /usr/bin/python3.9.exe
/usr/bin/python3.9.exe: PE32+ executable (console) x86-64, for MS Windows, 11 sections
$ /usr/bin/python3.9.exe main.py
('v0.1.2',)
(0.20000000298023224,)
```
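A side note on the value itself: 0.20000000298023224 rather than 0.2 is what you would expect if the distance is computed in 32-bit floats, which is the element type Faiss works with. The sketch below reproduces the number in plain Python by rounding every intermediate result to float32. It is an illustration of the arithmetic only, not the code path sqlite-vss actually runs, and the helper names are made up.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def l1_distance_float32(a, b):
    """Sum of absolute differences, rounding each step to float32."""
    total = 0.0
    for x, y in zip(a, b):
        diff = abs(to_float32(x) - to_float32(y))
        total = to_float32(total + diff)
    return total


if __name__ == "__main__":
    print(l1_distance_float32([0.1, 0.1], [0.2, 0.2]))  # 0.20000000298023224
```

The sqlite3 CLI prints the same value with fewer digits (0.200000002980232), so the CLI and the cygwin64 Python agree on the result.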
thewh1teagle commented 10 months ago

I tried a sample Faiss C++ project, and it compiled and ran without errors:

Code

```cpp
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <random>

#include <sys/time.h>

#include <faiss/IndexFlat.h>
#include <faiss/IndexIVFFlat.h>
#include <faiss/IndexPQ.h>
#include <faiss/index_io.h>

double elapsed() {
    struct timeval tv;
    gettimeofday(&tv, nullptr);
    return tv.tv_sec + tv.tv_usec * 1e-6;
}

int main() {
    double t0 = elapsed();

    // dimension of the vectors to index
    int d = 128;

    // size of the database we plan to index
    size_t nb = 1000 * 1000;

    // make a set of nt training vectors in the unit cube
    // (could be the database)
    size_t nt = 100 * 1000;

    //---------------------------------------------------------------
    // Define the core quantizer
    // We choose a multiple inverted index for faster training with less data
    // and because it usually offers best accuracy/speed trade-offs
    //
    // We here assume that its lifespan of this coarse quantizer will cover the
    // lifespan of the inverted-file quantizer IndexIVFFlat below
    // With dynamic allocation, one may give the responsibility to free the
    // quantizer to the inverted-file index (with attribute do_delete_quantizer)
    //
    // Note: a regular clustering algorithm would be defined as:
    //       faiss::IndexFlatL2 coarse_quantizer (d);
    //
    // Use nhash=2 subquantizers used to define the product coarse quantizer
    // Number of bits: we will have 2^nbits_coarse centroids per subquantizer
    //                 meaning (2^12)^nhash distinct inverted lists
    size_t nhash = 2;
    size_t nbits_subq = int(log2(nb + 1) / 2);     // good choice in general
    size_t ncentroids = 1 << (nhash * nbits_subq); // total # of centroids

    faiss::MultiIndexQuantizer coarse_quantizer(d, nhash, nbits_subq);

    printf("IMI (%ld,%ld): %ld virtual centroids (target: %ld base vectors)",
           nhash, nbits_subq, ncentroids, nb);

    // the coarse quantizer should not be dealloced before the index
    // 4 = nb of bytes per code (d must be a multiple of this)
    // 8 = nb of bits per sub-code (almost always 8)
    faiss::MetricType metric = faiss::METRIC_L2; // can be METRIC_INNER_PRODUCT
    faiss::IndexIVFFlat index(&coarse_quantizer, d, ncentroids, metric);
    index.quantizer_trains_alone = true;

    // define the number of probes. 2048 is for high-dim, overkilled in practice
    // Use 4-1024 depending on the trade-off speed accuracy that you want
    index.nprobe = 2048;

    std::mt19937 rng;
    std::uniform_real_distribution<> distrib;

    { // training
        printf("[%.3f s] Generating %ld vectors in %dD for training\n",
               elapsed() - t0, nt, d);

        std::vector<float> trainvecs(nt * d);
        for (size_t i = 0; i < nt * d; i++) {
            trainvecs[i] = distrib(rng);
        }

        printf("[%.3f s] Training the index\n", elapsed() - t0);
        index.verbose = true;
        index.train(nt, trainvecs.data());
    }

    size_t nq;
    std::vector<float> queries;

    { // populating the database
        printf("[%.3f s] Building a dataset of %ld vectors to index\n",
               elapsed() - t0, nb);

        std::vector<float> database(nb * d);
        for (size_t i = 0; i < nb * d; i++) {
            database[i] = distrib(rng);
        }

        printf("[%.3f s] Adding the vectors to the index\n", elapsed() - t0);
        index.add(nb, database.data());

        // remember a few elements from the database as queries
        int i0 = 1234;
        int i1 = 1244;

        nq = i1 - i0;
        queries.resize(nq * d);
        for (int i = i0; i < i1; i++) {
            for (int j = 0; j < d; j++) {
                queries[(i - i0) * d + j] = database[i * d + j];
            }
        }
    }

    { // searching the database
        int k = 5;
        printf("[%.3f s] Searching the %d nearest neighbors "
               "of %ld vectors in the index\n",
               elapsed() - t0, k, nq);

        std::vector<faiss::idx_t> nns(k * nq);
        std::vector<float> dis(k * nq);

        index.search(nq, queries.data(), k, dis.data(), nns.data());

        printf("[%.3f s] Query results (vector ids, then distances):\n",
               elapsed() - t0);

        for (int i = 0; i < nq; i++) {
            printf("query %2d: ", i);
            for (int j = 0; j < k; j++) {
                printf("%7ld ", nns[j + i * k]);
            }
            printf("\n     dis: ");
            for (int j = 0; j < k; j++) {
                printf("%7g ", dis[j + i * k]);
            }
            printf("\n");
        }
    }
    return 0;
}
```
Compile output

```text
$ g++ main.cpp -I./sqlite-vss/vendor/faiss ./sqlite-vss/build_release/vendor/faiss/faiss/libfaiss.a -fopenmp -lblas -llapack
$ ls
a.exe  main.cpp  sqlite-vss
```
Program output

```text
$ ./a.exe
IMI (2,9): 262144 virtual centroids (target: 1000000 base vectors)[0.005 s] Generating 100000 vectors in 128D for training
[0.648 s] Training the index
Training level-1 quantizer
IVF quantizer trains alone...
Training IVF residual
IndexIVF: no residual training
[8.816 s] Building a dataset of 1000000 vectors to index
[15.241 s] Adding the vectors to the index
MultiIndexQuantizer::search: 0:32768 / 1000000
MultiIndexQuantizer::search: 32768:65536 / 1000000
MultiIndexQuantizer::search: 65536:98304 / 1000000
MultiIndexQuantizer::search: 98304:131072 / 1000000
MultiIndexQuantizer::search: 131072:163840 / 1000000
MultiIndexQuantizer::search: 163840:196608 / 1000000
MultiIndexQuantizer::search: 196608:229376 / 1000000
MultiIndexQuantizer::search: 229376:262144 / 1000000
MultiIndexQuantizer::search: 262144:294912 / 1000000
MultiIndexQuantizer::search: 294912:327680 / 1000000
MultiIndexQuantizer::search: 327680:360448 / 1000000
MultiIndexQuantizer::search: 360448:393216 / 1000000
MultiIndexQuantizer::search: 393216:425984 / 1000000
MultiIndexQuantizer::search: 425984:458752 / 1000000
MultiIndexQuantizer::search: 458752:491520 / 1000000
MultiIndexQuantizer::search: 491520:524288 / 1000000
MultiIndexQuantizer::search: 524288:557056 / 1000000
MultiIndexQuantizer::search: 557056:589824 / 1000000
MultiIndexQuantizer::search: 589824:622592 / 1000000
MultiIndexQuantizer::search: 622592:655360 / 1000000
MultiIndexQuantizer::search: 655360:688128 / 1000000
MultiIndexQuantizer::search: 688128:720896 / 1000000
MultiIndexQuantizer::search: 720896:753664 / 1000000
MultiIndexQuantizer::search: 753664:786432 / 1000000
MultiIndexQuantizer::search: 786432:819200 / 1000000
MultiIndexQuantizer::search: 819200:851968 / 1000000
MultiIndexQuantizer::search: 851968:884736 / 1000000
MultiIndexQuantizer::search: 884736:917504 / 1000000
MultiIndexQuantizer::search: 917504:950272 / 1000000
MultiIndexQuantizer::search: 950272:983040 / 1000000
MultiIndexQuantizer::search: 983040:1000000 / 1000000
IndexIVFFlat::add_core: added 1000000 / 1000000 vectors
[20.667 s] Searching the 5 nearest neighbors of 10 vectors in the index
[20.684 s] Query results (vector ids, then distances):
query  0:    1234   65776  815632  518751  168411
     dis:       0 13.2041 13.7313 13.9331 13.9852
query  1:    1235  235209   32981  339156  485140
     dis:       0 12.5675 13.2757 13.3526 13.3626
query  2:    1236   46384  393794  279123  803578
     dis:       0 13.2079  13.337 13.5685 13.5999
query  3:    1237  172600  435871  490284  116815
     dis:       0 12.9845 13.4125 13.4894 13.5741
query  4:    1238  185348  630264  685103  672356
     dis:       0 11.3711 12.2562 12.2871 12.2897
query  5:    1239  820990  306204    3096  549432
     dis:       0 12.4804 13.2535 13.5721 13.5853
query  6:    1240  122701  687644  802575  350632
     dis:       0 13.7758 13.9611 14.1327 14.2155
query  7:    1241  985126  686744  336958  926803
     dis:       0 13.2923  13.636 13.7428 14.0614
query  8:    1242  880999  488401  181311  712631
     dis:       0 13.1505 13.4343  13.488 13.6331
query  9:    1243  829029  233144  108428  402759
     dis:       0 12.5892 12.7777 12.8653 13.1447
$ echo $?
0
```
asg017 commented 10 months ago

So to summarize, on your windows machine using cygwin64:

Is that right?

thewh1teagle commented 10 months ago

Everything is correct except for the last one - using sqlite-vss from the sqlite3 CLI works, both from the Cygwin environment and from plain cmd.

asg017 commented 10 months ago

Cool - so is there anything actionable you'd like from this issue then? My guess is that since sqlite-vss was built with cygwin64, it requires cygwin64 applications to load the extension.

I'll probably try compiling sqlite-vss with cygwin64 on a GitHub Actions runner, but it's been very difficult in the past.

thewh1teagle commented 10 months ago

Currently I want to figure out why I get a segfault when running from a Python / Node.js that is not part of Cygwin. I'm not sure how to debug it.

It will not be usable if we can't use it with a regular, non-Cygwin Python.

thewh1teagle commented 6 months ago

I managed to compile Faiss on Windows in an MSYS2 environment. MSYS2 works pretty well, and I think it should be fairly easy to use in GitHub Actions as well. https://github.com/facebookresearch/faiss/issues/3067

leonsmiers commented 5 months ago

Hello, I am trying to use VSS in combination with SQLite on Windows. I like the approach of making VSS part of the query search. Did you make any progress with the Windows install? I'm currently struggling with settings in the Makefile and CMakeLists.txt.

Thanks in advance, Léon

ma-chengyuan commented 3 months ago

Here's a way that worked for me, based on @thewh1teagle 's approach. I haven't tested it in-depth, but the sqlite3 cli can load the extensions and select vss_distance_l1('[0.1, 0.1]', '[0.2, 0.2]') works.

  1. Install MSYS2
  2. Open UCRT terminal
  3. Run
    
    # see https://github.com/facebookresearch/faiss/issues/3067#issuecomment-1873007384
    pacman --needed -S $MINGW_PACKAGE_PREFIX-{toolchain,cmake,make,swig,autotools,lapack} git
    git clone https://github.com/asg017/sqlite-vss.git && cd sqlite-vss
    # see https://github.com/asg017/sqlite-vss/blob/main/docs.md
    ./vendor/get_sqlite.sh
    cd vendor/sqlite
    ./configure && make
    cd ../../
4. Apply the following patch to `vendor/faiss`:
```diff
diff --git a/faiss/CMakeLists.txt b/faiss/CMakeLists.txt
index 16eb9e9c..940ba03f 100644
--- a/faiss/CMakeLists.txt
+++ b/faiss/CMakeLists.txt
@@ -214,8 +214,8 @@ add_library(faiss_avx2 ${FAISS_SRC})
 if(NOT FAISS_OPT_LEVEL STREQUAL "avx2")
   set_target_properties(faiss_avx2 PROPERTIES EXCLUDE_FROM_ALL TRUE)
 endif()
-if(NOT WIN32)
-  target_compile_options(faiss_avx2 PRIVATE $<$<COMPILE_LANGUAGE:CXX>:-mavx2 -mfma -mf16c -mpopcnt>)
+if(NOT MSVC)
+  target_compile_options(faiss_avx2 PRIVATE $<$<COMPILE_LANGUAGE:CXX>:-mavx2 -mfma -mf16c -mpopcnt -fpermissive>)
 else()
   # MSVC enables FMA with /arch:AVX2; no separate flags for F16C, POPCNT
   # Ref. FMA (under /arch:AVX2): https://docs.microsoft.com/en-us/cpp/build/reference/arch-x64
diff --git a/faiss/impl/platform_macros.h b/faiss/impl/platform_macros.h
index 9cec8260..44293e3e 100644
--- a/faiss/impl/platform_macros.h
+++ b/faiss/impl/platform_macros.h
@@ -83,6 +83,17 @@ inline int __builtin_clzll(uint64_t x) {
 #endif

 #else
+
+/*******************************************************
+ * Windows MinGW
+ *******************************************************/
+#ifdef _WIN32
+
+#define posix_memalign(p, a, s) \
+    (((*(p)) = _aligned_malloc((s), (a))), *(p) ? 0 : errno)
+#endif
+
+
 /*******************************************************
  * Linux and OSX
  *******************************************************/
diff --git a/faiss/invlists/InvertedListsIOHook.cpp b/faiss/invlists/InvertedListsIOHook.cpp
index 0081c4f9..2c3a6006 100644
--- a/faiss/invlists/InvertedListsIOHook.cpp
+++ b/faiss/invlists/InvertedListsIOHook.cpp
@@ -13,9 +13,9 @@

 #include <faiss/invlists/BlockInvertedLists.h>

-#ifndef _MSC_VER
+#ifndef _WIN32
 #include <faiss/invlists/OnDiskInvertedLists.h>
-#endif // !_MSC_VER
+#endif // !_WIN32

 namespace faiss {

@@ -33,7 +33,7 @@ namespace {
 /// std::vector that deletes its contents
 struct IOHookTable : std::vector<InvertedListsIOHook*> {
     IOHookTable() {
-#ifndef _MSC_VER
+#ifndef _WIN32
         push_back(new OnDiskInvertedListsIOHook());
 #endif
         push_back(new BlockInvertedListsIOHook());
```
  5. Build with
    cmake -B build-release . -G "MinGW Makefiles" -DCMAKE_BUILD_TYPE=Release
    cmake --build build-release -- -j<number of cores here>
    # copy dlls for use outside of MSYS2
    cp /ucrt64/bin/{libgcc_s_seh-1.dll,libwinpthread-1.dll,libblas.dll,libgomp-1.dll,liblapack.dll,libgfortran-5.dll,libquadmath-0.dll,libstdc++-6.dll} ./build-release/
  6. Done! Just remember to copy all DLLs under build-release on deployment.
wldbest commented 3 months ago

Hi, @ma-chengyuan could you please share the dll file? I would like to use the windows version of sqlite-vss, but the make process seems too complicated. If you can share it, I would like to try it to see if it works in my environment.

bqhuyy commented 2 months ago

@ma-chengyuan hi, I followed your instructions. The DLLs only work with the sqlite3 CLI tool. The built-in sqlite3 (installed using the .msi file on Windows) and node sqlite3 cannot load the DLL as an extension. Here is the error message: The specified module could not be found
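One possible explanation, not verified against this build: on Windows, "The specified module could not be found" is also what you get when the DLL itself is present but one of the DLLs it depends on cannot be located. The sqlite3 CLI sits in the same folder as the copied runtime DLLs, whereas Python or Node start from somewhere else. A minimal sketch that may be worth trying from Python is to put the folder with the DLLs at the front of PATH before loading. `prepend_to_path`, `load_extensions` and the example folder are hypothetical; only the `sqlite3` calls are real API.

```python
import os
import sqlite3


def prepend_to_path(directory, current_path, sep=os.pathsep):
    """Return a PATH string with `directory` first and listed only once."""
    directory = str(directory)
    parts = [p for p in current_path.split(sep) if p and p != directory]
    return sep.join([directory, *parts])


def load_extensions(conn, dll_dir, names):
    """Load each extension from `dll_dir` by absolute path.

    Assumption: `dll_dir` holds the extension DLLs and every runtime DLL
    they depend on (the files copied in the build step above).
    """
    os.environ["PATH"] = prepend_to_path(dll_dir, os.environ.get("PATH", ""))
    conn.enable_load_extension(True)
    try:
        for name in names:
            conn.load_extension(os.path.join(dll_dir, name))
    finally:
        conn.enable_load_extension(False)


# Example (hypothetical folder):
#   conn = sqlite3.connect(":memory:")
#   load_extensions(conn, r"C:\path\to\build-release", ["vector0.dll", "vss0.dll"])
```

If the load still fails with PATH set like this, the missing piece is probably something other than the dependency search, for example a 32-bit/64-bit mismatch between the host process and the DLLs.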