asg017 / sqlite-vss

A SQLite extension for efficient vector search, based on Faiss!
MIT License

Load failure on MacOS Mojave #8

Closed: simoncollins closed this issue 1 year ago

simoncollins commented 1 year ago

Hi, this extension looks awesome. Just a heads up that loading it on my Mac running 10.14.6 gives the following error.

sqlite> .load ./vector0
Error: dlopen(./vector0.dylib, 10): no suitable image found.  Did find:
    ./vector0.dylib: cannot load 'vector0.dylib' (load command 0x80000034 is unknown)

From googling the error, I suspect this is a macOS version issue.
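
For context, the number in that error can be decoded against the constants in Apple's `<mach-o/loader.h>`. The sketch below is not from the thread: it strips the `LC_REQ_DYLD` bit and looks up what is left. Only two command numbers are mapped, and `decode_load_command` is a made-up helper name.

```shell
#!/usr/bin/env bash
# Decode a Mach-O load command value such as the one dyld printed above.
# Constants come from <mach-o/loader.h>; decode_load_command is a
# hypothetical helper name used only for this illustration.

decode_load_command() {
  local raw base required name
  raw=$(( $1 ))                     # accepts hex input like 0x80000034
  base=$(( raw & 0x7fffffff ))      # strip the LC_REQ_DYLD bit (0x80000000)
  required=$(( (raw >> 31) & 1 ))   # 1 if LC_REQ_DYLD was set

  case "$base" in
    51) name="LC_DYLD_EXPORTS_TRIE" ;;     # 0x33
    52) name="LC_DYLD_CHAINED_FIXUPS" ;;   # 0x34
    *)  name=$(printf 'unmapped command 0x%x' "$base") ;;
  esac

  if [ "$required" -eq 1 ]; then
    echo "$name (dyld must understand it)"
  else
    echo "$name"
  fi
}

decode_load_command 0x80000034
```

The value maps to `LC_DYLD_CHAINED_FIXUPS`, a load command that newer Apple linkers emit when targeting recent macOS releases. That would fit the suspicion that the prebuilt dylib was linked for a newer macOS than 10.14.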

asg017 commented 1 year ago

Thanks for filing!

Yeah, these SQLite extensions are built on GitHub Actions, specifically the macos-latest runner image, which is Monterey 12. I could lower it to Big Sur 11, but there are no runner images for Mojave or Catalina, as they were deprecated last year.

One option is compiling these extensions yourself. I don't have great documentation for this, but you could do something like this:

git clone --recursive https://github.com/asg017/sqlite-vss.git
cd sqlite-vss

# build sqlite dependency
cd vendor
./get_sqlite.sh 
cd sqlite
./configure
make

# build sqlite-vector extension
cd ../sqlite-vector/
make loadable

# build sqlite-vss extension
cd ../../
make loadable

Then vss0.dylib will be available at dist/debug/vss0.dylib, and vector0.dylib at vendor/sqlite-vector/dist/debug/vector0.dylib. The build requires cmake, which brew install cmake should get you.
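
Before starting a long build it can help to confirm the tools are present. This is a minimal sketch, assuming only that the steps above need `git`, `make` and `cmake` on `PATH`; `missing_tools` is a hypothetical helper, not part of sqlite-vss.

```shell
#!/usr/bin/env bash
# Report which of the given command-line tools are not on PATH.
# missing_tools is a hypothetical helper, not part of sqlite-vss.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Tools used by the build steps above.
missing=$(missing_tools git make cmake)
if [ -n "$missing" ]; then
  echo "missing tools:" $missing
else
  echo "all build tools found"
fi
```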

Let me know if this works for you! Will add better documentation for this soon.

simoncollins commented 1 year ago

Great .. thanks for the tips. I'll definitely have a go at compiling it from source. I was also going to look at compiling for ARM as well for my other Mac.

simoncollins commented 1 year ago

I got as far as compiling the sqlite dependency but failed when running make loadable for sqlite-vector. It seems to be picking up an older version of sqlite3.h that has neither xShadowName in the struct nor SQLITE_INNOCUOUS. I can't remember from my days of cpp coding how the path to header files is set up. I do have a few versions of that header lying around in other installations, so that's my best guess.

$ make loadable
mkdir -p dist/debug
mkdir -p dist/release
cmake -B build; make -C build
-- The C compiler identification is AppleClang 10.0.1.10010046
-- The CXX compiler identification is AppleClang 10.0.1.10010046
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Using the multi-header code from /Users/simon/Documents/projects/sqlite-vss/vendor/sqlite-vector/vendor/json/include/
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/simon/Documents/projects/sqlite-vss/vendor/sqlite-vector/build
[ 33%] Building CXX object CMakeFiles/sqlite-vector.dir/src/extension.cpp.o
/Users/simon/Documents/projects/sqlite-vss/vendor/sqlite-vector/src/extension.cpp:481:21: error: excess elements in struct initializer
  /* xShadowName */ 0
                    ^
/Users/simon/Documents/projects/sqlite-vss/vendor/sqlite-vector/src/extension.cpp:556:98: error: use of undeclared identifier 'SQLITE_INNOCUOUS'
    { (char*) "vector_version",     0,  NULL, vector_version,   SQLITE_UTF8|SQLITE_DETERMINISTIC|SQLITE_INNOCUOUS },
                                                                                                 ^
/Users/simon/Documents/projects/sqlite-vss/vendor/sqlite-vector/src/extension.cpp:557:98: error: use of undeclared identifier 'SQLITE_INNOCUOUS'
    { (char*) "vector_debug",       0,  NULL, vector_debug,     SQLITE_UTF8|SQLITE_DETERMINISTIC|SQLITE_INNOCUOUS },
                                                                                                 ^
/Users/simon/Documents/projects/sqlite-vss/vendor/sqlite-vector/src/extension.cpp:558:98: error: use of undeclared identifier 'SQLITE_INNOCUOUS'
    { (char*) "vector_debug",       1,  NULL, vector_debug,     SQLITE_UTF8|SQLITE_DETERMINISTIC|SQLITE_INNOCUOUS },
                                                                                                 ^
/Users/simon/Documents/projects/sqlite-vss/vendor/sqlite-vector/src/extension.cpp:559:98: error: use of undeclared identifier 'SQLITE_INNOCUOUS'
    { (char*) "vector_length",      1,  NULL, vector_length,    SQLITE_UTF8|SQLITE_DETERMINISTIC|SQLITE_INNOCUOUS },
                                                                                                 ^
/Users/simon/Documents/projects/sqlite-vss/vendor/sqlite-vector/src/extension.cpp:560:77: error: use of undeclared identifier 'SQLITE_INNOCUOUS'
    { (char*) "vector_value_at",    2,  NULL, vector_value_at,  SQLITE_UTF8|SQLITE_INNOCUOUS},
                                                                            ^
/Users/simon/Documents/projects/sqlite-vss/vendor/sqlite-vector/src/extension.cpp:561:98: error: use of undeclared identifier 'SQLITE_INNOCUOUS'
    { (char*) "vector_from_json",   1,  NULL, vector_from_json, SQLITE_UTF8|SQLITE_DETERMINISTIC|SQLITE_INNOCUOUS},
                                                                                                 ^
/Users/simon/Documents/projects/sqlite-vss/vendor/sqlite-vector/src/extension.cpp:562:98: error: use of undeclared identifier 'SQLITE_INNOCUOUS'
    { (char*) "vector_to_json",     1,  NULL, vector_to_json,   SQLITE_UTF8|SQLITE_DETERMINISTIC|SQLITE_INNOCUOUS},
                                                                                                 ^
/Users/simon/Documents/projects/sqlite-vss/vendor/sqlite-vector/src/extension.cpp:563:98: error: use of undeclared identifier 'SQLITE_INNOCUOUS'
    { (char*) "vector_from_blob",   1,  NULL, vector_from_blob, SQLITE_UTF8|SQLITE_DETERMINISTIC|SQLITE_INNOCUOUS},
                                                                                                 ^
/Users/simon/Documents/projects/sqlite-vss/vendor/sqlite-vector/src/extension.cpp:564:98: error: use of undeclared identifier 'SQLITE_INNOCUOUS'
    { (char*) "vector_to_blob",     1,  NULL, vector_to_blob,   SQLITE_UTF8|SQLITE_DETERMINISTIC|SQLITE_INNOCUOUS},
                                                                                                 ^
/Users/simon/Documents/projects/sqlite-vss/vendor/sqlite-vector/src/extension.cpp:565:98: error: use of undeclared identifier 'SQLITE_INNOCUOUS'
    { (char*) "vector_from_raw",    1,  NULL, vector_from_raw,  SQLITE_UTF8|SQLITE_DETERMINISTIC|SQLITE_INNOCUOUS},
                                                                                                 ^
/Users/simon/Documents/projects/sqlite-vss/vendor/sqlite-vector/src/extension.cpp:566:98: error: use of undeclared identifier 'SQLITE_INNOCUOUS'
    { (char*) "vector_to_raw",      1,  NULL, vector_to_raw,    SQLITE_UTF8|SQLITE_DETERMINISTIC|SQLITE_INNOCUOUS},
                                                                                                 ^
/Users/simon/Documents/projects/sqlite-vss/vendor/sqlite-vector/src/extension.cpp:568:26: error: invalid application of 'sizeof' to an incomplete type 'const
      struct (anonymous struct at /Users/simon/Documents/projects/sqlite-vss/vendor/sqlite-vector/src/extension.cpp:548:18) []'
    for(int i=0; i<sizeof(aFunc)/sizeof(aFunc[0]); i++){
                         ^~~~~~~
13 errors generated.
make[3]: *** [CMakeFiles/sqlite-vector.dir/src/extension.cpp.o] Error 1
make[2]: *** [CMakeFiles/sqlite-vector.dir/all] Error 2
make[1]: *** [all] Error 2
make: *** [dist/debug/vector0.dylib] Error 2

asg017 commented 1 year ago

Ah my bad, forgot a step - before running make loadable inside vendor/sqlite-vector, you'll also have to build SQLite for that subproject. Here's the full updated instructions:

# step 1: clone sqlite-vss repo and its submodules
git clone --recursive https://github.com/asg017/sqlite-vss.git
cd sqlite-vss

# step 2: build sqlite dependency for sqlite-vss in `sqlite-vss/vendor/`
cd vendor
./get_sqlite.sh 
cd sqlite
./configure
make

# step 3: build sqlite dependency for sqlite-vector in `sqlite-vss/vendor/sqlite-vector/vendor`
cd ../sqlite-vector/vendor
./get_sqlite.sh 
cd sqlite
./configure
make

# step 4: build sqlite-vector in `sqlite-vss/vendor/sqlite-vector`
cd ../../
make loadable

# step 5: build sqlite-vss extension in `sqlite-vss/`
cd ../../
make loadable

Step 3 was the missing part last time. Once that is built, your sqlite-vector build shouldn't hit the xShadowName or SQLITE_INNOCUOUS errors anymore.
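
One way to confirm step 3 did its job before running make loadable is to check the header for the two symbols the failed build complained about. This is a sketch under assumptions: the default path guesses that get_sqlite.sh unpacks into a sqlite directory containing sqlite3.h, and `header_is_new_enough` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Check that a sqlite3.h declares the symbols the failed build complained about.
# header_is_new_enough is a hypothetical helper; the default path below is an
# assumption about where get_sqlite.sh leaves the SQLite sources.
header_is_new_enough() {
  local header=$1
  [ -f "$header" ] || return 2
  grep -q 'xShadowName' "$header" && grep -q 'SQLITE_INNOCUOUS' "$header"
}

# Run from the sqlite-vss checkout; pass another path as the first argument.
header=${1:-vendor/sqlite-vector/vendor/sqlite/sqlite3.h}
if header_is_new_enough "$header"; then
  echo "$header: ok"
else
  echo "$header: missing, or lacks xShadowName / SQLITE_INNOCUOUS"
fi
```

As far as I know, xShadowName arrived in SQLite 3.26.0 and SQLITE_INNOCUOUS in 3.31.0, so any older sqlite3.h found first on the include path would produce these errors.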

Let me know if that works! Definitely will clean this all up to make building yourself easier.

simoncollins commented 1 year ago

Getting there! I now get to the final make loadable and then get a type error. I'm guessing that might be a macOS or command line tools version issue.

$ make loadable
cmake -B build; make -C build
-- Using the multi-header code from /Users/simon/Documents/projects/sqlite-vss/vendor/sqlite-vector/vendor/json/include/
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/simon/Documents/projects/sqlite-vss/vendor/sqlite-vector/build
Consolidate compiler generated dependencies of target sqlite-vector
[ 33%] Building CXX object CMakeFiles/sqlite-vector.dir/src/extension.cpp.o
[ 66%] Building CXX object CMakeFiles/sqlite-vector.dir/src/vectors.cpp.o
[100%] Linking CXX shared library vector0.dylib
[100%] Built target sqlite-vector
cp build/vector0.dylib dist/debug/vector0.dylib
MacBook-Pro-5:sqlite-vector simon$ cd ../../
MacBook-Pro-5:sqlite-vss simon$ make loadable
mkdir -p dist/debug
mkdir -p dist/release
cmake -B build; make -C build
-- The C compiler identification is AppleClang 10.0.1.10010046
-- The CXX compiler identification is AppleClang 10.0.1.10010046
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OpenMP_C: -Xclang -fopenmp (found version "3.1")
-- Found OpenMP_CXX: -Xclang -fopenmp (found version "3.1")
-- Found OpenMP: TRUE (found version "3.1")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Could NOT find MKL (missing: MKL_LIBRARIES)
-- Looking for sgemm_
-- Looking for sgemm_ - not found
-- Looking for dgemm_
-- Looking for dgemm_ - found
-- Found BLAS: /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Accelerate.framework
-- Looking for cheev_
-- Looking for cheev_ - found
-- Found LAPACK: /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Accelerate.framework;-lm;-ldl
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/simon/Documents/projects/sqlite-vss/build
[  0%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/AutoTune.cpp.o
[  1%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/Clustering.cpp.o
[  1%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/IVFlib.cpp.o
[  2%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/Index.cpp.o
[  2%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/Index2Layer.cpp.o
[  3%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/IndexAdditiveQuantizer.cpp.o
[  3%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/IndexBinary.cpp.o
[  4%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/IndexBinaryFlat.cpp.o
[  4%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/IndexBinaryFromFloat.cpp.o
[  5%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/IndexBinaryHNSW.cpp.o
[  5%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/IndexBinaryHash.cpp.o
[  6%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/IndexBinaryIVF.cpp.o
[  6%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/IndexFlat.cpp.o
[  8%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/IndexFlatCodes.cpp.o
[  8%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/IndexHNSW.cpp.o
[  9%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/IndexIDMap.cpp.o
[  9%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/IndexIVF.cpp.o
[ 10%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/IndexIVFAdditiveQuantizer.cpp.o
[ 10%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/IndexIVFFlat.cpp.o
[ 11%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/IndexIVFPQ.cpp.o
/Users/simon/Documents/projects/sqlite-vss/vendor/faiss/faiss/IndexIVFPQ.cpp:937:48: error: unknown type name '__m128i_u'; did you mean '__m128i'?
                        _mm_loadu_si128((const __m128i_u*)(code + m));
                                               ^~~~~~~~~
                                               __m128i
/Library/Developer/CommandLineTools/usr/lib/clang/10.0.1/include/emmintrin.h:30:19: note: '__m128i' declared here
typedef long long __m128i __attribute__((__vector_size__(16)));
                  ^
1 error generated.
make[3]: *** [vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/IndexIVFPQ.cpp.o] Error 1
make[2]: *** [vendor/faiss/faiss/CMakeFiles/faiss_avx2.dir/all] Error 2
make[1]: *** [all] Error 2
make: *** [dist/debug/vss0.dylib] Error 2

asg017 commented 1 year ago

Could you edit the CMakeLists.txt file on line 24 and change faiss_avx2 to faiss? Specifically here:

https://github.com/asg017/sqlite-vss/blob/ecc02547ce5095532d11e401dd399bc65191fcd2/CMakeLists.txt#L24
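
The same edit can be scripted. This is a minimal sketch, assuming line 24 of your checkout still matches the linked revision; `use_plain_faiss` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Swap faiss_avx2 for faiss on one line of a CMake file.
# use_plain_faiss is a hypothetical helper; the default line number (24)
# comes from the comment above and may differ in other revisions.
use_plain_faiss() {
  local file=$1 line=${2:-24} tmp
  tmp=$(mktemp) || return 1
  # Go through a temp file: in-place flags differ between BSD and GNU sed.
  sed "${line}s/faiss_avx2/faiss/" "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  cat "$tmp" > "$file"   # overwrite contents, keep the original file's permissions
  rm -f "$tmp"
}

# Usage, from the sqlite-vss checkout:
#   use_plain_faiss CMakeLists.txt 24
#   git diff CMakeLists.txt    # confirm only the intended line changed
```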

The faiss_avx2 target includes some extra CPU instructions to make things faster, but it seems like your machine may not have that enabled or something.
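
To see whether the CPU itself advertises AVX2, something like the following may help. It is a sketch, not part of the project: the helper names are hypothetical, and the macOS sysctl key is the one I would expect on Intel Macs.

```shell
#!/usr/bin/env bash
# Report whether the CPU advertises AVX2.
# has_avx2_flag and cpu_feature_flags are hypothetical helper names.

# True if the given feature string contains avx2 as a whole word (any case).
has_avx2_flag() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | grep -qw 'avx2'
}

# Print the CPU feature flags for the current machine, if we know how.
cpu_feature_flags() {
  case "$(uname -s)" in
    Darwin) sysctl -n machdep.cpu.leaf7_features 2>/dev/null ;;  # Intel Macs
    Linux)  grep -m1 '^flags' /proc/cpuinfo 2>/dev/null ;;
  esac
}

if has_avx2_flag "$(cpu_feature_flags)"; then
  echo "CPU reports AVX2"
else
  echo "CPU does not report AVX2 (or the flags could not be read)"
fi
```

Note that the compile error in the log above is about a type, `__m128i_u`, that AppleClang 10.0.1's own emmintrin.h does not declare, so an older toolchain may be involved even on a CPU that supports AVX2.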

Also to note - thanks again for sharing your errors, this is very helpful!

simoncollins commented 1 year ago

That did it! Yeah, my MacBook Pro is an early 2015 model, so a bit of an edge case!

Now it compiles fine. However, I get a segfault when trying to create the table example. I'm loading the extensions into the vendored sqlite.

sqlite> .load ../../dist/debug/vector0
please 3.24.0
vector0 rc=0
sqlite> .load ../../dist/debug/vss0
sqlite> create virtual table vss_articles using vss0(
   ...>   headline_embedding(384),
   ...>   description_embedding(384),
   ...> );
Segmentation fault: 11
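
For a bug report it can be handy to check the load steps in one non-interactive command. This sketch only generates the script: `smoke_test_script` is a hypothetical helper, and `vector_version()` is taken from the function table in the build log earlier in this thread.

```shell
#!/usr/bin/env bash
# Print a sqlite3 script that loads both extensions and calls one function.
# smoke_test_script is a hypothetical helper; paths are passed without the
# .dylib suffix, the same way the interactive .load commands above do.
smoke_test_script() {
  local vector_ext=$1 vss_ext=$2
  printf '.load %s\n' "$vector_ext"   # vector0 first, as in the session above
  printf '.load %s\n' "$vss_ext"
  printf 'select vector_version();\n'
}

# Usage, from the directory holding the vendored sqlite3 binary:
#   smoke_test_script ../../dist/debug/vector0 ../../dist/debug/vss0 | ./sqlite3 :memory:
#   echo "exit status: $?"
```

If that runs cleanly, the create virtual table statement can be appended to the generated script to narrow down where the crash happens.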

> Also to note - thanks again for sharing your errors, this is very helpful!

No worries ... happy to help, although as I said, I suspect my situation is an edge case. I'll try compiling it on my M1 Macbook and see how I go as well.

bkono commented 1 year ago

FWIW, I got this building on an M1 following the same steps without tripping up toooo much. Key was ensuring llvm was installed via brew, and setting envs for CC, CXX, LDFLAGS and CPPFLAGS to the right paths in homebrew.

There are some typos in the blog post, but the examples/headlines build scripts work (assuming appropriate python env w/ conda, torch, sentence_transformers), minus the IVF setup steps.

Example flags given an /opt/homebrew base (in fish):

set -gx CC "/opt/homebrew/opt/llvm/bin/clang"
set -gx CXX "/opt/homebrew/opt/llvm/bin/clang++"
set -gx LDFLAGS "-L/opt/homebrew/opt/llvm/lib"
set -gx CPPFLAGS "-I/opt/homebrew/opt/llvm/include"
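
For bash or zsh users, a rough counterpart of the fish settings above might look like this. It is a sketch: `llvm_env` is a hypothetical helper that only prints export lines for whatever LLVM prefix it is given.

```shell
#!/usr/bin/env bash
# bash/zsh counterpart of the fish settings above.
# llvm_env is a hypothetical helper that prints export lines for an LLVM prefix.
llvm_env() {
  local prefix=$1
  printf 'export CC="%s/bin/clang"\n' "$prefix"
  printf 'export CXX="%s/bin/clang++"\n' "$prefix"
  printf 'export LDFLAGS="-L%s/lib"\n' "$prefix"
  printf 'export CPPFLAGS="-I%s/include"\n' "$prefix"
}

# Show the settings for the prefix used in the fish example.
llvm_env /opt/homebrew/opt/llvm

# To apply them in the current shell, letting Homebrew report the prefix:
#   eval "$(llvm_env "$(brew --prefix llvm)")"
```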

Given how helpful those steps were for getting the deps built, probably worth adding to a doc.

Thanks for putting this together!

ocordeiro commented 1 year ago

I compiled it for ARM to work on M1. It is currently available in my project for download at: https://github.com/ocordeiro/chico/tree/master/sqlite-vss

asg017 commented 1 year ago

Alright, I've made many updates and documentation changes in v0.0.2 (2023-04-10) to make compiling on Macs more clear, including:

I opened separate issues for the avx2 compilation error and the missing builds for Mac M1.

Thank you all for your help in this thread, will close in favor of the above issues!