asg017 / sqlite-vss

A SQLite extension for efficient vector search, based on Faiss!
MIT License

Include Mac M1 ARM pre-compiled builds #13

Closed asg017 closed 1 year ago

asg017 commented 1 year ago

Currently, GitHub Actions only has Mac runners for x86_64, so we don't have pre-built loadable extensions for Mac M1 ARM platforms. This affects all distribution platforms, including the extensions on GitHub Releases, npm, pip, and Deno.

It should be possible to compile an M1 build from an x86_64 runner, which I've done for extensions written in Rust and C. But C++ and CMake make this more challenging, and I haven't figured it out yet. Plus, cross-compiling comes with a heavy performance regression, in my experience.

The easiest solution will be when GitHub Actions supports Mac M1 runners, but that's not until Q4 2023. I could also just buy the cheapest M1 Mac mini myself and register it as a self-hosted runner, but I don't have the cash for that (if you're willing to sponsor this, let me know!). I could also use a service like AWS's Mac M1 EC2 instances or MacStadium, but those are expensive and seem complicated to set up with GitHub Actions.

So, the most likely options are:

  1. Figure out cross-compilation for C++/CMake/Faiss on a Mac GitHub Actions runner
  2. Get a sponsor for a physical low-tier Mac M1 mini, and run it as a self-hosted GitHub Actions runner
  3. Wait until the end of the year when GitHub supports Mac M1 runners natively

JpCapdevila commented 1 year ago

Hey @asg017 I would be happy to sponsor the Mac m1, shoot me an email, I can probably ship you one today.

juan@capdevila.me

Thanks for all the cool things you are building.

thomasantony commented 1 year ago

You don't need M1 runners to build wheels for M1 architecture. Something like cibuildwheel can do that using cross compilers.

asg017 commented 1 year ago

I tried a few different cross-compiling techniques for this, but none of them worked well. The biggest issue was getting it to work on a GitHub Actions macOS runner: the way Faiss compiles is super tricky, and I couldn't get it to work. It would be easier for extensions written in plain C or simple C++, but the Faiss dependency definitely complicates things.

Also, even when they work, there's a big performance hit running cross-compiled extensions. It varies based on which cross compiler is used, but since sqlite-vss is performance sensitive, compiling it natively would be best.

thomasantony commented 1 year ago

Okay. I have been able to build it locally on my M1. I have been trying out a few things here: https://github.com/thomasantony/sqlite-vss/tree/fix-ci-builds

No luck so far with the cross compiler as it fails to find "OpenMP" for some reason. Once you have access to an M1 runner, you may want to look at these changes regardless :)

asg017 commented 1 year ago

Thanks for sharing! Yeah, the "can't find OpenMP" error was the main problem when trying to cross compile. The CMakeLists.txt changes you have there to build vendor/sqlite look pretty interesting...

I actually now have a self-hosted M1 runner in the works: download the "sqlite-vss-macos-arm" artifact here to try it out: https://github.com/asg017/sqlite-vss/actions/runs/4672962033

Need to fix a few things before creating an official "release" for it, but M1 pre-built binaries are coming soon!

thomasantony commented 1 year ago

For building it locally, all I had to do was install clang v16 using Homebrew and set the CC environment variable to point to it.

My branch also uses pyproject.toml and scikit-build to sort of initiate the build from the Python side instead of using the Makefile. I am not sure if my changes break the Node.js stuff though.
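
The local build described above (a Homebrew-installed clang plus the CC environment variable) can be sketched as a small helper. This is only an illustration, not the project's build script: the clang path is an assumed Homebrew location on Apple Silicon, and the helper names `build_env` and `run_build` are hypothetical. The build command itself is left as a parameter because the thread does not name it.

```python
import os
import subprocess


def build_env(clang_path, base_env=None):
    """Return a copy of the environment with CC pointing at the given clang."""
    env = dict(os.environ if base_env is None else base_env)
    env["CC"] = clang_path
    return env


def run_build(command, clang_path, cwd="."):
    """Run a build command (a list of arguments) with CC overridden."""
    return subprocess.run(command, cwd=cwd, env=build_env(clang_path), check=True)


if __name__ == "__main__":
    # Assumed location; `brew --prefix llvm` reports the real prefix on a given machine.
    clang = "/opt/homebrew/opt/llvm/bin/clang"
    print(build_env(clang, base_env={})["CC"])
```

Copying the environment rather than mutating `os.environ` keeps the override scoped to the build subprocess.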

asg017 commented 1 year ago

As of v0.0.4, there are now pre-compiled builds of sqlite-vss for Mac M1/M2 computers. Should work with npm install sqlite-vss and pip install sqlite-vss.

https://github.com/asg017/sqlite-vss/releases/tag/v0.0.4

It's built on a self-hosted Mac M1 GitHub Actions runner. If anyone has any trouble running it, please file a new issue!
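
A quick way to check that the pre-compiled build loads after pip install sqlite-vss might look like the sketch below. It assumes the Python package exposes `sqlite_vss.load(db)` and that the extension provides a `vss_version()` SQL function, as I understand the project's README; the helper names `is_apple_silicon` and `vss_version` are hypothetical. Note that a Python interpreter running under Rosetta reports x86_64, so pip may then pick the x86_64 wheel.

```python
import platform
import sqlite3


def is_apple_silicon(system=None, machine=None):
    """True when the running interpreter is a native arm64 build on macOS."""
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


def vss_version():
    """Load sqlite-vss into an in-memory database and return its version.

    Returns None if the sqlite_vss package is not installed.
    """
    try:
        import sqlite_vss  # assumed API: sqlite_vss.load(connection)
    except ImportError:
        return None
    db = sqlite3.connect(":memory:")
    db.enable_load_extension(True)
    sqlite_vss.load(db)
    db.enable_load_extension(False)
    (version,) = db.execute("select vss_version()").fetchone()
    db.close()
    return version


if __name__ == "__main__":
    print("native Apple Silicon interpreter:", is_apple_silicon())
    print("sqlite-vss version:", vss_version())
```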

simonw commented 1 year ago

This doesn't seem to be working for me on an M2 Mac:

% datasette content.db
# ...
sqlite3.OperationalError: dlopen(/Users/simon/.local/pipx/venvs/datasette/lib/python3.11/site-packages/sqlite_vss/vss0.dylib, 0x000A): Library not loaded: /opt/homebrew/opt/llvm/lib/libomp.dylib
  Referenced from: <071103E8-299B-316E-BEBE-736513D2759C> /Users/simon/.local/pipx/venvs/datasette/lib/python3.11/site-packages/sqlite_vss/vss0.dylib
  Reason: tried: '/opt/homebrew/opt/llvm/lib/libomp.dylib' (no s

Here are the package versions I have installed:

% pipx runpip datasette freeze | grep vss
datasette-sqlite-vss==0.0.4
sqlite-vss==0.0.4

asg017 commented 1 year ago

@simonw can you try brew install libomp and try again?
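
Before installing anything, it may help to check which copy of libomp.dylib the loader is actually asking for. The sketch below compares the path from the dlopen error above with the location where Homebrew's libomp formula usually places the library; that second path is an assumption about a default Apple Silicon Homebrew prefix, and `report_libomp` is a hypothetical helper name. If the two paths differ, installing libomp alone may not satisfy the loader.

```python
import os

# Path taken verbatim from the dlopen error message.
EXPECTED_BY_EXTENSION = "/opt/homebrew/opt/llvm/lib/libomp.dylib"
# Assumption: usual location for Homebrew's libomp formula on Apple Silicon.
INSTALLED_BY_LIBOMP_FORMULA = "/opt/homebrew/opt/libomp/lib/libomp.dylib"


def report_libomp(paths, exists=os.path.exists):
    """Map each candidate path to whether it exists on this machine."""
    return {path: exists(path) for path in paths}


if __name__ == "__main__":
    candidates = [EXPECTED_BY_EXTENSION, INSTALLED_BY_LIBOMP_FORMULA]
    for path, found in report_libomp(candidates).items():
        print(("found   " if found else "missing ") + path)
```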

jnward commented 1 year ago

I'm having the same issue as @simonw on my M1 Mac; brew install libomp didn't help. I also tried recompiling Python, to no avail.

asg017 commented 1 year ago

@jnward what does 'brew info libomp' return?

jnward commented 1 year ago

==> libomp: stable 16.0.2 (bottled) [keg-only]
LLVM's OpenMP runtime library
https://openmp.llvm.org/
/opt/homebrew/Cellar/libomp/16.0.2 (7 files, 1.7MB)
  Poured from bottle using the formulae.brew.sh API on 2023-04-28 at 16:53:14
From: https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/libomp.rb
License: MIT
==> Dependencies
Build: cmake ✘, lit ✘
==> Caveats
libomp is keg-only, which means it was not symlinked into /opt/homebrew,
because it can override GCC headers and result in broken builds.

For compilers to find libomp you may need to set:
  export LDFLAGS="-L/opt/homebrew/opt/libomp/lib"
  export CPPFLAGS="-I/opt/homebrew/opt/libomp/include"
==> Analytics
install: 2,795 (30 days), 94,095 (90 days), 1,218,689 (365 days)
install-on-request: 1,079 (30 days), 13,043 (90 days), 151,003 (365 days)
build-error: 0 (30 days)

asg017 commented 1 year ago

Thanks for sharing, will take a closer look in a few days. In the meantime, you can try compiling it yourself and loading vector0 and vss0 manually in Python, which isn't ideal but should unblock you for now. The docs.md file has instructions you can follow.
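
Loading the two compiled extensions manually could look roughly like the sketch below, which uses only the standard library's sqlite3 module. The build directory is a hypothetical placeholder for wherever your compile step writes vector0 and vss0, and the helper names are made up for illustration. It loads vector0 before vss0, on the assumption that vss0 depends on it.

```python
import os
import sqlite3


def extension_paths(build_dir):
    """Return (vector0, vss0) paths without a file suffix.

    SQLite tries platform suffixes such as .dylib or .so on its own.
    """
    return (os.path.join(build_dir, "vector0"), os.path.join(build_dir, "vss0"))


def connect_with_vss(database, build_dir):
    """Open a database and load vector0, then vss0, from build_dir."""
    db = sqlite3.connect(database)
    db.enable_load_extension(True)
    for path in extension_paths(build_dir):
        db.load_extension(path)
    # Turn extension loading back off once both are loaded.
    db.enable_load_extension(False)
    return db


if __name__ == "__main__":
    # "path/to/build" is a placeholder, not a path from the project.
    print(extension_paths("path/to/build"))
```

Some Python builds are compiled without loadable-extension support, in which case `enable_load_extension` is not available and a different interpreter is needed.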

asg017 commented 1 year ago

@jnward alright, I just published v0.0.6-alpha.1, which completely removes the need for brew install libomp (and instead just statically compiles OpenMP into the extension itself).

Could you give that version a try? Should be able to pip install sqlite-vss==0.0.6-alpha.1

jnward commented 1 year ago

Hey this version seems to work, thanks so much!

asg017 commented 1 year ago

Thanks jake! Will publish v0.0.6 shortly, then will re-close this issue. Thanks again for reporting this!

asg017 commented 1 year ago

v0.0.6 is now released with this fixed Mac M1 build. Will also release an unceremonious v0.1.0 minor bump soon.

This issue should now be fixed: Mac M1 pre-compiled builds are now available, and the libomp shared library is no longer required.

Closing again, please let me know if you have any trouble running these extensions on Mac M1!

simonw commented 1 year ago

This now installs cleanly for me on an M2 Mac with datasette install datasette-sqlite-vss.