losfair / mvsqlite

Distributed, MVCC SQLite that runs on FoundationDB.
https://github.com/losfair/mvsqlite/wiki
Apache License 2.0

Use from Python `sqlite3` / creating a standalone `libsqlite3.so` #63

Closed paulreimer closed 2 years ago

paulreimer commented 2 years ago

This is a very exciting project! sqlite is a great interface, and FoundationDB is a great DB.

I'd like to use mvsqlite from Python. It doesn't have to be via the stdlib's import sqlite3 (a different 3rd-party library is fine if it makes the integration easier), but FUSE is not an option in the context I am using (a unikernel). In fact, even if the LD_PRELOAD approach was usable by Python libraries, this unikernel context does not support that. For my use-case, I'd only be using mvsqlite, and I'd be OK if the default sqlite library was overridden so that all sqlite connections use mvsqlite and the regular FS-based sqlite was not available.

So, I'm thinking I'd need a different integration method, which would be along the lines of creating a standalone libsqlite3.so which can override the default system lib. I'm not quite sure how to build something like that, but I could also see it being helpful for other use-cases which are mvsqlite-only and want to avoid LD_PRELOAD.

Does this make sense as a supported use-case? I'm not sure how to do it; I can poke around at modifying the mvsqlite-preload/Makefile for my own purposes, but it could be nice to have support for such an integration method in the repo.

losfair commented 2 years ago

Thanks for bringing up the unikernel use case!

> even if the LD_PRELOAD approach was usable by Python libraries, this unikernel context does not support that.

I wonder what kind of dynamic linking this unikernel supports? It's hard to imagine a case where loading a .so is supported but LD_PRELOAD is not...

paulreimer commented 2 years ago

Excitingly, I was wrong about LD_PRELOAD not being supported -- I'm able to use that successfully to load libmvsqlite_preload.so and use mvsqlite in this unikernel context! (though it would be nice if I could build a libsqlite3.so that is mvsqlite-only, and use that when building native executables)

So, on the Python sqlite3 front, I think I just need to figure out how to get something like LD_PRELOAD working with the module's dynamic library.

paulreimer commented 2 years ago

Even more excitingly, LD_PRELOAD also works fine with Python, for overriding the sqlite3 module!
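For readers who want to reproduce this, here is a minimal sketch of launching a Python script with the preload library in place. The dynamic loader reads LD_PRELOAD when a process starts, so setting it inside an already running interpreter has no effect; the sketch therefore starts a child interpreter. The helper names `preload_env` and `run_with_preload` and the paths in the usage note are hypothetical, not part of mvsqlite.

```python
import os
import subprocess
import sys


def preload_env(lib_path, base_env=None):
    """Return a copy of the environment with lib_path prepended to LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # LD_PRELOAD accepts a colon-separated list; keep anything already set.
    env["LD_PRELOAD"] = f"{lib_path}:{existing}" if existing else lib_path
    return env


def run_with_preload(lib_path, script):
    """Run `script` in a child interpreter so the loader sees LD_PRELOAD at startup."""
    return subprocess.run(
        [sys.executable, script],
        env=preload_env(lib_path),
        check=True,
    )
```

A call such as `run_with_preload("/opt/mvsqlite/libmvsqlite_preload.so", "app.py")` (both paths are placeholders) would then run `app.py` with the stdlib sqlite3 module resolving its SQLite symbols through the preloaded library.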

I guess I could amend/reword this issue to be specifically about building a standalone libsqlite3.so, which only supports mvsqlite but doesn't require LD_PRELOAD? I would use such a thing if it existed, but I am currently able to make progress without it.

paulreimer commented 2 years ago

Also, BTW, I had to run execstack -c libmvsqlite_preload.so to clear the ELF executable stack flag. I haven't seen any issues yet from doing this, so maybe it should be done on the official/released preload library?
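To check whether a given build of the library still requests an executable stack, one option is to read the flag directly from the ELF program headers. The sketch below is an illustration only: it assumes a little-endian ELF64 file, and `has_exec_stack` is a hypothetical helper name. It looks for the PT_GNU_STACK program header, which is where the flag cleared by execstack -c is stored.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type that carries stack permissions
PF_X = 0x1                 # "executable" bit in p_flags


def has_exec_stack(data):
    """Return True/False for the PT_GNU_STACK exec bit, or None if the header is absent.

    Sketch only: handles little-endian ELF64 images.
    """
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("only little-endian ELF64 is handled in this sketch")
    # ELF64 header fields that follow the 16-byte e_ident array.
    fields = struct.unpack_from("<HHIQQQIHHHHHH", data, 16)
    e_phoff, e_phentsize, e_phnum = fields[4], fields[8], fields[9]
    for i in range(e_phnum):
        # In ELF64 program headers, p_type and p_flags are the first two words.
        p_type, p_flags = struct.unpack_from("<II", data, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            return bool(p_flags & PF_X)
    return None
```

Usage would look like `has_exec_stack(open("libmvsqlite_preload.so", "rb").read())`. A result of `None` means the file has no PT_GNU_STACK entry at all, in which case the effective default depends on the platform.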

losfair commented 2 years ago

Not sure why the shared library has executable stack enabled - maybe related to the Rust->C linking process?

Fixed by adding a linker option to force noexecstack. Doesn't seem to break anything.

losfair commented 2 years ago

> I guess I could amend/reword this issue to be specifically about building a standalone libsqlite3.so, which only supports mvsqlite but doesn't require LD_PRELOAD?

Currently this should work if you link libsqlite3.so itself dynamically against libmvsqlite_preload.so by adding the -lmvsqlite_preload option when building it.

Building everything into one library is more complex, though: I don't know of a way to override ELF symbols at build time (unless we post-process the ELF with custom tools).

losfair commented 2 years ago

Released v0.1.18 with noexecstack enabled.

losfair commented 2 years ago

https://github.com/losfair/mvsqlite/pull/68 added a patched libsqlite3.so build target. This is the mvSQLite build that does not require LD_PRELOAD.

v0.1.18-1 includes this update.

paulreimer commented 2 years ago

Thanks for those changes! I'll test out the new patched library shortly.

One thing I have been thinking about: in terms of the library name, maybe it should be libmvsqlite.so (or libmvsqlite3.so)? (to avoid possible confusion with existing libsqlite3.so files).

For my current use-case w/Python, I would take that file and rename it to libsqlite3.so.0, but when building native executables, it would be just as easy to use -lmvsqlite instead of -lsqlite3
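When swapping in a renamed library like this, it can be useful to confirm which SQLite shared object the Python process actually loaded. A minimal sketch, assuming a Linux-style /proc filesystem is available (which may not hold in every unikernel); the function names are hypothetical:

```python
import os


def sqlite_libs_from_maps(maps_text):
    """Extract mapped file paths whose file name mentions 'sqlite'."""
    libs = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) == 6 and "sqlite" in os.path.basename(fields[5]):
            libs.add(fields[5].strip())
    return sorted(libs)


def loaded_sqlite_libs():
    """Import sqlite3, then list the sqlite-related objects mapped into this process."""
    import sqlite3  # noqa: F401  (forces the extension module and its library to load)
    with open("/proc/self/maps") as f:
        return sqlite_libs_from_maps(f.read())
```

The result normally includes Python's own `_sqlite3` extension module as well as the `libsqlite3` it resolved to, so the path of the latter shows whether the replacement file was picked up.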

paulreimer commented 2 years ago

I am happy to report that the patched lib works great! No LD_PRELOAD needed, and the noexecstack also works as expected.

Not sure if I missed a step in building (I was manually following the "Build (patched libsqlite3)" steps in ci.yml); I found that I had to copy sqlite3.h from the SQLite amalgamation into mvsqlite-preload/, and add -I. to the build-patched-sqlite3 Makefile target command. But personally, I'll likely download/use the released binary in a subsequent release that has it.

paulreimer commented 2 years ago

Also, I think "Python" could be added to the "App and Library Compatibility Table"; I've successfully used Python 3.10 -- via the stdlib's sqlite3 library -- with both VFS and FUSE.
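As an illustration of the kind of check this implies, here is a small smoke test using only the stdlib sqlite3 module. It is a sketch, not the test used in this thread: the `smoke_test` function and the `kv` table are hypothetical, and when run against mvsqlite the `database` argument and any required environment configuration should follow the project's documentation.

```python
import sqlite3


def smoke_test(database):
    """Write one row and read it back through the stdlib sqlite3 module."""
    conn = sqlite3.connect(database)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
        conn.execute(
            "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", ("hello", "world")
        )
        conn.commit()
        row = conn.execute("SELECT v FROM kv WHERE k = ?", ("hello",)).fetchone()
        return row[0]
    finally:
        conn.close()


if __name__ == "__main__":
    # Report the SQLite library version the interpreter is linked against.
    print("SQLite library version:", sqlite3.sqlite_version)
    print("round trip value:", smoke_test(":memory:"))
```

The code is identical whether the underlying library is stock SQLite or a replacement; only the library that the interpreter loads differs.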

losfair commented 2 years ago

Thanks for the testing! Updated the docs.

paulreimer commented 2 years ago

A standalone libsqlite3.so is included in the release binaries now (I'm using it w/Python and it works great!), thanks for adding that -- I'm finding it very helpful, and it is now my preferred deployment strategy!