utelle / SQLite3MultipleCiphers

SQLite3 encryption extension with support for multiple ciphers
https://utelle.github.io/SQLite3MultipleCiphers/
MIT License
390 stars 73 forks source link

How to compile for WASM? #129

Closed dlindenkreuz closed 10 months ago

dlindenkreuz commented 10 months ago

I saw that #88 adds support for WASM targets. However, I am not able to compile SQLite3-MC using these steps:

# ... prepare emscripten environment according to https://sqlite.org/wasm/doc/trunk/emscripten.md

$ git clone git@github.com:sqlite/sqlite.git --depth=1 --branch=version-3.44.0
$ cd sqlite
$ ./configure --enable-all

# use readymade sqlite3-mc amalgamation
$ cp ../sqlite3mc-1.7.4-sqlite-3.44.0-amalgamation/sqlite3mc_amalgamation.c sqlite3.c
$ cp ../sqlite3mc-1.7.4-sqlite-3.44.0-amalgamation/sqlite3mc_amalgamation.h sqlite3.h

$ cd ext/wasm
$ make

First, I ran into this error:

In file included from api/sqlite3-wasm.c:214:
./../../sqlite3.c:120400:17: error: no member named 'nTableLock' in 'struct Parse'
 120400 |     if( pParse->nTableLock>0 && db->init.busy==0 ){

I found this issue and added #define SQLITE_USER_AUTHENTICATION 0 to sqlite3.c to turn off the conflicting feature flags.

This got me to the point where the linker fails:

using emcc version [3.1.49]
This is an SEE build.
Development build. Use '/Applications/Xcode.app/Contents/Developer/usr/bin/make release' for a smaller release build.
Setting up build [sqlite3-vanilla]:  ./jswasm/sqlite3.js
Setting up build [sqlite3-esm]:  ./jswasm/sqlite3.mjs
Setting up build [sqlite3-bundler-friendly]: ./jswasm/sqlite3-bundler-friendly.mjs
Setting up build [sqlite3-node]: ./jswasm/sqlite3-node.mjs
cc -O0 -o c-pp c-pp.c ../../sqlite3.c '-DCMPP_DEFAULT_DELIM="//#"' -I../.. \
        -DSQLITE_OMIT_LOAD_EXTENSION -DSQLITE_OMIT_DEPRECATED -DSQLITE_OMIT_UTF16 \
        -DSQLITE_OMIT_SHARED_CACHE -DSQLITE_OMIT_WAL -DSQLITE_THREADSAFE=0 \
        -DSQLITE_TEMP_STORE=3
ld: Undefined symbols:
  _SecRandomCopyBytes, referenced from:
      _entropy in sqlite3-f1a153.o
  _kSecRandomDefault, referenced from:
      _entropy in sqlite3-f1a153.o
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [c-pp] Error 1

My build environment:

cc @nullrocket — how did you manage to build SQLite3-MC for WASM?

dlindenkreuz commented 10 months ago

Huh, I managed to get a working build by patching sqlite3/ext/wasm/GNUmakefile in addition to the changes above:

bin.c-pp := ./c-pp
$(bin.c-pp): c-pp.c $(sqlite3.c) $(MAKEFILE)
    $(CC) -O0 -o $@ c-pp.c $(sqlite3.c) '-DCMPP_DEFAULT_DELIM="//#"' -I$(dir.top) \
        -DSQLITE_OMIT_LOAD_EXTENSION -DSQLITE_OMIT_DEPRECATED -DSQLITE_OMIT_UTF16 \
        -DSQLITE_OMIT_SHARED_CACHE -DSQLITE_OMIT_WAL -DSQLITE_THREADSAFE=0 \
-       -DSQLITE_TEMP_STORE=3
+       -DSQLITE_TEMP_STORE=3 -framework Security -framework Foundation

This is quite hacky though. Would appreciate some guidance if there is a nicer way!

Inspiration / clues taken from:

utelle commented 10 months ago

I saw that #88 adds support for WASM targets.

For #88 the code related to generating random bytes was adjusted to select code based on the preprocessor symbol __WASM__. That happened in September 2022. In October 2023 I introduced another modification, namely to use SecRandomCopyBytes() on Apple platforms. The code checks first for the preprocessor symbol __APPLE__ and only if that is not defined the symbol __WASM__ will be checked.

However, I am not able to compile SQLite3-MC using these steps:

# ... prepare emscripten environment according to https://sqlite.org/wasm/doc/trunk/emscripten.md

$ git clone git@github.com:sqlite/sqlite.git --depth=1 --branch=version-3.44.0
$ cd sqlite
$ ./configure --enable-all

# use readymade sqlite3-mc amalgamation
$ cp ../sqlite3mc-1.7.4-sqlite-3.44.0-amalgamation/sqlite3mc_amalgamation.c sqlite3.c
$ cp ../sqlite3mc-1.7.4-sqlite-3.44.0-amalgamation/sqlite3mc_amalgamation.h sqlite3.h

$ cd ext/wasm
$ make

First, I ran into this error:

In file included from api/sqlite3-wasm.c:214:
./../../sqlite3.c:120400:17: error: no member named 'nTableLock' in 'struct Parse'
 120400 |     if( pParse->nTableLock>0 && db->init.busy==0 ){

I found this issue and added #define SQLITE_USER_AUTHENTICATION 0 to sqlite3.c to turn off the conflicting feature flags.

Yes, it is known that the user authentication extension has conflicts with certain other SQLite configuration options. Since this extension isn't useful in the WASM context anyway, it is the right approach to disable it.

This got me to the point where the linker fails:

using emcc version [3.1.49]
This is an SEE build.
Development build. Use '/Applications/Xcode.app/Contents/Developer/usr/bin/make release' for a smaller release build.
Setting up build [sqlite3-vanilla]:  ./jswasm/sqlite3.js
Setting up build [sqlite3-esm]:  ./jswasm/sqlite3.mjs
Setting up build [sqlite3-bundler-friendly]: ./jswasm/sqlite3-bundler-friendly.mjs
Setting up build [sqlite3-node]: ./jswasm/sqlite3-node.mjs
cc -O0 -o c-pp c-pp.c ../../sqlite3.c '-DCMPP_DEFAULT_DELIM="//#"' -I../.. \
      -DSQLITE_OMIT_LOAD_EXTENSION -DSQLITE_OMIT_DEPRECATED -DSQLITE_OMIT_UTF16 \
      -DSQLITE_OMIT_SHARED_CACHE -DSQLITE_OMIT_WAL -DSQLITE_THREADSAFE=0 \
      -DSQLITE_TEMP_STORE=3
ld: Undefined symbols:
  _SecRandomCopyBytes, referenced from:
      _entropy in sqlite3-f1a153.o
  _kSecRandomDefault, referenced from:
      _entropy in sqlite3-f1a153.o
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [c-pp] Error 1

Obviously, the preprocessor symbol __APPLE__ is defined and therefore SecRandomCopyBytes() is selected to generate truely random bytes. On Apple platforms you need to specify the linker option -framework Security to get rid of the linker error. However, I don't know whether this will work in conjunction with WASM.

In my project SQLite3MultipleCiphers-cb the GitHub workflow includes builds for WASM platforms. Maybe you take a look...

utelle commented 10 months ago

I didn't see your new post, before I submitted my previous answer.

Huh, I managed to get a working build by patching sqlite3/ext/wasm/GNUmakefile in addition to the changes above:

bin.c-pp := ./c-pp
$(bin.c-pp): c-pp.c $(sqlite3.c) $(MAKEFILE)
  $(CC) -O0 -o $@ c-pp.c $(sqlite3.c) '-DCMPP_DEFAULT_DELIM="//#"' -I$(dir.top) \
      -DSQLITE_OMIT_LOAD_EXTENSION -DSQLITE_OMIT_DEPRECATED -DSQLITE_OMIT_UTF16 \
      -DSQLITE_OMIT_SHARED_CACHE -DSQLITE_OMIT_WAL -DSQLITE_THREADSAFE=0 \
-     -DSQLITE_TEMP_STORE=3
+     -DSQLITE_TEMP_STORE=3 -framework Security -framework Foundation

Adding -framework Security should be enough.

This is quite hacky though. Would appreciate some guidance if there is a nicer way!

Hm, the autoconf/automake files coming with SQLite3 Multiple Ciphers include the flag -framework Security. However, you used the WASM build files from SQLite. And because SQLite itself doesn't make use of SecRandomCopyBytes() it doesn't add this flag.

If the preprocessor symbol __WASM__ is defined in your environment, then we might get away with changing the order of checks in the SQLite3 Multiple Ciphers code to check for the symbol __WASM__ first.

nullrocket commented 10 months ago

@dlindenkreuz It has been a while since I worked on it and it was specifically to interface browser flavored wasm to dart in an extension to another flutter package, and probably a bit older version of sqlite3mc. The APPLE preprocessor symbol didn't exist then, or if it did it is possible I never compiled on my mac for wasm.

https://github.com/thinkassembly/sqlite3mc.dart

You could maybe figure out what you need to build though from these files, but you would have to adjust it to actually work for your environment, also you need a need a VFS, the one in this project is interfacing with dart / javascript and a custom VFS.

If you need to work with it in rust I have a build for rusqlite to use sqlite3mc in wasm and natve hanging around somewhere that is probably a little clearer but still using a dart / javascript shim for the VFS

These are the three files that probably get you somewhere as far as building. https://github.com/thinkassembly/sqlite3mc.dart/blob/main/packages/sqlite3mc_drift_database/Makefile

https://github.com/thinkassembly/sqlite3mc.dart/blob/main/packages/sqlite3mc_drift_database/wasm/assets/sqlite3mc-wasm/CMakeLists.txt

https://github.com/thinkassembly/sqlite3mc.dart/blob/main/packages/sqlite3mc_drift_database/wasm/assets/sqlite3mc-wasm/toolchain.cmake

dlindenkreuz commented 10 months ago

Thanks for the pointers, @utelle and @nullrocket! I managed to get an error-free build using these patches, no linking to Apple frameworks necessary (which makes a lot of sense 😉):

Details

```patch diff --git a/sqlite3mc_amalgamation.c b/sqlite3mc_amalgamation.c index a2b1b4e67..25d4583c5 100644 --- a/sqlite3mc_amalgamation.c +++ b/sqlite3mc_amalgamation.c @@ -1,3 +1,6 @@ +#undef __APPLE__ +#define SQLITE_USER_AUTHENTICATION 0 + /* ** Name: sqlite3mc.c ** Purpose: Amalgamation of the SQLite3 Multiple Ciphers encryption extension for SQLite ``` ```patch diff --git a/ext/wasm/GNUmakefile b/ext/wasm/GNUmakefile index c0cab212d..6729d6a54 100644 --- a/ext/wasm/GNUmakefile +++ b/ext/wasm/GNUmakefile @@ -292,7 +292,7 @@ $(bin.c-pp): c-pp.c $(sqlite3.c) $(MAKEFILE) $(CC) -O0 -o $@ c-pp.c $(sqlite3.c) '-DCMPP_DEFAULT_DELIM="//#"' -I$(dir.top) \ -DSQLITE_OMIT_LOAD_EXTENSION -DSQLITE_OMIT_DEPRECATED -DSQLITE_OMIT_UTF16 \ -DSQLITE_OMIT_SHARED_CACHE -DSQLITE_OMIT_WAL -DSQLITE_THREADSAFE=0 \ - -DSQLITE_TEMP_STORE=3 + -DSQLITE_TEMP_STORE=3 -D__WASM__=1 define C-PP.FILTER # Create $2 from $1 using $(bin.c-pp) # $1 = Input file: c-pp -f $(1).js ```

I was also able to expose sqlite3mc_config and other functions:

Details

```patch diff --git a/ext/wasm/api/EXPORTED_FUNCTIONS.sqlite3-see b/ext/wasm/api/EXPORTED_FUNCTIONS.sqlite3-see index 83f3a97db..a75dfa684 100644 --- a/ext/wasm/api/EXPORTED_FUNCTIONS.sqlite3-see +++ b/ext/wasm/api/EXPORTED_FUNCTIONS.sqlite3-see @@ -3,3 +3,10 @@ _sqlite3_key_v2 _sqlite3_rekey _sqlite3_rekey_v2 _sqlite3_activate_see +_sqlite3mc_config +_sqlite3mc_config_cipher +_sqlite3mc_codec_data +_sqlite3mc_cipher_count +_sqlite3mc_cipher_index +_sqlite3mc_cipher_name +_sqlite3mc_register_cipher ``` ```patch diff --git a/ext/wasm/api/sqlite3-api-glue.js b/ext/wasm/api/sqlite3-api-glue.js index f23a02366..5d5f060f8 100644 --- a/ext/wasm/api/sqlite3-api-glue.js +++ b/ext/wasm/api/sqlite3-api-glue.js @@ -317,7 +317,8 @@ globalThis.sqlite3ApiBootstrap.initializers.push(function(sqlite3){ ["sqlite3_value_type", "int", "sqlite3_value*"], ["sqlite3_vfs_find", "*", "string"], ["sqlite3_vfs_register", "int", "sqlite3_vfs*", "int"], - ["sqlite3_vfs_unregister", "int", "sqlite3_vfs*"] + ["sqlite3_vfs_unregister", "int", "sqlite3_vfs*"], + ["sqlite3mc_config", "int", "sqlite3*", "string", "int"] + // other function signatures as desired... ]/*wasm.bindingSignatures*/; if(false && wasm.compileOptionUsed('SQLITE_ENABLE_NORMALIZE')){ ```

I now get obscure errors at runtime in the browser when I open a new database (using OPFS), then db.exec("PRAGMA cipher = 'chacha20'; PRAGMA rekey = 'secret';").

RuntimeError: null function or function signature mismatch

When I use the CAPI instead of PRAGMA statements, I don't get any errors but the resulting DB file is simply not encrypted at all after inserting some rows. Queries and other operations work as expected.

// ... open db ...
sqlite.capi.sqlite3mc_config(db, "cipher", 3)
sqlite.capi.sqlite3_rekey(db, "secret", 6)

All solutions I found in the wild use different toolchains in different contexts. I am rather new to WASM / C build toolchains and I am getting lost in the complexity of the SQLite project — the above is the best I could come up with for now based on the official SQLite codebase.

@utelle Is there any interest in making this extension officially compatible with SQLite WASM builds? It would be great if one could drop in SQLite-MC as easily as the official SQLite-SEE.

utelle commented 10 months ago

Thanks for the pointers, @utelle and @nullrocket! I managed to get an error-free build using these patches, no linking to Apple frameworks necessary (which makes a lot of sense 😉):

Yes, indeed. For WASM linking to the Apple frameworks doesn't make sense. However, it shouldn't be necessary to undefine the __APPLE__ symbol. This could be tested by changing the order in which the symbols are checked. Currently, the chain begins with __APPLE_. Moving the __WASM__ check to the beginning of the chain would most likely solve the issue.

I was also able to expose sqlite3mc_config and other functions:

Ok, that seems to be the easy part.

In most cases the C API of sqlite3mc is not necessary, because almost all functionality can be accessed via SQL (pragmas or functions).

I now get obscure errors at runtime in the browser when I open a new database (using OPFS), then db.exec("PRAGMA cipher = 'chacha20'; PRAGMA rekey = 'secret';").

RuntimeError: null function or function signature mismatch

This message is too generic to be helpful. Of course, the code handling the pragma commands calls internally other C functions. I don't know anything about the internals of SQLite's WASM extension. Therefore I have no idea what's causing this message. In any case it is not a message from the SQLite WASM code.

When I use the CAPI instead of PRAGMA statements, I don't get any errors but the resulting DB file is simply not encrypted at all after inserting some rows. Queries and other operations work as expected.

sqlite3mc implements a VFS shim which sits between the SQLite core and the real VFS. When the SQLite library is initialized, the sqlite3mc VFS shim is combined with the current default VFS and the resulting VFS is made the new default. If WASM brings its own VFS and if that VFS was not the default when the library was initialized, encryption will not be used.

The encryption VFS can be enabled via URI parameters. A database URI would then look like this: file:dbfilename.db?vfs=multipleciphers-realvfsname.

All solutions I found in the wild use different toolchains in different contexts. I am rather new to WASM / C build toolchains and I am getting lost in the complexity of the SQLite project — the above is the best I could come up with for now based on the official SQLite codebase.

Up to now I never compiled or used SQLite WASM. So, I don't know anything more than you. I only know that there are developers out there that managed to get sqlite3mc compiled properly for WASM.

@utelle Is there any interest in making this extension officially compatible with SQLite WASM builds? It would be great if one could drop in SQLite-MC as easily as the official SQLite-SEE.

If I understood it correctly, the official SQLite WASM extension uses the SQLite amalgamation source. And the amalgamation can be replaced by the sqlite3mc amalgamation - just as it seems to be done for the official SQLite-SEE variant.

What needs to be understood is which VFS is actually used for WASM and how the sqlite3mc VFS shim can be combined with it properly.

In principle, I'm open to support SQLite's WASM builds, but at the moment I have no idea how much effort this may require.

utelle commented 10 months ago

In the meantime I made a few experiments with SQLite WASM.

I followed the instructions on the SQLite website to set up the development environment.

  1. I copied the SQLite3 Multiple Ciphers amalgamation source code in the top directory and renamed the file to sqlite3-see.c.
  2. I defined the symbols SQLITE_USER_AUTHENTICATION=0 (omit user authentication) and SQLITE3MC_OMIT_AES_HARDWARE_SUPPORT (omit AES hardware support)
  3. For my first tests I didn't modify the C API definitions.
  4. I invoked make in the directory ext/wasm.

The second message on screen was This is an SEE build.

Finally, I created a distribution package for testing.

Invoking the tester 1 core test resulted in the following browser output:

grafik

In the last line you see the list of VFSs, where the first one is the default VFS (here: multipleciphers-unix-none).

Thereafter I modified the demo demo-123 by adding a few cipher related output. Invoking the demo app resulted in the following browser output:

grafik

I verified that the default VFS is used by the connected database. I could query the current cipher via pragma and user-defined function (UDF). And I could set up a key for encryption.

What I don't know is, how I can inspect the actual database "file" used by the sample app.

dlindenkreuz commented 10 months ago

How exciting! 😎

Inspecting the OPFS is not straightforward. Right now, no browser other than Chrome supports this, and only via an extension. The extension adds a new tab to the Web Inspector. You can then download a file by clicking on its entry in the list.

utelle commented 10 months ago

Inspecting the OPFS is not straightforward. Right now, no browser other than Chrome supports this, and only via an extension. The extension adds a new tab to the Web Inspector.

I installed Chrome and the OPFS extension. However, with the WASM demo I didn't see any file entries in the OPFS explorer. The reason is certainly that one needs to use the "opfs" VFS to actually see file entries. I have to find out, how I can specify the VFS on opening the database in the demo app.

You can then download a file by clicking on its entry in the list.

In principle this works as I tested it with an OPFS sample application.

utelle commented 10 months ago

Well, specifying a specific VFS on opening a database is quite simple. And combining the sqlite3mc VFS shim with VFSs (like unix, unix-dotfile etc) which are implemented in pure C code works without problems.

However, the attempt to combine the VFS shim with the opfs VFS (which is a pure Javascript VFS) fails with the message

Exception: null function or function signature mismatch

And the attempt to combine the VFS shim with the kvvfs VFS (which is a hybrid C/Javascript VFS) fails with the message

Exception: SQLITE_ERROR: sqlite3 result code 1: no such vfs: multipleciphers-kvvfs

At the moment I don't know whether combining the VFS shim with the opfs or kvvfs VFS is possible, and if yes, what needs to be done to make it work.

utelle commented 10 months ago

Combining the kvvfs VFS with the encryption VFS shim is possible. I had overlooked that there can only be 2 different databases with fixed names (namely local or session) for the kvvfs VFS. Using one of the allowed names creates an encrypted database.

utelle commented 10 months ago

Instantiating the VFS shim for the real VFS opfs doesn't seem to be the culprit. At least I don't get an error, if I do it directly via the capi. However, trying to open a database connection throws the exception null function or function signature mismatch - it happens within the C function sqlite3_open_v2, but so far I was not able to find out, which function pointer is invalid and for what reason.

dlindenkreuz commented 10 months ago

find out, which function pointer is invalid and for what reason

Maybe this helps? https://github.com/emscripten-core/emscripten/issues/16305

utelle commented 10 months ago

find out, which function pointer is invalid and for what reason

Maybe this helps? emscripten-core/emscripten#16305

Thanks for the pointer. When I submitted my previous post, I already got as far as seeing the indirect function call, but didn't know how to interpret the value of the function pointer.

utelle commented 10 months ago

Maybe this helps? emscripten-core/emscripten#16305

In the end it didn't help that much, because the function pointer in question was indeed null. I resorted to a debug method of the old days when punch cards and printers were the only tools at hand: I added print statements at various places in my C code. 😄

This allowed me to track down what went wrong when and where. The answer to the problem is that the opfs implementation doesn't implement one of the I/O methods of a VFS, namely method xSectorSize. My own VFS implementation is "optimistic" and doesn't check the method pointer of the underlying VFS. SQLite behaves the same ... with exactly one exception: method xSectorSize for which SQLite checks for a null pointer. Because the method pointer of my VFS is not null, SQLite calls the method ... and that then leads to a crash.

I fixed this in my local development environment ... and now asking for an encrypted opfs works. 🎉

utelle commented 10 months ago

Commit https://github.com/utelle/SQLite3MultipleCiphers/commit/987b1e914f9b8f41f5a544ebf733083bdc1076e0 provides the necessary modifications.

utelle commented 10 months ago

@dlindenkreuz How should SQLite3 Multiple Ciphers support WASM builds in the future?

  1. Providing just a detailed description which steps are required to build WASM, complementing the original SQLite documentation
  2. Providing the modfied files as "drop-in" replacements for the original files (which may differ depending on the SQLite version)
  3. Providing prebuilt binaries (like on the SQLite download page under the heading WebAssembly & JavaScript)

Or in another different way?

dlindenkreuz commented 10 months ago

Commit https://github.com/utelle/SQLite3MultipleCiphers/commit/987b1e914f9b8f41f5a544ebf733083bdc1076e0 provides the necessary modifications.

Awesome, thanks for taking care of this!

"drop-in" replacements for the original files

Isn't this already covered by the pre-compiled amalgamation?

Providing prebuilt binaries

Eventually, the most convenient way to consume the WASM build is via npm. SQLite provides an official npm package that web devs can simply import in JavaScript: https://www.npmjs.com/package/@sqlite.org/sqlite-wasm

All of these options are helpful. Why not start with (1)? Docs are always good to have, even more so for a complex build procedure.

utelle commented 10 months ago

Commit 987b1e9 provides the necessary modifications.

Awesome, thanks for taking care of this!

A few more (minor) modifications may be necessary to further facilitate building the WASM module.

"drop-in" replacements for the original files

Isn't this already covered by the pre-compiled amalgamation?

That depends on what the goal is. If a simple "drop-in" replacement is sufficient, this can be done without changing any other file of the original SQLite WASM files (maybe except for adding a single additional preprocessor symbol to the compile step). If the additional sqlite3mc API functions should be available, too, then 2 other files need to be adjusted.

Providing prebuilt binaries

Eventually, the most convenient way to consume the WASM build is via npm. SQLite provides an official npm package that web devs can simply import in JavaScript: https://www.npmjs.com/package/@sqlite.org/sqlite-wasm

Well, I have literally no experience with npm. So, I'm probably not the right person to do the packaging.

All of these options are helpful. Why not start with (1)? Docs are always good to have, even more so for a complex build procedure.

Ok. I will write some documentation.

utelle commented 10 months ago

Today I made a first release that includes WASM support including encryption.

A (short) description of the WASM build process has been added to the SQLite3 Multiple Ciphers website.

Most likely, the WASM-related documentation will have to be extended in the future, but when and to which degree will depend on the feedback I get.