nalgeon / sqlean

The ultimate set of SQLite extensions
MIT License

Golang bindings and amalgamation support #92

Closed riyaz-ali closed 1 year ago

riyaz-ali commented 1 year ago

Hi 👋 First of all, thanks for this super amazing library and set of extensions! Great work 👏

Inspired by sqlean.py, and based on the code and discussion under #69, #79, #82 and #84, I started the sqlean.go project to provide Go bindings for sqlean.

There I'm generating an amalgamation build, sqlean.c, using the script under tools/amalgamate.go.

Here are a few issues I ran into when trying to build the amalgamated sources:

  1. I get a redefinition error for the symbol utf8_lookup. There seem to be two static global variables, one in src/fuzzy/translit.c and one in src/unicode/extension.c, that share the same name.

  2. There's a stray #ifdef __cplusplus in src/unicode/extension.c, presumably meant to close a prior extern "C" { block, but without any matching opening clause. I caught this when skimming through the generated file. It might cause unexpected build failures under a C++ compiler.

  3. I'm also having some difficulty compiling src/regexp (because of pcre2), but I think that can be overcome by laying out the amalgamation file more intelligently. I'll give it a shot and update.

For now, to go ahead with what's working, I've disabled src/fuzzy and src/regexp.
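The first issue in the list above (two file-scope statics with the same name ending up in one translation unit) can be caught before compiling. The following is a minimal sketch, not code from tools/amalgamate.go: the function name findStaticCollisions and the regular expression are assumptions made for illustration, and the regex is a line-based heuristic rather than a C parser. The file contents in main are toy stand-ins, not the real sources.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// staticDecl matches lines that start with "static" in column zero and
// captures the identifier right before '[', '(', '=' or ';'.
// Heuristic only: indented (function-local) statics are deliberately ignored.
var staticDecl = regexp.MustCompile(`(?m)^static\b[^;={(\[\n]*?\b([A-Za-z_]\w*)\s*[\[(=;]`)

// findStaticCollisions (hypothetical helper) returns every identifier that is
// defined as a file-scope static in more than one source, mapped to the
// sorted list of files that define it.
func findStaticCollisions(sources map[string]string) map[string][]string {
	seen := make(map[string]map[string]bool) // identifier -> set of files
	for file, src := range sources {
		for _, m := range staticDecl.FindAllStringSubmatch(src, -1) {
			name := m[1]
			if seen[name] == nil {
				seen[name] = make(map[string]bool)
			}
			seen[name][file] = true
		}
	}
	collisions := make(map[string][]string)
	for name, files := range seen {
		if len(files) < 2 {
			continue // unique within the set, safe to concatenate
		}
		for f := range files {
			collisions[name] = append(collisions[name], f)
		}
		sort.Strings(collisions[name])
	}
	return collisions
}

func main() {
	// Toy stand-ins for the two files named in the issue.
	sources := map[string]string{
		"src/fuzzy/translit.c":    "static const int utf8_lookup[] = {1, 2};\nstatic int translit_len(void) { return 2; }\n",
		"src/unicode/extension.c": "static const int utf8_lookup[] = {3, 4};\n",
	}
	for name, files := range findStaticCollisions(sources) {
		fmt.Printf("%s is defined in %v\n", name, files)
	}
}
```

Running a check like this over the extension sources before concatenating them would turn a compiler redefinition error into an explicit list of names to resolve.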

nalgeon commented 1 year ago

Thank you very much, Riyaz! I hope sqlean will be of some use to you.

The project is not intended to be amalgamated. That would require globally unique symbol names, which makes them pretty ugly.
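To make the trade-off concrete, here is a minimal sketch of what "globally unique symbol names" could mean for an amalgamation script: every colliding identifier gets a per-module prefix. This is not how sqlean handles it; prefixSymbol and the "fuzzy" prefix are hypothetical, and the rewrite is purely textual, so it would also touch matches inside comments and string literals.

```go
package main

import (
	"fmt"
	"regexp"
)

// prefixSymbol (hypothetical helper) renames every whole-word occurrence of
// name in src to prefix + "_" + name. Identifiers that merely contain name,
// such as my_utf8_lookup, are left alone because '_' is a word character.
func prefixSymbol(src, name, prefix string) string {
	re := regexp.MustCompile(`\b` + regexp.QuoteMeta(name) + `\b`)
	return re.ReplaceAllLiteralString(src, prefix+"_"+name)
}

func main() {
	// Toy C snippet, not the real contents of src/fuzzy/translit.c.
	src := "static const int utf8_lookup[] = {1, 2};\n" +
		"int first(void) { return utf8_lookup[0]; }\n"
	fmt.Print(prefixSymbol(src, "utf8_lookup", "fuzzy"))
}
```

The renamed identifiers are unambiguous but noticeably longer, which is the readability cost mentioned above.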

But you don't need an amalgamation to compile sqlean together with sqlite. My projects nalgeon/sqlite and nalgeon/sqlean.py do this just fine without amalgamation. See the Makefile for an example:

curl -L https://github.com/sqlite/sqlite/raw/master/src/test_windirent.h --output src/test_windirent.h
curl -L http://sqlite.org/$(SQLITE_RELEASE_YEAR)/sqlite-amalgamation-$(SQLITE_VERSION).zip --output sqlite.zip
unzip sqlite.zip

# ...

curl -L https://github.com/nalgeon/sqlean/archive/refs/tags/$(SQLEAN_VERSION).zip --output sqlean.zip
unzip sqlean.zip
mv sqlean-$(SQLEAN_VERSION)/src/* src
cat src/sqlite3.c init/initsqlite.c > src/sqlean.c
cat src/shell.c init/initshell.c > src/sqleanshell.c
cp init/init.h src

# ...

gcc -O1 -Isrc $(SQLITE_OPT) $(SQLEAN_OPT) $(SQLEAN_INC) $(SQLEAN_SRC) -o dist/sqlean-ubuntu $(LINK_LIB)

And here is how sqlean.py does it (abridged for brevity):

gcc -fPIC -Isqlite -I/usr/include -I/opt/hostedtoolcache/Python/3.11.4/x64/include/python3.11 -c sqlite/sqlean-crypto.c -o build/temp.linux-x86_64-cpython-311/sqlite/sqlean-crypto.o -O1

gcc -fPIC -Isqlite -I/usr/include -I/opt/hostedtoolcache/Python/3.11.4/x64/include/python3.11 -c sqlite/sqlean-define.c -o build/temp.linux-x86_64-cpython-311/sqlite/sqlean-define.o -O1

# ...

gcc -fPIC -Isqlite -I/usr/include -I/opt/hostedtoolcache/Python/3.11.4/x64/include/python3.11 -c sqlite/sqlean-vsv.c -o build/temp.linux-x86_64-cpython-311/sqlite/sqlean-vsv.o -O1

gcc -fPIC -Isqlite -I/usr/include -I/opt/hostedtoolcache/Python/3.11.4/x64/include/python3.11 -c sqlite/sqlite3.c -o build/temp.linux-x86_64-cpython-311/sqlite/sqlite3.o -O1

gcc -shared -Wl,--rpath=/opt/hostedtoolcache/Python/3.11.4/x64/lib -Wl,--rpath=/opt/hostedtoolcache/Python/3.11.4/x64/lib build/temp.linux-x86_64-cpython-311/sqlite/sqlean-crypto.o build/temp.linux-x86_64-cpython-311/sqlite/sqlean-define.o build/temp.linux-x86_64-cpython-311/sqlite/sqlean-fileio.o build/temp.linux-x86_64-cpython-311/sqlite/sqlean-fuzzy.o build/temp.linux-x86_64-cpython-311/sqlite/sqlean-ipaddr.o build/temp.linux-x86_64-cpython-311/sqlite/sqlean-regexp.o build/temp.linux-x86_64-cpython-311/sqlite/sqlean-stats.o build/temp.linux-x86_64-cpython-311/sqlite/sqlean-text.o build/temp.linux-x86_64-cpython-311/sqlite/sqlean-unicode.o build/temp.linux-x86_64-cpython-311/sqlite/sqlean-uuid.o build/temp.linux-x86_64-cpython-311/sqlite/sqlean-vsv.o build/temp.linux-x86_64-cpython-311/sqlite/sqlite3.o -L/usr/lib -L/opt/hostedtoolcache/Python/3.11.4/x64/lib -o build/lib.linux-x86_64-cpython-311/sqlean/_sqlite3.cpython-311-x86_64-linux-gnu.so -lm

As for the dangling ifdef: thanks, I'll check it.

riyaz-ali commented 1 year ago

The reason I went with amalgamation is to make the library go get-able.

With Python, I guess since there's a packaging step before distribution, it's feasible to download, compile and link the sources and ship the resulting package, so no pre-processing or amalgamation is required.

With Go, since there's no support for binary distribution of packages, all the sources need to be present at compile time.

Thanks for your input 👍 I'll try to explore solutions that don't involve an amalgamated source 🙌

nalgeon commented 1 year ago

Sorry I wasn't much help. I tried to make the symbols globally unique earlier, but the code got so ugly that I dropped the idea. I may revisit it one day.

If there is anything else I can do to make things easier for you, please let me know.

I see that sqlean.go currently includes 10 out of 12 extensions, which is pretty awesome. Thanks for implementing it!

riyaz-ali commented 1 year ago

No worries!

I found an alternative: downloading the source files from sqlean, committing them under the pkg/ directory in sqlean.go, and adding some boilerplate Go code to connect everything together. This requires ongoing maintenance to keep the sources in sync (this isn't in the repo yet!)

I'd still prefer managing a single amalgamated file over a bunch of scattered C files. Besides, I think an amalgamation would be helpful in other languages and/or stacks as well (like Rust or Android maybe 🤔).

Besides that, all the effort you put into #84 makes building an amalgamation file really simple and straightforward. I understand a conscious effort like this would be required for every extension you add to the main set.

Would love to hear your thoughts on this!

nalgeon commented 1 year ago

I completely agree that a single amalgamated file would be great. I just haven't figured out how to implement it without making the code a mess. I should try again :)

Thanks for the utf8_lookup fix! I just released 0.21.7 with it.