pmp-p / pglite-16.x

16.x CI build test
https://pmp-p.github.io/pglite-16.x/
Apache License 2.0

extensions/encodings packaging #1

Open pmp-p opened 2 weeks ago

pmp-p commented 2 weeks ago

Normal extensions have 3 kinds of files. They are read-only, but must sit in the folders PG expects, which default to subdirectories of the --prefix used at compile time:

./share/postgresql/extension/[EXT].control
./lib/postgresql/[EXT].so
./share/postgresql/extension/[EXT]--[VERSION].sql

and one optional kind: ./include/postgresql/server/extension/[EXT]/*.h
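The layout above can be sketched as a small path helper. This is only an illustration: the function name `extension_files`, the prefix `/tmp/pglite`, and the extension name and version used in the usage lines are hypothetical, not something the build actually defines.

```python
from pathlib import PurePosixPath


def extension_files(prefix, ext, version, with_headers=False):
    """Return the locations PG expects for one extension under `prefix`.

    Hypothetical helper: it only mirrors the layout listed in this issue.
    """
    root = PurePosixPath(prefix)
    files = {
        # Extension metadata read by CREATE EXTENSION.
        "control": root / "share/postgresql/extension" / f"{ext}.control",
        # The shared library that has to be compiled by the wasm host.
        "library": root / "lib/postgresql" / f"{ext}.so",
        # The install script for one specific version.
        "sql": root / "share/postgresql/extension" / f"{ext}--{version}.sql",
    }
    if with_headers:
        # Optional: directory holding the extension's *.h files.
        files["headers"] = root / "include/postgresql/server/extension" / ext
    return files


# Usage with made-up values.
for kind, path in extension_files("/tmp/pglite", "vector", "0.7.0").items():
    print(kind, path)
```

Keeping the three mandatory kinds and the optional headers in one mapping makes it easy for a packager to check that nothing is missing before building an archive.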

The .so must be compiled asynchronously, which means yielding one execution frame back to the JavaScript host: the compilation goes through the Emscripten JS API but is initiated by a C call to emscripten_run_preload_plugins. Various outcomes can be expected with a custom FS.

Alternative: see how Pyodide handles wasm compilation on the TS/JS side.

samwillis commented 2 weeks ago

Hey, my thinking here was that we would create a .data file for each extension. It would include those files (and any others) and would be layered over the base FS. The user would specify which extensions they want when initiating PGlite; they can then be downloaded async and added to the FS before PG starts. Then PG can load the files synchronously.

pmp-p commented 2 weeks ago

Follow-up from Discord: .data relies on third-party Python code that can only use LZ4 decompression, which adds weight to the runtime for a poor compression ratio.

I would go for a small Python[1] extension builder that compiles and also packages the files. Stock Python has a wide range of compressors available that beat LZ4 hard. The decompressor could be fetched only when extensions are needed, since enabling extensions is an async process.
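A minimal sketch of the packaging half of such a builder, using only the standard library. It assumes an xz-compressed tar archive as the bundle format, which is one possible choice and not something decided in this thread; the function names `pack_extension` and `unpack_extension` and the sample file contents are hypothetical. The compile step is left out entirely.

```python
import io
import tarfile


def pack_extension(files):
    """Pack {archive_path: bytes} into one xz-compressed tar blob.

    Assumption: xz is used here only because it ships with stock Python.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:xz") as tar:
        for name in sorted(files):  # sorted for a stable member order
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unpack_extension(blob):
    """Inverse of pack_extension, used here to check the round trip."""
    out = {}
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:xz") as tar:
        for member in tar.getmembers():
            if member.isfile():
                out[member.name] = tar.extractfile(member).read()
    return out


# Usage with made-up file contents.
sample = {
    "share/postgresql/extension/demo.control": b"comment = 'demo'\n",
    "share/postgresql/extension/demo--1.0.sql": b"SELECT 1;\n" * 1000,
}
blob = pack_extension(sample)
print(len(blob), "bytes packed")
```

On the runtime side the matching decompressor would still have to be provided, which is where fetching it only when extensions are requested comes in.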

[1] We have CPython >= 3.8 and Node >= 18 when emsdk is involved. emsdk provides Node, but not CPython. Side note: Node can run WASI CPython 3.13+ single-threaded.

> The user would specify which extensions they want when initiating PGlite; they can then be downloaded async and added to the FS before PG starts. Then PG can load the files synchronously.

That is actually not possible: because of the way dlfcn is implemented in Emscripten, there is no RTLD_LAZY, only RTLD_NOW + RTLD_GLOBAL, so PG main must already be running before extensions are compiled.