duckdb / duckdb-wasm

WebAssembly version of DuckDB
https://shell.duckdb.org
MIT License
1.05k stars 115 forks source link

Extension Loading - tracking issue #1202

Open carlopi opened 1 year ago

carlopi commented 1 year ago

We have put up a very experimental deployment at https://shellwip.duckdb.org/ that include experimental loading of extensions.

The idea is https://shell.duckdb.org/ is the stable and recommended deployment, but shellwip can be used over time to gather feedback and iterate on new features.

WARNING: Do not rely on any shellwip behaviour, and use it at your own risk!

Extensions in DuckDB

Extensions are a powerful addition to DuckDB, there are a few in-tree (as in, they are opt-in but code lives in the main duckdb/duckdb repo), then over time DuckDB Labs rolled out a few others (eg sqlite_scanner), and now an extension-template is out to facilitate anyone to build their own custom extension.

Extensions in DuckDB-Wasm

DuckDB-Wasm is currenlty deliver with a predefined set of in-tree extensions loaded by default, but we want to add runtime loading of generic extensions.

Currently live in shellwip there is a custom build of duckdb-wasm that allows loading of extensions, both via name or providing an URL.

duckdb> LOAD icu;
duckdb> LOAD "https://raw.githubusercontent.com/duckdb/duckdb-wasm-wip/main/static/assets/eh/extensions/json.extension.wasm";

Here (https://github.com/duckdb/duckdb-wasm-wip#readme) there are some additional explanations, missing features, some instructions and table with updated status for the various extensions (does it compile? does it work?).

Proper testing has NOT been performed (yet), and based on the feedback we might end up changing completely some aspects, but we hope that putting this out is a first big step in having down the line full support for extensions in duckdb-wasm.

Feedback wanted!

It would be wonderful if you could have a look, try some things, and give feedback (here, in a separate issue or Discord).

Some open issues where feedback are especially relevant are:

How this should be deployed by duckdb-wasm embedders
  • should settings be system wide, like any file in './some/relative/path/extensions' could be loaded
  • should there be the possibility of more fine grained per-session limitations
  • ...
  • JavaScript API
  • batch loading of extensions
  • registering callbacks to be provided on loading?
  • registering custom JS extension loader? (say a function that given a name, either returns the Wasm-backed array buffer OR null via arbitrary logic)
  • ...
  • Compatibility across toolchains [Rust?] Currently Emscripten C/C++ toolchain is required for building extensions, and there are questions on how much flexibility there is across different compilers/tooling versions. Rust Wasm extensions support would be great, there are some ideas on how to make this work via C-wrapper, but work is needed.

    Thanks! -> https://shellwip.duckdb.org/

    mattijn commented 1 year ago

    Short feedback regarding sqlite_scanner and reading: not working yet.

    We tried it with some in-house sqlite files, but we can reproduce it with the chinook.sqlite from here: https://www.sqlitetutorial.net/sqlite-sample-database/.

    In duckdb (this works):

    INSTALL sqlite;
    LOAD sqlite;
    CALL sqlite_attach('chinook.sqlite');
    PRAGMA show_tables;
    SELECT count(*) FROM Albums;

    MicrosoftTeams-image (17)

    in the shellwip (not yet working):

    LOAD sqlite_scanner;
    .files add
    .files list
    CALL sqlite_attach('chinook.sqlite');
    PRAGMA show_tables;
    SELECT count(*) FROM Albums;

    MicrosoftTeams-image (18)

    carlopi commented 1 year ago

    Thanks! Testing has been basically non-existent (a big todo is reusing CLI tests), since work has been mostly put towards infrastructure, glass 1/10th full is that it loads / register the function, will have to dig further, this is a good testcase.

    I have update the support table here: https://github.com/duckdb/duckdb-wasm-wip#support-table, and opened a separate issue dedicated to sqlite extension.

    mattijn commented 1 year ago

    Understandable. A pleasure to be a guinea pig👍

    holdenmatt commented 1 year ago

    I don't know if this is helpful at all, but you might look at how Pyodide enables package loading in Wasm: https://pyodide.org/en/stable/usage/loading-packages.html

    Perhaps some similarities to what you're trying to enable in duckdb-wasm. I don't know how this works under the hood, but from an end-user perspective, it seems to work well.

    (excited about this!)

    carlopi commented 1 year ago

    Thanks! This project has indeed quite some similarities, and it's nice to see they are using the same underlying mechanism for dynamic linking.

    As per the JavaScript API, this might end up 'stealing':

    await pyodide.loadPackage(["cycler", "pytz", "https://someURL/package"]);

    to load a given set of packages/extensions.

    pmm-motif commented 1 year ago

    I was playing with shellwip and JSON extension. I did limited tests on files of varying sizes but it was behaving great! I was able to load files which were failing in DuckDB WASM, e.g. an input in JSON Lines format:

    image

    Having an ability to use DuckDB WASM with JSON extension would be greatly appreciated. Are there any additional things to worry about when using experimental version in a controlled environment (other than it being tagged as experimental and the fact that using extensions with WASM is not very well tested yet)?

    carlopi commented 1 year ago

    Thanks @pmm-motif! JSON I think it's also a case where uniform support with duckdb's CLI is a great advantage.

    An update is that I am working toward distributing the wip packages so that it can be easily deployed and experimented on the wild.

    One details that I was missing is: where to look extensions up? And I believe the right answer here is allowing embedders to provide a JavaScript function like:

    function getExtensionPath(shortName, bundle) {
         return "./my/extension/folder/" + bundle + "/" + shortName + "extension.wasm";
    }

    that allows to customize extension-loading logic to whatever folder configuration you might have.

    At that point whatever seems a sensible default could be easily overridden.

    pmm-motif commented 1 year ago

    Hi @carlopi, just wanted to check what's the status of that. Tried to find some instructions how to use duckdb-wasm-wip (other than the shellwip project) but haven't found anything yet

    carstonhernke commented 1 year ago

    Very excited about this - great work! I have a bit of feedback after using the wip shell https://shellwip.duckdb.org. I was able to install & load the icu extension successfully, but the parquet extension threw an IO Error (see below). This error with the parquet extension then prevented me from loading the spatial extension directly after. (However, I just realized I can load the spatial extension on its own and this works fine)

    duckdb> install icu;
    duckdb> load icu;
    duckdb> install parquet;
    duckdb> load parquet;
    IO Error: Extension "https://shellwip.duckdb.org/static/assets/extensions/parquet.extension.wasm" could not be loaded: Could not load dynamic lib: parquet
    Error: bad export type for `_ZN6duckdb28SimpleNamedParameterFunction8ToStringEv`: undefined
    duckdb> install spatial;
    duckdb> load spatial;
    IO Error: Extension "https://shellwip.duckdb.org/static/assets/extensions/spatial.extension.wasm" could not be loaded: Could not load dynamic lib: spatial
    Error: bad export type for `_ZN6duckdb28SimpleNamedParameterFunction8ToStringEv`: undefined
    rickiesmooth commented 11 months ago

    is it possible to use this custom build of duckdb-wasm outside of https://shellwip.duckdb.org/?

    saikatmitra91 commented 3 months ago

    I just installed duckdb-wasm in my next js project and tried importing JSON.

    I keep getting Uncaught Error: Table type not specified

    I saw it's not loading JSON extension wasm file at all. when I use a similar query on shell.duckdb.org, it works ! and JSON extension file is dowloaded too.

    Is this still an open issue or am my setup is missing something ?

    chriszrc commented 1 month ago

    Thank you for investing time making extensions work in wasm, this is so so useful. This article indicates that the spatial extension should work with wasm:

    https://duckdb.org/2023/12/18/duckdb-extensions-in-wasm.html

    But trying this on the web shell gives these results:

    Screenshot 2024-05-30 at 8 21 32 AM

    Interestingly, running this locally actually works just fine (yay!), but it would be nice to demo this with the web shell. Also the docs here:

    https://duckdb.org/docs/api/wasm/extensions#list-of-officially-available-extensions

    don't list the spatial extension yet either?

    tobilg commented 1 month ago

    @chriszrc IMHO it works to load the spatial extension, at least with SQL Workbench: https://sql-workbench.com/#queries=v0,INSTALL-spatial~,LOAD-spatial~,SELECT-*-FROM-st_drivers()~