vega / vegafusion

Serverside scaling for Vega and Altair visualizations
https://vegafusion.io
BSD 3-Clause "New" or "Revised" License

WASM build using DuckDB WASM engine #394

Open jonmmease opened 9 months ago

jonmmease commented 9 months ago

In https://github.com/apache/arrow-datafusion/pull/7633 I worked through which datafusion-* crates are currently compatible with wasm-pack. These are:

The main datafusion crate (called datafusion-core in the repo) is not yet compatible with WASM. This should eventually be doable, but there are issues to work out regarding dependencies that have native build requirements.

The current vegafusion-wasm crate depends on vegafusion-core, which in turn depends on datafusion-common. The main reason to depend on this crate currently is for the ScalarValue support.

I realized that the vegafusion-datafusion-udf, vegafusion-dataframe, vegafusion-sql (without the datafusion-conn feature enabled), and vegafusion-runtime crates all depend only on the datafusion crates that are compatible with WASM. The only vegafusion dependency on the core datafusion crate is vegafusion-sql with the datafusion-conn feature flag enabled. This means that it should be possible to compile vegafusion-runtime to WASM and use an alternative WASM-compatible connection implementation.
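To illustrate the feature-flag split described above, here is a minimal Rust sketch of gating a native-only backend behind a Cargo feature so that the rest of the crate still compiles for WASM. Only the feature name `datafusion-conn` comes from this issue. The `Connection` trait, the `native` module, `DataFusionConnection`, `WasmFriendlyConnection`, and `available_backends` are hypothetical names used for illustration and are not VegaFusion's actual API.

```rust
/// Hypothetical stand-in for a connection abstraction (not VegaFusion's real trait).
pub trait Connection {
    /// Short identifier for the backend.
    fn id(&self) -> &'static str;
}

/// Native-only backend: compiled only when the `datafusion-conn` feature is enabled,
/// so its native dependencies never reach a WASM build that leaves the feature off.
#[cfg(feature = "datafusion-conn")]
pub mod native {
    use super::Connection;

    /// Hypothetical connection backed by the core datafusion crate.
    pub struct DataFusionConnection;

    impl Connection for DataFusionConnection {
        fn id(&self) -> &'static str {
            "datafusion"
        }
    }
}

/// Hypothetical backend with no native build requirements, so it is always compiled.
pub struct WasmFriendlyConnection;

impl Connection for WasmFriendlyConnection {
    fn id(&self) -> &'static str {
        "wasm-friendly"
    }
}

/// Lists the backends that are present in this particular build.
#[allow(unused_mut)]
pub fn available_backends() -> Vec<Box<dyn Connection>> {
    let mut backends: Vec<Box<dyn Connection>> = vec![Box::new(WasmFriendlyConnection)];
    // Only pushed when the native feature is switched on at compile time.
    #[cfg(feature = "datafusion-conn")]
    backends.push(Box::new(native::DataFusionConnection));
    backends
}
```

With the feature disabled, the `native` module is excluded from compilation entirely, which is the property that would let the remaining code target WASM.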

One obvious candidate for a WASM-compatible connection would be a connection to DuckDB WASM. This would be very similar to how the vegafusion Python API includes support for implementing connections in Python.
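As a rough sketch of what such a pluggable connection could look like, the Rust code below forwards SQL to a caller-supplied executor, in the same spirit as connections implemented in Python. Everything here is an assumption made for illustration: the `Connection` trait, `CallbackConnection`, the `Rows` type, and `count_rows` are hypothetical and do not reflect VegaFusion's or DuckDB WASM's real APIs. A real implementation would most likely be async and exchange Arrow data rather than strings.

```rust
/// Simplified query result: rows of string cells.
/// (Assumption: a real implementation would likely return Arrow record batches.)
pub type Rows = Vec<Vec<String>>;

/// Hypothetical connection abstraction that runtime code depends on.
pub trait Connection {
    fn fetch_query(&self, sql: &str) -> Result<Rows, String>;
}

/// Connection that forwards SQL to a caller-supplied executor.
/// In a WASM build, the executor could wrap a call into DuckDB WASM.
pub struct CallbackConnection<F>
where
    F: Fn(&str) -> Result<Rows, String>,
{
    execute: F,
}

impl<F> CallbackConnection<F>
where
    F: Fn(&str) -> Result<Rows, String>,
{
    pub fn new(execute: F) -> Self {
        Self { execute }
    }
}

impl<F> Connection for CallbackConnection<F>
where
    F: Fn(&str) -> Result<Rows, String>,
{
    fn fetch_query(&self, sql: &str) -> Result<Rows, String> {
        // Delegate the query to the external engine and pass its result through.
        (self.execute)(sql)
    }
}

/// Runtime-side helper that only knows about the trait, not the engine behind it.
pub fn count_rows(conn: &dyn Connection, table: &str) -> Result<usize, String> {
    let sql = format!("SELECT * FROM {}", table);
    Ok(conn.fetch_query(&sql)?.len())
}
```

Because `count_rows` depends only on the trait, the same runtime code could run against a native engine or a browser-hosted one without modification.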