Open alamb opened 9 months ago
A good first step might be to simply make parquet optional in DataFusion -- aka https://github.com/apache/arrow-datafusion/issues/7653
That would allow us to validate and explore what dependencies are blocking wasm compilation
https://github.com/apache/arrow-rs/pull/4884 makes parquet compile for WASM
Also, https://github.com/apache/arrow-datafusion/pull/7745 makes parquet support optional in DataFusion
I managed to compile for wasm, but I encountered a couple of problems:
`SessionContext::new`
`std::time::Instant` - this won't compile and probably needs to be hidden behind cfg
https://github.com/apache/arrow-datafusion/compare/main...fudini:arrow-datafusion:wasm
After these changes I was able to create `SessionContext` and run a simple query
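For anyone wondering what "hidden behind cfg" could look like, here is a minimal sketch. It is not taken from the linked branch: `time_it` is a hypothetical helper, and it assumes the timing can simply be skipped on `wasm32` targets where `std::time::Instant` is not usable.

```rust
use std::time::Duration;

/// Hypothetical helper (not a DataFusion API): runs `f` and reports how long
/// it took on targets where `std::time::Instant` is usable.
#[cfg(not(target_arch = "wasm32"))]
pub fn time_it<T>(f: impl FnOnce() -> T) -> (T, Option<Duration>) {
    let start = std::time::Instant::now();
    let out = f();
    (out, Some(start.elapsed()))
}

/// On wasm32 the `Instant` call is compiled out entirely; the closure still
/// runs, but no timing is reported (an assumption made for this sketch).
#[cfg(target_arch = "wasm32")]
pub fn time_it<T>(f: impl FnOnce() -> T) -> (T, Option<Duration>) {
    (f(), None)
}
```

Callers get the same signature on every target and only need to handle the `None` case, so the `cfg` stays in one place instead of spreading through the code.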
Is your feature request related to a problem or challenge?
As shown by @jonmmease in https://github.com/apache/arrow-datafusion/pull/7633, some of the datafusion crates can be compiled to WASM:
The difficulty with getting the remaining DataFusion crates compiled to WASM is that they have non-optional dependencies on the `parquet` crate with its default features enabled. Several of the default parquet crate features require native dependencies that are not compatible with WASM, in particular the `lz4` and `zstd` features. If we can arrange our feature flags to make it possible to depend on parquet with these features disabled, then it should be possible to compile the core `datafusion` crate to WASM as well.
Describe the solution you'd like
One approach might be to disable the relevant parquet features that could not be compiled as described below.
From https://github.com/apache/arrow-datafusion/pull/7633/files#r1335824930 between @jonmmease and @tustvold
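As a rough illustration of the feature-flag idea, code that touches parquet could be gated on a Cargo feature so that it disappears from builds where the feature is off. This is only a sketch: the `parquet` feature name, the `parquet_support` module, and `supported_formats` are hypothetical and are not DataFusion's actual layout.

```rust
/// Only compiled when the (assumed) `parquet` Cargo feature is enabled, so a
/// build without it never references the parquet crate or its native deps.
#[cfg(feature = "parquet")]
mod parquet_support {
    pub fn format_name() -> &'static str {
        "parquet"
    }
}

/// Hypothetical helper listing the formats available in this build.
pub fn supported_formats() -> Vec<&'static str> {
    #[allow(unused_mut)] // `formats` is only mutated when the feature is on
    let mut formats = vec!["csv", "json"];
    #[cfg(feature = "parquet")]
    formats.push(parquet_support::format_name());
    formats
}
```

With the feature disabled, the gated module is not compiled at all, which is what would let a WASM build skip the `lz4` and `zstd` code paths.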
Describe alternatives you've considered
No response
Additional context
No response