Open kylebarron opened 1 year ago
Yep, this is a good way to do things, though I'd add a slight proviso to the drawbacks:
I'd probably also do geoarrow-wasm-slim and geoarrow-wasm-full in the one npm package and manage stuff via exports (~40MB + 2x geoarrow-wasm's size - I reckon all the packages that produce {node,bundler,esm,esm2} bundles can be slimmed down by 25-50% with one straightforward and one less straightforward tweak).
- It isn't possible unless using wrapper structs
What I was hoping was that I could re-export all the existing struct's methods, and just add one new method. That doesn't really seem possible without manually wrapping the wrapped-struct's methods one by one?
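For illustration, the manual wrapping described above could look roughly like the sketch below. `Table`, `GeoTable`, and every method name here are hypothetical stand-ins, not the actual arrow-wasm or geoarrow-wasm API, and the `#[wasm_bindgen]` attributes are left out so the snippet compiles on its own.

```rust
// Hypothetical stand-in for a struct exported by a base crate such as arrow-wasm.
// Not the real arrow-wasm type; #[wasm_bindgen] is omitted on purpose.
pub struct Table {
    columns: Vec<String>,
    num_rows: usize,
}

impl Table {
    pub fn new(columns: Vec<String>, num_rows: usize) -> Self {
        Table { columns, num_rows }
    }

    pub fn num_rows(&self) -> usize {
        self.num_rows
    }

    pub fn num_columns(&self) -> usize {
        self.columns.len()
    }

    pub fn column_names(&self) -> Vec<String> {
        self.columns.clone()
    }
}

// Hypothetical wrapper struct living in a downstream crate.
pub struct GeoTable(Table);

impl GeoTable {
    pub fn new(inner: Table) -> Self {
        GeoTable(inner)
    }

    // Each existing method is forwarded by hand, one by one ...
    pub fn num_rows(&self) -> usize {
        self.0.num_rows()
    }

    pub fn num_columns(&self) -> usize {
        self.0.num_columns()
    }

    pub fn column_names(&self) -> Vec<String> {
        self.0.column_names()
    }

    // ... before the single new method can be added.
    // Assumption for this sketch: the geometry column is literally named "geometry".
    pub fn geometry_column(&self) -> Option<String> {
        self.0
            .column_names()
            .into_iter()
            .find(|name| name == "geometry")
    }
}
```

The forwarding methods are pure boilerplate, which is the annoyance in question: every method added to the wrapped struct needs a matching one-liner on the wrapper.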
- Provided there isn't some awful quirk in depending crates, it should at least be relatively boilerplate. The difficulty would probably come from complicated conditional flag combinations (maybe once you're at the level of geopolars or js-polars, the impact of, say, individual compressions at the arrow-wasm/parquet-wasm level is too small to bother with flags other than 'all compressions').
Yeah I agree. It should be straightforward, just annoying.
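As a rough sketch of the flag question, a downstream crate could gate codecs on Cargo features with an umbrella "all compressions" feature on top. The feature names below (`snappy`, `gzip`, `zstd`, `all_compressions`) are assumptions made up for this example, not the flags parquet-wasm actually defines.

```rust
/// Lists the codecs compiled into this build.
/// All feature names here are hypothetical; a real crate would declare them
/// in its Cargo.toml and forward them to its dependencies.
#[allow(unexpected_cfgs)]
pub fn enabled_compressions() -> Vec<&'static str> {
    let mut codecs = Vec::new();
    // The umbrella feature switches every codec on at once.
    let all = cfg!(feature = "all_compressions");
    if all || cfg!(feature = "snappy") {
        codecs.push("snappy");
    }
    if all || cfg!(feature = "gzip") {
        codecs.push("gzip");
    }
    if all || cfg!(feature = "zstd") {
        codecs.push("zstd");
    }
    codecs
}

/// Returns an error message when a file needs a codec that was compiled out.
pub fn check_codec(name: &str) -> Result<(), String> {
    if enabled_compressions().iter().any(|codec| *codec == name) {
        Ok(())
    } else {
        Err(format!("compression '{}' was not enabled at build time", name))
    }
}
```

With only the umbrella feature exposed, a higher-level crate has one flag to forward instead of a combination per codec, at the cost of a larger bundle.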
I reckon all the packages that produce {node,bundler,esm,esm2} bundles can be slimmed down by 25-50% with one straightforward and one less straightforward tweak).
If you have packaging recommendations I'm all ears 🙂. esm2 was a "temporary" hack to get esm working in deck.gl I believe, or something like that, because the esm export used syntax only available in some specific environment.
Problem statement
The biggest hurdle with WebAssembly in the browser is that multiple Wasm modules can't share the same memory space. This means that having e.g. parquet-wasm and geoarrow-wasm as two separate NPM modules is annoying! You have to use parquet-wasm to load parquet into Arrow in Wasm... but then copy the data to JS, and then copy it into the next wasm module to do more processing with it! This is slow, memory intensive, and not user friendly.
Solution
In https://github.com/domoritz/arrow-wasm, Dominik's goal appeared to be to see if Arrow in rust/wasm would be faster than Arrow in JS. But since working with raw buffers is pretty fast in JS, it's not surprising that Wasm overhead would outweigh any other speedups.
I think the potential of arrow-wasm instead is in being a foundational library for other wasm-bindgen libraries.
So I see various potential libraries:
- geoparquet-wasm. Used by consumers who want to fetch geoparquet to geoarrow but also do some geospatial operations.
Other libraries for other formats might make sense to add in the future, like geoarrow-flatgeobuf, which uses rust to parse flatgeobuf into geoarrow. Etc.
Drawbacks
- arrow-wasm in a functional manner.
- geoarrow-wasm might have feature flags for each compression in parquet-wasm?
cc @H-Plus-Time