Closed H-Plus-Time closed 2 months ago
The main question I had was around your experiments with ehttp (as a replacement for reqwest) - did you find it was worthwhile?
I did quite like the lower impedance model (that is, the very fetch-like behaviour), though the lack of header enums was a slight annoyance (the primary reason I've left the reqwest dep in on the object-store-wasm - I'm fairly certain the compiler should just leave in the enums, string conversion logic and strip everything else).
I had hoped that ehttp would be the solution to the streaming problem (having to fake it, that is), but the mpsc channel + spawn_local trick actually ended up being the solve. It looks like it shouldn't be a problem to do the exact same thing with reqwest (I suspect I just didn't see it at the time), so it's really just a question of ergonomics (for me, and any other maintainers), how much appetite there is for porting the changes over to the OG object-store.
Ah, also, given it does change the deps (hopefully not significantly), I'll get the delta generation working again in a separate PR (before this one's ready).
I ended up doing it as a straight swap, leaving the existing HttpFileReader code for posterity/easy switchback.
🙏 I'm happy to have both in the the codebase for a while as we figure out which design makes more sense.
The main question I had was around your experiments with ehttp (as a replacement for reqwest) - did you find it was worthwhile?
I think I just got to a point where I didn't fully understand Rust-async-streams or something like that, and there was existing code using reqwest that I could look at and figure out how to make work. So I don't think I got to the point where I formed a concrete opinion over whether to use ehttp or reqwest.
I'll get the delta generation working again in a separate PR (before this one's ready).
That would be awesome 🙏
This is great! I actually sat down last night trying to use parquet-wasm to make my own wasm object-store replacement for Datafusion... It's great to see someone else is already on it :partying_face:
Asset | Uncompressed Size | Compressed Size |
---|---|---|
async_full/parquet_wasm_bg.wasm | 5.45MB $\color{red}\textbf{+173KB +3\%}$ | 1.28MB $\color{red}\textbf{+63.2KB +5\%}$ |
slim/parquet_wasm_bg.wasm | 3.46MB $\color{red}\textbf{+3.92KB +0\%}$ | 544KB $\color{red}\textbf{+1.03KB +0\%}$ |
sync/parquet_wasm_bg.wasm | 4.74MB $\color{green}\textbf{-328B -0\%}$ | 1.04MB $\color{red}\textbf{+84B +0\%}$ |
How are you doing on this? Is this ready for a review?
Yeah, I've published the MVP version of the crate (didn't feel comfortable contributing a git-pinned dep), I think it's good to go.
Can you merge in main? Then I think we should be good to go
Amazing, thank you!
NB: Starting this a little earlier for design feedback on the object-store-wasm crate itself (primarily around ehttp vs reqwest).
The usage is very straightforward (given all the plumbing is present for ParquetObjectReader), I ended up doing it as a straight swap, leaving the existing HttpFileReader code for posterity/easy switchback.