Open kylebarron opened 2 months ago
I created a very very minimalistic node app to test geoarrow-rs
:
https://github.com/Bonkles/parquet-wasm-test
NOTE: The latest released version of geoarrow-rs is not going to generate the header issue- you'll need to rebuild it from the latest commit and adjust this app's pointer via npm link or npm install {pathspec}/geoarrow-rs/js/pkg
.
I'm currently using commit 53ce09d64f76b2609961dc15604d62ff1ed6f921
from the geoarrow-rs repo, which is the latest as of right now.
With that, this example app emits the following errror stacktrace:
/Users/clarkben/working/OvertureMaps/geoarrow-rs/js/pkg/node/index.js:13106
const ret = new Error(getStringFromWasm0(arg0, arg1));
^
Error: Generic HTTP error: Content-Length Header missing from response
at module.exports.__wbindgen_error_new (/Users/clarkben/working/OvertureMaps/geoarrow-rs/js/pkg/node/index.js:13106:17)
at wasm_bindgen::JsError::new::hcd48a6f2c532488f (wasm://wasm/0c983d96:wasm-function[53955]:0x247abbe)
at <wasm_bindgen::JsError as core::convert::From<E>>::from::h33a64558a0f1db53 (wasm://wasm/0c983d96:wasm-function[43885]:0x233dd45)
at <core::result::Result<T,F> as core::ops::try_trait::FromResidual<core::result::Result<core::convert::Infallible,E>>>::from_residual::h0d176b8409bfaffd (wasm://wasm/0c983d96:wasm-function[29027]:0x20492ca)
at geoarrow_wasm::io::parquet::async::ParquetDataset::new::{{closure}}::h0d2f272447af6032 (wasm://wasm/0c983d96:wasm-function[859]:0xb78d05)
at geoarrow_wasm::io::parquet::async::ParquetDataset::new::{{closure}}::_::__wasm_bindgen_generated_ParquetDataset_new::{{closure}}::ha6cf0b999ecc17aa (wasm://wasm/0c983d96:wasm-function[6442]:0x15320da)
at wasm_bindgen_futures::future_to_promise::{{closure}}::{{closure}}::h64f174e50af6d802 (wasm://wasm/0c983d96:wasm-function[6307]:0x150f0c3)
at wasm_bindgen_futures::task::singlethread::Task::run::ha50ba02357fb29a7 (wasm://wasm/0c983d96:wasm-function[15290]:0x1b48fb1)
at wasm_bindgen_futures::queue::QueueState::run_all::h64c118d7475d5617 (wasm://wasm/0c983d96:wasm-function[10801]:0x18a4861)
at wasm_bindgen_futures::queue::Queue::new::{{closure}}::hab071bf127ce737f (wasm://wasm/0c983d96:wasm-function[49097]:0x23ef35c)
This is with Node 20.12.0, but we observed the same type of error when running a very similar test from a browser environment.
Ah - running this in the browser, with minor CJS -> ESM changes, the error's much, much clearer, the HEAD request gets a 404. 404 responses don't include Content-Length, which this is because the url's being reconstructed as <bunch-of-preamble>/type=building/
+ /part...zstd.parquet
- the double /
.
The odd thing is, there's references on my end to surfacing StatusCode::NotFound responses as errors, it's quite possible reqwest's retry mechanism is swallowing those errors.
I'll get the unsurfaced errors fixed, then move onto fixing the url construction (which should be fusing the path separator, a bit like Path('foo/')
/ /bar
in python resolves to Path('foo/bar')
).
Done, I've got a few things relating to suffix requests that I'll leave for a bit (not just the two corrections), but an 0.0.5 version bump would be fine at this point.
Awesome! So it looks like this should be fixed?
In https://github.com/geoarrow/geoarrow-rs/pull/701 I bumped to 0.0.5. @Bonkles is out on vacation this week; I'll check that everything's fixed when he's back! Thanks! 🙏
I was debugging some overture GeoParquet data fetching with @Bonkles and we were running into some exceptions saying
This seems to be coming from here https://github.com/H-Plus-Time/object-store-wasm/blob/8d825bbd6d3a47041cf2da80ce91b52eca7972ab/src/http/mod.rs#L52-L53
and in turn here https://github.com/H-Plus-Time/object-store-wasm/blob/8d825bbd6d3a47041cf2da80ce91b52eca7972ab/src/http/mod.rs#L95-L97
and in turn here https://github.com/H-Plus-Time/object-store-wasm/blob/8d825bbd6d3a47041cf2da80ce91b52eca7972ab/src/http/mod.rs#L227-L233
This is coming from this line of code in geoarrow-rs https://github.com/geoarrow/geoarrow-rs/blob/2a7f150fe7f1502987f191370730770883af33db/js/src/io/parquet/async.rs#L178-L182
So it seems that HEAD requests are somehow missing some headers?