H-Plus-Time / object-store-wasm

Apache License 2.0
2 stars 1 forks source link

Content-Length Header missing from response #42

Open kylebarron opened 1 month ago

kylebarron commented 1 month ago

I was debugging some overture GeoParquet data fetching with @Bonkles and we were running into some exceptions saying

Content-Length Header missing from response

This seems to be coming from here https://github.com/H-Plus-Time/object-store-wasm/blob/8d825bbd6d3a47041cf2da80ce91b52eca7972ab/src/http/mod.rs#L52-L53

and in turn here https://github.com/H-Plus-Time/object-store-wasm/blob/8d825bbd6d3a47041cf2da80ce91b52eca7972ab/src/http/mod.rs#L95-L97

and in turn here https://github.com/H-Plus-Time/object-store-wasm/blob/8d825bbd6d3a47041cf2da80ce91b52eca7972ab/src/http/mod.rs#L227-L233

This is coming from this line of code in geoarrow-rs https://github.com/geoarrow/geoarrow-rs/blob/2a7f150fe7f1502987f191370730770883af33db/js/src/io/parquet/async.rs#L178-L182

So it seems that HEAD requests are somehow missing some headers?

Bonkles commented 1 month ago

I created a very very minimalistic node app to test geoarrow-rs:

https://github.com/Bonkles/parquet-wasm-test

NOTE: The latest released version of geoarrow-rs is not going to generate the header issue- you'll need to rebuild it from the latest commit and adjust this app's pointer via npm link or npm install {pathspec}/geoarrow-rs/js/pkg.

I'm currently using commit 53ce09d64f76b2609961dc15604d62ff1ed6f921 from the geoarrow-rs repo, which is the latest as of right now.

With that, this example app emits the following errror stacktrace:

/Users/clarkben/working/OvertureMaps/geoarrow-rs/js/pkg/node/index.js:13106
    const ret = new Error(getStringFromWasm0(arg0, arg1));
                ^

Error: Generic HTTP error: Content-Length Header missing from response
    at module.exports.__wbindgen_error_new (/Users/clarkben/working/OvertureMaps/geoarrow-rs/js/pkg/node/index.js:13106:17)
    at wasm_bindgen::JsError::new::hcd48a6f2c532488f (wasm://wasm/0c983d96:wasm-function[53955]:0x247abbe)
    at <wasm_bindgen::JsError as core::convert::From<E>>::from::h33a64558a0f1db53 (wasm://wasm/0c983d96:wasm-function[43885]:0x233dd45)
    at <core::result::Result<T,F> as core::ops::try_trait::FromResidual<core::result::Result<core::convert::Infallible,E>>>::from_residual::h0d176b8409bfaffd (wasm://wasm/0c983d96:wasm-function[29027]:0x20492ca)
    at geoarrow_wasm::io::parquet::async::ParquetDataset::new::{{closure}}::h0d2f272447af6032 (wasm://wasm/0c983d96:wasm-function[859]:0xb78d05)
    at geoarrow_wasm::io::parquet::async::ParquetDataset::new::{{closure}}::_::__wasm_bindgen_generated_ParquetDataset_new::{{closure}}::ha6cf0b999ecc17aa (wasm://wasm/0c983d96:wasm-function[6442]:0x15320da)
    at wasm_bindgen_futures::future_to_promise::{{closure}}::{{closure}}::h64f174e50af6d802 (wasm://wasm/0c983d96:wasm-function[6307]:0x150f0c3)
    at wasm_bindgen_futures::task::singlethread::Task::run::ha50ba02357fb29a7 (wasm://wasm/0c983d96:wasm-function[15290]:0x1b48fb1)
    at wasm_bindgen_futures::queue::QueueState::run_all::h64c118d7475d5617 (wasm://wasm/0c983d96:wasm-function[10801]:0x18a4861)
    at wasm_bindgen_futures::queue::Queue::new::{{closure}}::hab071bf127ce737f (wasm://wasm/0c983d96:wasm-function[49097]:0x23ef35c)
This is with Node 20.12.0, but we observed the same type of error when running a very similar test from a browser environment. 
H-Plus-Time commented 1 month ago

Ah - running this in the browser, with minor CJS -> ESM changes, the error's much, much clearer, the HEAD request gets a 404. 404 responses don't include Content-Length, which this is because the url's being reconstructed as <bunch-of-preamble>/type=building/ + /part...zstd.parquet - the double /.

H-Plus-Time commented 1 month ago

The odd thing is, there's references on my end to surfacing StatusCode::NotFound responses as errors, it's quite possible reqwest's retry mechanism is swallowing those errors.

I'll get the unsurfaced errors fixed, then move onto fixing the url construction (which should be fusing the path separator, a bit like Path('foo/') / /bar in python resolves to Path('foo/bar')).

H-Plus-Time commented 1 month ago

Done, I've got a few things relating to suffix requests that I'll leave for a bit (not just the two corrections), but an 0.0.5 version bump would be fine at this point.

kylebarron commented 1 month ago

Awesome! So it looks like this should be fixed?

kylebarron commented 1 month ago

In https://github.com/geoarrow/geoarrow-rs/pull/701 I bumped to 0.0.5. @Bonkles is out on vacation this week; I'll check that everything's fixed when he's back! Thanks! 🙏