apache / arrow-rs

Official Rust implementation of Apache Arrow
https://arrow.apache.org/
Apache License 2.0
2.32k stars 682 forks source link

reqwest middleware support for object_store #5990

Open Conni2461 opened 2 days ago

Conni2461 commented 2 days ago

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

We could instrument all requests made with object with otlp trace propagation(https://www.w3.org/TR/trace-context/#trace-context-http-headers-format) and for that we would need to be able to inject headers on each request with a different headervalue.

Using https://docs.rs/object_store/latest/object_store/struct.ClientOptions.html#method.with_default_headers doesnt really solve this issue for us because that allows us to set static custom headers but for tracing the injected headervalues need to be dynamic (each request has its own parent trace).

Describe the solution you'd like

Currently, there is no native support for this in reqwest (https://github.com/seanmonstar/reqwest/issues/155) and for the time being we need to look into alternative solutions.

https://github.com/TrueLayer/reqwest-middleware is one of those solutions that allows us to register custom middlewares. So the idea is that object_store could use this crate internally (and wrap the currently used reqwest client with their client wrapper) and provide an interface to pass in new middlewares, which we could then use to provide a trace propagation middleware and other users their own custom once

Describe alternatives you've considered

Additional context

tustvold commented 2 days ago

I have wondered in the past about decoupling object_store from reqwest to allow users to provide their own HTTP implementations. This would also potentially have benefits for WASM support, as reqwest's WASM support is a tad incoherent (through no real fault of its own).

I don't hate the idea of adding a middleware abstraction, but I would prefer to stick with first-party abstractions where possible. In fact we avoid exposing reqwest in the public APIs, let alone a third-party extension for it...

Taking a step back, perhaps we could tackle the desired use-case, namely distributed tracing, as opposed to your chosen solution to this problem? How do you envisage trace context propagating through to the HTTP requests from the caller?