hyperium / hyper

An HTTP library for Rust
https://hyper.rs
MIT License
14.58k stars 1.6k forks source link

hyper-util: expose connection info on whether it's reused from the pool or newly established #3727

Open magurotuna opened 3 months ago

magurotuna commented 3 months ago

Is your feature request related to a problem? Please describe.

While investigating hard-to-debug connection related issues in cloud infra, we will likely want to obtain any kind of information that might be relevant. Even whether a connection that has just failed is reused from the connection pool or is newly established can be a valuable insight that we could use to troubleshoot.

Describe the solution you'd like

Maybe add a new private field to the hyper_util::client::legacy::connect::HttpInfo, like is_new_connection (or is_reused_connection) and make it publicly accessible via a method?

Describe alternatives you've considered

I didn't come up with other alternatives

Additional context

N/A

seanmonstar commented 3 months ago

Since the HttpConnector and the Pool are separate, perhaps it could be a separate extension. That way it still works if using a different initial connector.

magurotuna commented 3 months ago

That makes total sense.

What we are specifically interested in is to expose the "reused or not" information only on error cases, so I thought that we could add is_reused boolean inside hyper_util::client::legacy::Error. Then it felt to me that the most straightforward way to do so would be to add is_reused field in hyper_util::client::legacy::connect::Connected because it is exposed from the Error type and this provides connection-related information, but on second thought, what this Connected type has is more like protocol-related information (except the Connected::poison method that turns on the conneciton pool related flag), so it might be not a perfect fit to have is_reused flag in Connected since connection pooling is not necessarily related to protocols. Another point that made me think that having is_reused in Connected might not be good is that, Pooled::is_reused method provides a way to determine if the pooled object is reused or not, but at the same time PoolClient, which is wrapped in Pooled, has Connected field meaning that if we had is_reused in Connected, that would end up with duplicated is_reused in Connected and Pooled. This could be a code smell.

Given these things, now I think the interface as follows would be good:

pub struct Error {
    kind: ErrorKind,
    source: Option<Box<dyn StdError + Send + Sync>>,
    #[cfg(any(feature = "http1", feature = "http2"))]
-   connect_info: Option<Connected>,
+   connect_info: Option<ConnectInfo>,
}

+ /// Instead of adding `is_reused` to `Connected`, we define a new struct here
+ /// to hold a raw `Connected`, which has protocol-related information, and
+ /// `is_reused` flag, which is connection pool related information.
+ pub struct ConnectInfo {
+     connected: Connected,
+     is_reused: bool,
+ }
+ 
+ /// Instead of exposing all the methods of `Connected`, we expose getter methods
+ /// that allow callers to get necessary information about the failed connection.
+ impl ConnectInfo {
+     pub fn is_proxied(&self) -> bool {
+         self.connected.is_proxied()
+     }
+     
+     pub fn get_extras(&self, extensions: &mut Extensions) {
+         self.connected.get_extras(extensions);
+     }
+     
+     pub fn is_negotiated_h2(&self) -> bool {
+         self.connected.is_negotiated_h2()
+     }
+     
+     pub fn is_reused(&self) -> bool {
+         self.is_reused
+     }
+}

What do you think about this approach? The concern I have with this is that this will be a breaking change as the return type of Error::connect_info will be changed. Not sure if this is acceptable or not though.

magurotuna commented 3 months ago

I implemented based on my proposed approach in https://github.com/hyperium/hyper-util/pull/145. Any thoughts or feedback would be appreciated.