fix: pass pool_timer to hyper_util to enable the idle cleanup task

RobMor commented 2 months ago

While testing an application I'm working on that makes requests to many different endpoints I found that reqwest wasn't cleaning up pooled connections even after the pool_idle_timeout.

After investigating it seems like the hyper_util connection pool is bailing on this condition: https://github.com/hyperium/hyper-util/blob/master/src/client/legacy/pool.rs#L429-L433. This seems to be because reqwest doesn't pass a pool_timer.

Reproduction:

server.rs:

use warp::Filter;

#[tokio::main]
async fn main() {
    simple_logger::init_with_level(log::Level::Info).unwrap();

    let basic = warp::filters::path::end()
        .map(|| "hello");

    let server = warp::serve(basic)
        .run(([127, 0, 0, 1], 8000));

    server.await;
}

client.rs:

#[tokio::main]
async fn main() {
    simple_logger::init_with_level(log::Level::Trace).unwrap();

    let client = reqwest::Client::builder()
        .pool_idle_timeout(std::time::Duration::from_secs(5))
        .build()
        .unwrap();

    client.get("http://localhost:8000").send().await.unwrap();

    tokio::time::sleep(std::time::Duration::from_secs(10)).await;

    client.get("http://localhost:8000").send().await.unwrap();
}

Output, using master:

2024-09-26T08:27:19.715Z DEBUG [reqwest::connect] starting new connection: http://localhost:8000/
...
2024-09-26T08:27:29.719Z TRACE [hyper_util::client::legacy::pool] removing expired connection for ("http", localhost:8000)
...
2024-09-26T08:27:29.719Z DEBUG [reqwest::connect] starting new connection: http://localhost:8000/

(connection is only cleaned up when creating a new connection)

Output, after this commit:

2024-09-26T08:26:10.285Z DEBUG [reqwest::connect] starting new connection: http://localhost:8000/
...
2024-09-26T08:26:15.289Z TRACE [hyper_util::client::legacy::pool] idle interval evicting expired for ("http", localhost:8000)
...
2024-09-26T08:26:20.289Z DEBUG [reqwest::connect] starting new connection: http://localhost:8000/

(connection is actively cleaned up after the idle timeout by the IdleTask)

This is important for applications that make requests to many different endpoints. If these old connections don't get cleaned up we run into memory and file descriptor issues.

RobMor commented 2 months ago

I'm not an expert in the history here but it seems like this was just missed during the hyper v1 upgrade. It looks like at that time the separate pool_timer builder option was already in place.

Any ideas on how one could add some kind of automated test case around this behavior? Maybe I could add a test that spins up a server, makes a request using a client with a low idle timeout, and asserts that after that idle timeout the connection was closed from the servers perspective.

RobMor commented 2 months ago

Also, was wrapping both in the wasm32 check the right thing to do? Not really sure what that was there for.

seanmonstar commented 1 month ago

Oh wow, thanks for fixing this!

Any ideas on how one could add some kind of automated test case around this behavior?

Coming up with an integration test in reqwest is likely hard, since there's no way for you access the details outside the library. However, I'm thinking the mistake is in hyper-util: it should notice when the timeouts are set but not timer is set (hyper does the same for its timeouts).

Also, was wrapping both in the wasm32 check the right thing to do?

Um, that seems like it previously wasn't doing anything. None of that code is ever configured on for WASM, so that extra cfg doesn't matter. We can probably just remove it!

RobMor commented 1 month ago

there's no way for you access the details outside the library

I was thinking maybe we could spin up a dummy server for the test. That would give us access to the server side connection state where we should be able to see that the connection gets closed after the idle timeout.

I'm thinking the mistake is in hyper-util: it should notice when the timeouts are set but not timer is set

Curious on how one would implement a check like this? It seems like it would be a breaking change to start panicking or erroring on previously "working" code.

We can probably just remove it!

Will do

seanmonstar commented 1 month ago

Curious on how one would implement a check like this? It seems like it would be a breaking change to start panicking or erroring on previously "working" code.

Yea... hyper starting at 1.0 would cause a panic if a timer is missing but timeouts were configured. It'd be unfortunate to suddenly panic in a patch version of hyper-util. Sigh, I'm bummed that I messed up the timer design, but oh well...

RobMor commented 1 month ago

I've added a test! I confirmed that this test fails without this change, and passes with it...

Let me know what you think of it.

RobMor commented 1 month ago

Realized I didn't need to use tokio's unbounded_channel, and also had to fix some formatting. Should be good now...

RobMor commented 1 month ago

Agh, tests should be passing with the http3 feature now too.

RobMor commented 1 month ago

Thanks for merging! When do you expect the next release will happen?

seanmonstar commented 1 month ago

Published v0.12.8 just now :)

seanmonstar / reqwest

fix: pass pool_timer to hyper_util to enable the idle cleanup task #2434