gradio-app / gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
http://www.gradio.app
Apache License 2.0

gradio client httpx session to reduce TIME_WAIT tcp connections #7890

Open dodysw3 opened 6 months ago

dodysw3 commented 6 months ago

Is your feature request related to a problem? Please describe.
Calling predict() for a simple task while iterating over a large number of inputs generates a large number of TIME_WAIT connections.

Describe the solution you'd like
Have an option for predict() to maintain a single httpx session, or to take an httpx client instance as a parameter.

Additional context
In some public cloud infrastructure, outgoing connections to the public internet go through a shared NAT gateway, and the number of unique connections is very limited. TIME_WAIT on Linux takes 60s to clear. When the limit is reached, predict() hangs while new connections wait for TIME_WAIT connections to be released.

dodysw3 commented 6 months ago

This is what I came up with: https://github.com/gradio-app/gradio/compare/main...dodysw3:gradio:http2client?expand=1. It works for my use case: it reuses the httpx client across multiple predict() invocations and opportunistically uses HTTP/2 if the gradio server is behind an h2 reverse proxy, which further reduces the number of connections needed.

I don't know if this is useful for others.

freddyaboulton commented 6 months ago

Hi @dodysw3, your changes make sense. If you open a PR, one of us can review it soon. One comment, though: I would not enable HTTP/2 by default, as that adds a new requirement to the gradio client, and the HTTP/2 implementation in httpx is not as mature as the HTTP/1 one.

Instead, I think we can first try to instantiate a client with HTTP/2 support and, if that raises an import error, fall back to the default HTTP/1 client.