replicate / cog

Containers for machine learning
https://cog.run
Apache License 2.0

Use urllib3.PoolManager to handle URLFile downloads #2015

Closed: aron closed this 1 week ago

aron commented 1 month ago

This PR introduces a shared urllib3.PoolManager, held as a singleton on the URLFile class, so that connections are reused across downloads. This should improve performance in cases where downloads currently suffer high latency from establishing a new connection each time.
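For illustration, a minimal sketch of the pattern described above. `PooledURLFile`, `pool()`, and `iter_chunks()` are hypothetical names for this example only and are not cog's actual `URLFile` implementation; the `urllib3` calls themselves (`PoolManager`, `request(..., preload_content=False)`, `stream()`, `release_conn()`) are real.

```python
import threading
from typing import ClassVar, Iterator, Optional

import urllib3


class PooledURLFile:
    """Hypothetical stand-in for URLFile; not cog's actual implementation."""

    # One manager per process, stored on the class so every instance shares it.
    _pool: ClassVar[Optional[urllib3.PoolManager]] = None
    _pool_lock: ClassVar[threading.Lock] = threading.Lock()

    def __init__(self, url: str) -> None:
        self.url = url

    @classmethod
    def pool(cls) -> urllib3.PoolManager:
        # Create the manager lazily; the lock guards against two threads
        # racing to create it on first use.
        with cls._pool_lock:
            if cls._pool is None:
                cls._pool = urllib3.PoolManager()
            return cls._pool

    def iter_chunks(self, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
        # preload_content=False streams the body instead of buffering it all.
        resp = self.pool().request("GET", self.url, preload_content=False)
        try:
            yield from resp.stream(chunk_size)
        finally:
            # Hand the connection back to the pool so later downloads reuse it.
            resp.release_conn()
```

Two instances pointing at the same host would then draw from the same connection pool, which is where the saving on connection setup would come from.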

It's not yet clear how this will behave when the URLFile is pickled and unpickled to pass it to the worker, so some testing will be required before merging the change.
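As a starting point for that testing, here is a small stdlib-only sketch of one relevant Python behaviour: by default, `pickle` serializes an instance's own state, not attributes stored on its class. `URLFileSketch` and `FakePool` are hypothetical stand-ins (the fake pool holds a lock, which cannot be pickled), and cog's real `URLFile` may define its own pickling hooks that behave differently.

```python
import pickle
import threading


class FakePool:
    """Hypothetical stand-in for a connection pool; holds an unpicklable lock."""

    def __init__(self) -> None:
        self.lock = threading.Lock()


class URLFileSketch:
    """Hypothetical stand-in for URLFile; not cog's actual implementation."""

    # Class attribute: shared within a process, not part of instance state.
    _pool = None

    def __init__(self, url: str) -> None:
        self.url = url

    @classmethod
    def pool(cls) -> FakePool:
        # Lazily create the shared pool in whichever process asks for it.
        if cls._pool is None:
            cls._pool = FakePool()
        return cls._pool


f = URLFileSketch("https://example.com/weights.bin")
f.pool()  # force the shared pool to exist before pickling

# Succeeds even though FakePool itself cannot be pickled, because only the
# instance state (here, just `url`) is serialized.
restored = pickle.loads(pickle.dumps(f))
```

Under this assumption, a worker that unpickles the object would lazily build its own pool on first use rather than receive the parent's. What happens to a pool inherited through a forked process is a separate question that this sketch does not cover.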

Longer term, we have discussed alternative approaches that would move the networking code off the GPU.

[!NOTE] The code on the main branch uses requests, but we may need to update it to use a requests.Session to achieve the same benefits.