pfnet / pfio

IO library to access various filesystems with unified API
https://pfio.readthedocs.io/
MIT License

Faster HTTPConnector #317

Closed y1r closed 1 year ago

y1r commented 1 year ago

I profiled HTTPConnector's performance with cProfile and snakeviz, and found that urllib3's HTTP request path is not optimized for sending repeated requests to the same host.
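For reference, a measurement of this kind can be reproduced roughly as follows. This is only a minimal sketch using the standard library's `cProfile` and `pstats`; the helper name `profile_call` is hypothetical and is not part of pfio, and the function you pass in stands in for whatever HTTPConnector call you want to measure.

```python
import cProfile
import io
import pstats


def profile_call(fn, *args, top=10, **kwargs):
    """Run fn under cProfile; return (result, text report).

    Hypothetical helper for illustration, not part of pfio.
    """
    profiler = cProfile.Profile()
    # runcall profiles a single call and returns fn's return value.
    result = profiler.runcall(fn, *args, **kwargs)

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time so per-request overhead such as
    # repeated URL parsing shows up near the top of the report.
    stats.sort_stats("cumulative").print_stats(top)
    return result, out.getvalue()
```

To inspect the same data in snakeviz, the profiler can instead be saved with `profiler.dump_stats("out.prof")` and opened with `snakeviz out.prof` (the file name here is arbitrary).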

Previous:

image

The given URL is parsed on every request in order to get the appropriate connection from the connection pool. The URL parsing itself is just a regexp execution, but for the small-file use case, which is latency-bound, it should not happen this many times.

This PR:

image

I re-implemented the connection pool for sending requests to the same host. With this change we can see the performance improvement: relatively more of the time is now spent in recvinto, which means the latency is reduced. Also, we no longer need urllib3.

Note: I have not implemented cleanup of the connection pool, but as long as the user does not send requests to many different hosts, this should not be a problem.
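To illustrate the idea (this is not the code in this PR), here is a minimal sketch of a per-host connection pool built on the standard library's `http.client`. The class name `PerHostPool` and its `acquire`/`release` methods are hypothetical. It assumes the caller parses the URL once up front and then reuses the resulting `(scheme, host, port)` key, so no URL parsing happens per request. As in the note above, pool entries are never cleaned up.

```python
import http.client
import queue


class PerHostPool:
    """Hypothetical sketch: idle connections keyed by (scheme, host, port)."""

    def __init__(self, maxsize=8):
        self._maxsize = maxsize
        # One LIFO queue of idle connections per host key.
        # Entries are never evicted (no cleanup, as noted above).
        self._idle = {}

    def _queue_for(self, key):
        return self._idle.setdefault(key, queue.LifoQueue(self._maxsize))

    def acquire(self, scheme, host, port):
        """Return an idle connection for the host, or create a new one."""
        try:
            return self._queue_for((scheme, host, port)).get_nowait()
        except queue.Empty:
            cls = (http.client.HTTPSConnection if scheme == "https"
                   else http.client.HTTPConnection)
            # The socket is opened lazily on the first request.
            return cls(host, port)

    def release(self, scheme, host, port, conn):
        """Hand a connection back; close it if the pool is full."""
        try:
            self._queue_for((scheme, host, port)).put_nowait(conn)
        except queue.Full:
            conn.close()
```

A caller would `acquire` a connection, issue `conn.request(...)` and read the response fully, then `release` it so the next request to the same host reuses the open socket.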