internetarchive / warcprox

WARC writing MITM HTTP/S proxy

Increase urllib parse cache size #129

Closed vbanos closed 5 years ago

vbanos commented 5 years ago

In Python 2 and 3, urllib's URL parsing code caches parsing results in memory so that the same URL is not parsed repeatedly. The problem is that the default in-memory cache size is just 20 entries: https://github.com/python/cpython/blob/3.7/Lib/urllib/parse.py#L80

Since we do a lot of URL parsing, it makes sense to increase the cache size.
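
As a rough illustration of the idea (not necessarily the exact change in this PR), the sketch below raises the module-level `MAX_CACHE_SIZE` constant that the linked 3.7 source defines. The value `2000` and the helper name `raise_parse_cache_size` are assumptions made for the example; the thread does not state the value used. Newer Python versions may cache parse results through a different mechanism, in which case the assignment is harmless but has no effect.

```python
import urllib.parse

# Hypothetical value for illustration; the thread does not say which size was chosen.
NEW_CACHE_SIZE = 2000


def raise_parse_cache_size(new_size=NEW_CACHE_SIZE):
    """Set urllib.parse.MAX_CACHE_SIZE and return the previous value.

    Returns None if the running Python version does not define the constant.
    """
    previous = getattr(urllib.parse, "MAX_CACHE_SIZE", None)
    urllib.parse.MAX_CACHE_SIZE = new_size
    return previous


# Apply once at startup, before the proxy starts parsing URLs in bulk.
previous = raise_parse_cache_size()

# Parsing itself is unchanged; only the cache limit differs.
parts = urllib.parse.urlsplit("https://example.com/a/b?x=1")
print(previous, urllib.parse.MAX_CACHE_SIZE, parts.netloc)
```

Because the constant is process-wide, it only needs to be set once, and the trade-off is a modest amount of extra memory for the cached results.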

vbanos commented 5 years ago

The WBM uses the same setting, since we need to do a lot of URL parsing there as well.