internetarchive / warcprox

WARC writing MITM HTTP/S proxy

TypeError("connection_from_host() got an unexpected keyword argument 'pool_kwargs'") #148

Closed: qome closed this issue 4 years ago

qome commented 4 years ago

I am unable to diagnose this error.

$ warcprox --dir . --gzip --rollover-idle-time 30
http_proxy=127.0.0.1:8000 https_proxy=127.0.0.1:8000 wget --delete-after \
  --execute robots=off \
  --no-check-certificate \
  --no-directories \
  --page-requisites \
  --span-hosts \
  --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36" \
"https://example.com"
2020-01-28 14:42:41,989 2023 INFO MainThread warcprox.warcproxy.WarcProxy.__init__(mitmproxy.py:625) 100 proxy threads
2020-01-28 14:42:41,991 2023 NOTICE MainThread warcprox.warcproxy.WarcProxy.server_activate(warcproxy.py:495) listening on 127.0.0.1:8000
2020-01-28 14:42:41,993 2023 INFO DedupLoader(tid=2028) warcprox.BasePostfetchProcessor._run(__init__.py:134) <DedupLoader(DedupLoader(tid=2028), started 140167794534144)> starting up
2020-01-28 14:42:41,994 2023 INFO MainThread warcprox.dedup.DedupDb.start(dedup.py:95) opening existing deduplication database ./warcprox.sqlite
2020-01-28 14:42:41,996 2023 INFO WarcWriterProcessor(tid=2029) warcprox.writerthread.WarcWriterProcessor._run(__init__.py:134) <WarcWriterProcessor(WarcWriterProcessor(tid=2029), started 140167786141440)> starting up
2020-01-28 14:42:41,997 2023 INFO DedupDb(tid=2030) warcprox.BasePostfetchProcessor._run(__init__.py:134) <ListenerPostfetchProcessor(DedupDb(tid=2030), started 140167777748736)> starting up
2020-01-28 14:42:41,997 2023 INFO StatsProcessor(tid=2031) warcprox.stats.StatsProcessor._run(__init__.py:134) <StatsProcessor(StatsProcessor(tid=2031), started 140167769356032)> starting up
2020-01-28 14:42:41,998 2023 INFO StatsProcessor(tid=2031) warcprox.stats.StatsProcessor._startup(stats.py:110) opening existing stats database ./warcprox.sqlite
2020-01-28 14:42:41,998 2023 INFO StatsProcessor(tid=2031) warcprox.stats.StatsProcessor._startup(stats.py:127) created table buckets_of_stats in ./warcprox.sqlite
2020-01-28 14:42:41,999 2023 INFO RunningStats(tid=2032) warcprox.BasePostfetchProcessor._run(__init__.py:134) <ListenerPostfetchProcessor(RunningStats(tid=2032), started 140167760963328)> starting up
2020-01-28 14:42:44,851 2023 ERROR MitmProxyHandler(tid=2037,started=2020-01-28T20:42:44.834551,client=127.0.0.1:49100) warcprox.warcprox.WarcProxyHandler.do_COMMAND(mitmproxy.py:442) problem processing request 'GET / HTTP/1.1': TypeError("connection_from_host() got an unexpected keyword argument 'pool_kwargs'",)
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/warcprox/mitmproxy.py", line 413, in do_COMMAND
    self._connect_to_remote_server()
  File "/usr/local/lib/python3.5/dist-packages/warcprox/warcproxy.py", line 189, in _connect_to_remote_server
    return warcprox.mitmproxy.MitmProxyHandler._connect_to_remote_server(self)
  File "/usr/local/lib/python3.5/dist-packages/warcprox/mitmproxy.py", line 277, in _connect_to_remote_server
    pool_kwargs={'maxsize': 12, 'timeout': self._socket_timeout})
TypeError: connection_from_host() got an unexpected keyword argument 'pool_kwargs'
2020-01-28 14:42:44,853 2023 WARNING MitmProxyHandler(tid=2037,started=2020-01-28T20:42:44.834551,client=127.0.0.1:49100) warcprox.warcprox.WarcProxyHandler.log_error(mitmproxy.py:616) code 500, message Internal Server Error
nlevitt commented 4 years ago

What version of urllib3 do you have there?

Looks like pool_kwargs was added to the function signature somewhere around 1.22. warcprox/setup.py is only calling for urllib3>=1.14 so we probably need to update that. It's a little odd that some old version would have been installed in your environment though 🤷🏻‍♂️
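One way to answer the version question without guessing at the exact release is to inspect the installed library directly. The sketch below is only an illustration: `accepts_keyword` and `report_urllib3` are hypothetical helper names, not part of warcprox or urllib3. It reports the urllib3 version, the file it was imported from, and whether `PoolManager.connection_from_host` accepts `pool_kwargs`.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params and params[name].kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs catch-all accepts any keyword as well.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def report_urllib3():
    """Describe the urllib3 copy that this interpreter actually imports."""
    try:
        import urllib3
    except ImportError:
        return "urllib3 is not installed"
    ok = accepts_keyword(urllib3.PoolManager.connection_from_host, "pool_kwargs")
    return "urllib3 {} from {}: pool_kwargs {}".format(
        urllib3.__version__,
        urllib3.__file__,
        "supported" if ok else "NOT supported",
    )


if __name__ == "__main__":
    print(report_urllib3())
```

Run it with the same interpreter that runs warcprox (Python 3.5 in the traceback above), since a different interpreter may pick up a different copy of urllib3.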

qome commented 4 years ago

For some reason I had urllib3 1.19.1 installed in the system directories; upgrading it resolved the issue.
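A stale copy in the system directories can be spotted by listing every urllib3 package directory on the import path. This is a rough sketch with a hypothetical helper name, `find_package_copies`. It assumes ordinary on-disk packages and ignores zip imports and other import hooks.

```python
import os
import sys


def find_package_copies(name, paths=None):
    """List every directory on the search path that holds a package `name`.

    Results keep search-path order, so the first entry is the copy that a
    plain `import name` would normally load.
    """
    paths = sys.path if paths is None else paths
    found = []
    for entry in paths:
        # An empty sys.path entry means the current directory.
        pkg_dir = os.path.abspath(os.path.join(entry or os.curdir, name))
        is_package = os.path.isfile(os.path.join(pkg_dir, "__init__.py"))
        if is_package and pkg_dir not in found:
            found.append(pkg_dir)
    return found


if __name__ == "__main__":
    for index, path in enumerate(find_package_copies("urllib3")):
        # Mark the copy that wins the import with an asterisk.
        print("{} {}".format("*" if index == 0 else " ", path))
```

If more than one directory is listed, the first is the one being imported; upgrading or removing that copy is what changes the behavior warcprox sees.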