internetarchive / warcprox

WARC writing MITM HTTP/S proxy

Catch BadStatusLine exception #134

Closed vbanos closed 5 years ago

vbanos commented 5 years ago

When we begin downloading from a remote host, we may get a RemoteDisconnected exception if the host returns no data. We already handle that. We may also get BadStatusLine when the status line of the HTTP response is malformed: https://github.com/python/cpython/blob/3.7/Lib/http/client.py#L288

We should also add these cases to the bad hosts cache.
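
A minimal sketch of the idea, not warcprox's actual code: `FakeSocket`, `begin_or_mark_bad`, the plain-dict `bad_hosts` cache, and the 502 value stored in it are all hypothetical names and choices made for illustration. It relies only on the standard library fact that `http.client.RemoteDisconnected` is a subclass of `http.client.BadStatusLine`, so a single `except` clause covers both the "no data" and the "malformed status line" cases.

```python
import http.client
import io


class FakeSocket:
    """Hypothetical stand-in for a real socket; serves canned bytes."""

    def __init__(self, data):
        self._data = data

    def makefile(self, mode="rb", *args, **kwargs):
        # HTTPResponse only needs a readable binary file object.
        return io.BytesIO(self._data)


def begin_or_mark_bad(raw_bytes, host, bad_hosts):
    """Try to read the status line; record the host on failure.

    `bad_hosts` is an illustrative dict mapping host -> status code
    to report for later requests (502 here is an assumption).
    """
    response = http.client.HTTPResponse(FakeSocket(raw_bytes))
    try:
        response.begin()  # reads status line and headers
    except http.client.BadStatusLine:
        # Also catches RemoteDisconnected, which subclasses BadStatusLine.
        bad_hosts[host] = 502
        return None
    return response.status


if __name__ == "__main__":
    bad_hosts = {}
    print(begin_or_mark_bad(b"HTTP/1.1 200 OK\r\n\r\n", "ok.example:80", bad_hosts))
    print(begin_or_mark_bad(b"", "empty.example:80", bad_hosts))
    print(begin_or_mark_bad(b"\x88\xeaunexpected reserved bits 4\r\n",
                            "junk.example:80", bad_hosts))
    print(sorted(bad_hosts))
```

In a real proxy the cache would likely need expiry and thread safety, which this sketch leaves out.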

vbanos commented 5 years ago

Example

BadStatusLine: ˆêunexpected reserved bits 4
  File "warcprox/mitmproxy.py", line 439, in do_COMMAND
    return self._proxy_request()
  File "warcprox/warcproxy.py", line 210, in _proxy_request
    self, extra_response_headers=extra_response_headers)
  File "warcprox/mitmproxy.py", line 479, in _proxy_request
    return self._inner_proxy_request(extra_response_headers)
  File "warcprox/mitmproxy.py", line 536, in _inner_proxy_request
    prox_rec_res.begin(extra_response_headers=extra_response_headers)
  File "warcprox/mitmproxy.py", line 187, in begin
    http_client.HTTPResponse.begin(self)  # reads status line, headers
  File "http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "http/client.py", line 279, in _read_status
    raise BadStatusLine(line)

nlevitt commented 5 years ago