python-hyper / h11

A pure-Python, bring-your-own-I/O implementation of HTTP/1.1
https://h11.readthedocs.io/
MIT License

RemoteProtocolError: Illegal header line in response #130

Open yshen-ya opened 3 years ago

yshen-ya commented 3 years ago

The server I am trying to reach returns every response with this header line, which raises RemoteProtocolError. I forged a response containing the same header line, and the screenshot shows it being treated as valid. Is there any way to get the response past h11's header validator? [image: screenshot]

Traceback (most recent call last):
  File "C:\Users\yshen\AppData\Local\pypoetry\Cache\virtualenvs\magic-WQZay8yd-py3.9\lib\site-packages\httpx\_client.py", line 1502, in _send_single_request
    (status_code, headers, stream, ext,) = await transport.arequest(
  File "C:\Users\yshen\AppData\Local\pypoetry\Cache\virtualenvs\magic-WQZay8yd-py3.9\lib\site-packages\httpcore\_async\connection_pool.py", line 218, in arequest
    response = await connection.arequest(
  File "C:\Users\yshen\AppData\Local\pypoetry\Cache\virtualenvs\magic-WQZay8yd-py3.9\lib\site-packages\httpcore\_async\connection.py", line 106, in arequest
    return await self.connection.arequest(method, url, headers, stream, ext)
  File "C:\Users\yshen\AppData\Local\pypoetry\Cache\virtualenvs\magic-WQZay8yd-py3.9\lib\site-packages\httpcore\_async\http11.py", line 72, in arequest
    ) = await self._receive_response(timeout)
  File "C:\Users\yshen\AppData\Local\pypoetry\Cache\virtualenvs\magic-WQZay8yd-py3.9\lib\site-packages\httpcore\_async\http11.py", line 133, in _receive_response
    event = await self._receive_event(timeout)
  File "C:\Users\yshen\AppData\Local\pypoetry\Cache\virtualenvs\magic-WQZay8yd-py3.9\lib\site-packages\httpcore\_async\http11.py", line 169, in _receive_event
    event = self.h11_state.next_event()
  File "c:\users\yshen\scoop\apps\python\3.9.0\lib\contextlib.py", line 135, in __exit__
    self.gen.throw(type, value, traceback)
  File "C:\Users\yshen\AppData\Local\pypoetry\Cache\virtualenvs\magic-WQZay8yd-py3.9\lib\site-packages\httpcore\_exceptions.py", line 12, in map_exceptions
    raise to_exc(exc) from None
httpcore.RemoteProtocolError: illegal header line: bytearray(b"Content-Security-Policy : default-src \'self\';script-src https: \'unsafe-eval\' \'unsafe-inline\'; style-src https: \'unsafe-inline\';img-src * \'self\' data: https")

njsmith commented 3 years ago

Oof, this is a tough one. That header line is definitely invalid, according to the spec. In fact, RFC 7230 has some special text about exactly this case:

No whitespace is allowed between the header field-name and colon. In the past, differences in the handling of such whitespace have led to security vulnerabilities in request routing and response handling. A server MUST reject any received request message that contains whitespace between a header field-name and colon with a response code of 400 (Bad Request). A proxy MUST remove any such whitespace from a response message before forwarding the message downstream.

This language is extremely unusual: in general, RFC 7230 allows a lot of leniency for implementations to accept not-quite-correct data, if they want to. This might be the only situation where it says you MUST reject a malformed message. So I think we should be very cautious about accepting this particular error.
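To make the rule concrete, here is a minimal sketch of a header-line check that follows the RFC 7230 grammar (field-name, colon, optional whitespace, value). The pattern and the `parse_header_line` helper are illustrative assumptions, not h11's actual regex or API; the point is only that the field name must be followed immediately by the colon.

```python
import re

# Simplified RFC 7230 "token" characters for the field name.
# Assumption: this is an illustrative pattern, not the regex h11 uses.
TOKEN = rb"[-!#$%&'*+.^_`|~0-9A-Za-z]+"

# field-name ":" OWS field-value OWS, with no whitespace allowed before ":".
HEADER_LINE = re.compile(
    rb"(?P<name>" + TOKEN + rb"):[ \t]*(?P<value>.*?)[ \t]*"
)


def parse_header_line(line: bytes):
    """Hypothetical helper: return (name, value) or raise ValueError."""
    match = HEADER_LINE.fullmatch(line)
    if match is None:
        raise ValueError(f"illegal header line: {line!r}")
    return match.group("name"), match.group("value")


if __name__ == "__main__":
    print(parse_header_line(b"Content-Type: text/html"))
    try:
        # The space before the colon makes the whole line fail to match.
        parse_header_line(b"Content-Security-Policy : default-src 'self'")
    except ValueError as exc:
        print(exc)
```

Because the name and the colon are matched as one unit, a line like the one in the traceback cannot be parsed at all, which is why the failure surfaces as a protocol error rather than as a single bad header.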

OTOH, you're talking about a client, while the RFC is only SHOUTY about servers. And from your screenshot, it looks like at least one browser accepts this. And sometimes we have to accept the same stuff the browsers accept and let them lead the way first. So... maybe possibly we should accept this, but only when in client mode? Though technically that's a bit tricky, because currently our header parsing code is shared between client and server mode.

Which browser is that? Have you tested any other browsers?

CC @tomchristie

yshen-ya commented 3 years ago

It was Chrome. I also tested Firefox, and it worked fine.

Congee commented 3 years ago

Whether to be lenient here seems like a tough decision.

Our current workaround is to monkey-patch the validator function so that it discards invalid headers. As I understand it, h11 uses a regex to both parse and validate headers, so it is not easy to simply disable validation.
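For illustration only, the "discard invalid headers" idea can be sketched without touching h11 internals, by filtering the raw response head before it reaches the parser. This is not our actual monkey-patch; `drop_invalid_headers` and the simplified pattern are hypothetical, and the sketch assumes the whole head (status line, headers, blank line) is already buffered.

```python
import re

# Simplified RFC 7230 header-line shape: token, colon, anything.
# Assumption: an illustrative pattern, not h11's own regex.
VALID_HEADER = re.compile(rb"[-!#$%&'*+.^_`|~0-9A-Za-z]+:.*")


def drop_invalid_headers(head: bytes) -> bytes:
    """Hypothetical pre-filter: remove malformed header lines.

    `head` is the status line plus header lines, followed by the blank
    line that ends the head; any bytes after it are passed through.
    """
    block, sep, rest = head.partition(b"\r\n\r\n")
    status_line, *header_lines = block.split(b"\r\n")
    # Keep only lines whose field name is directly followed by a colon.
    kept = [line for line in header_lines if VALID_HEADER.fullmatch(line)]
    return b"\r\n".join([status_line, *kept]) + sep + rest


if __name__ == "__main__":
    raw = (
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Security-Policy : default-src 'self'\r\n"
        b"Content-Length: 0\r\n"
        b"\r\n"
    )
    print(drop_invalid_headers(raw))
```

The trade-off is that the dropped header is silently lost, which matters for something like Content-Security-Policy. The sketch also drops obsolete folded continuation lines, since they do not start with a field name.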

Still, it would be nice to have a way to disable inbound header validation, as in h2: https://github.com/python-hyper/h2/blob/94b89b88edd587abb3ee0e72f1c961a5be7e132d/src/h2/config.py#L81