python-hyper / wsproto

Sans-IO WebSocket protocol implementation
https://wsproto.readthedocs.io/
MIT License
261 stars 38 forks source link

Current implementation DOES NOT COMPY TO RFC6455 #175

Closed racinette closed 2 years ago

racinette commented 2 years ago

RFC6455 writes about the Sec-WebSocket-Protocol header:

The |Sec-WebSocket-Protocol| header field MAY appear multiple times in an HTTP request (which is logically the same as a single |Sec-WebSocket-Protocol| header field that contains all values).

And also the same it says about the Sec-WebSocket-Extensions header:

The |Sec-WebSocket-Extensions| header field MAY appear multiple times in an HTTP request (which is logically the same as a single |Sec-WebSocket-Extensions| header field that contains all values.

But the implementations only catches the last header of the handshake request for both of these headers.

Proof:

import base64
import os

from wsproto.connection import ConnectionType
from wsproto import WSConnection

sec_websocket_key = base64.b64encode(os.urandom(16))
http11_handshake = \
    b"GET wss://api.website.xyz/ws HTTP/1.1\r\n" \
    b"Host: api.website.xyz\r\n" \
    b"Connection: Upgrade\r\n" \
    b"Pragma: no-cache\r\n" \
    b"Cache-Control: no-cache\r\n" \
    b"User-Agent: Mozilla/5.0 (X11; Linux x86_64) " \
    b"AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.114 Safari/537.36\r\n" \
    b"Upgrade: websocket\r\n" \
    b"Origin: https://website.xyz\r\n" \
    b"Sec-WebSocket-Version: 13\r\n" \
    b"Accept-Encoding: gzip, deflate, br\r\n" \
    b"Accept-Language: ru-RU,ru;q=0.9,en-US;q=0.8,en;q=0.7\r\n" \
    b"Sec-WebSocket-Key: " + sec_websocket_key + b"\r\n" \
    b"Sec-WebSocket-Extensions: this-extension; isnt-seen, even-tho, it-should-be\r\n" \
    b"Sec-WebSocket-Protocol: there-protocols\r\n" \
    b"Sec-WebSocket-Protocol: arent-seen\r\n" \
    b"Sec-WebSocket-Extensions: this-extension; were-gonna-see, and-another-extension; were-also; gonna-see=100; percent\r\n" \
    b"Sec-WebSocket-Protocol: only-these-protocols, are-seen, from-the-request-object\r\n" \
    b"\r\n"

server_conn = WSConnection(ConnectionType.SERVER)
server_conn.receive_data(http11_handshake)
for event in server_conn.events():
    print(type(event), event)

Prints:

<class 'wsproto.events.Request'> Request(host='api.website.xyz', target='wss://api.website.xyz/ws', extensions=['this-extension; were-gonna-see', 'and-another-extension; were-also; gonna-see=100; percent'], extra_headers=[(b'connection', b'Upgrade'), (b'pragma', b'no-cache'), (b'cache-control', b'no-cache'), (b'user-agent', b'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.114 Safari/537.36'), (b'upgrade', b'websocket'), (b'origin', b'https://website.xyz'), (b'sec-websocket-version', b'13'), (b'accept-encoding', b'gzip, deflate, br'), (b'accept-language', b'ru-RU,ru;q=0.9,en-US;q=0.8,en;q=0.7'), (b'sec-websocket-key', b'qEGbHKjRLYJJHJB4V1e1bA==')], subprotocols=['only-these-protocols', 'are-seen', 'from-the-request-object'])

The problem lies within the _process_connection_request function located in the wsproto/handshake.py file. It's easy to spot, that it in fact actually replaces the old occurrences of the named headers with the new ones:

...
            elif name == b"sec-websocket-extensions":
                extensions = split_comma_header(value)
                continue  # Skip appending to headers
            elif name == b"sec-websocket-key":
                key = value
            elif name == b"sec-websocket-protocol":
                subprotocols = split_comma_header(value)
...
pgjones commented 2 years ago

Fixed by 7d50c0788465f4fc4180c94c2cdfcd3cd6d9f4ec