python-trio / trio-websocket

WebSocket client and server implementation for Python Trio
MIT License
70 stars 25 forks source link

Fix " Illegal target characters" when connecting to URL with zero-length path #149

Closed bluetech closed 3 years ago

bluetech commented 3 years ago

Fixes #148.

This regressed in 0233e2176731a190a515c025c0762c19f (version 0.9.0) due to different behavior between yarl and urllib.parse.urlsplit on a URL without a path component like "wss://ws.kraken.com". yarl makes this / while urlsplit makes it ''. h11 expects / as specified in the quoted RFC.

coveralls commented 3 years ago

Pull Request Test Coverage Report for Build 149


Totals Coverage Status
Change from base Build 147: 0.02%
Covered Lines: 495
Relevant Lines: 517

💛 - Coveralls
coveralls commented 3 years ago

Pull Request Test Coverage Report for Build 148


Totals Coverage Status
Change from base Build 147: 0.02%
Covered Lines: 496
Relevant Lines: 518

💛 - Coveralls
belm0 commented 3 years ago

Thank you

Before the parsing change, was WebSocketConnection.path itself set to / in the open_websocket_url('wss://ws.example.com') case?

I'm wondering if it should consider to do so and be documented somewhere in the API.

bluetech commented 3 years ago

Before the parsing change, was WebSocketConnection.path itself set to / in the open_websocket_url('wss://ws.example.com') case?

Yes.

I'm wondering if it should consider to do so and be documented somewhere in the API.

At the URL level the path is empty, it just needs to be translated to / for the HTTP layer. On the other hand, on the server side it will always be / because that's how it's received and there is no way to distinguish the two (when using the request-target origin-form at least).

My inclination is to leave it as in the current state of the PR, but I don't feel strongly and can change it if you prefer.

belm0 commented 3 years ago

Before the parsing change, was WebSocketConnection.path itself set to / in the open_websocket_url('wss://ws.example.com') case?

Yes.

I'm wondering if it should consider to do so and be documented somewhere in the API.

At the URL level the path is empty, it just needs to be translated to / for the HTTP layer. On the other hand, on the server side it will always be / because that's how it's received and there is no way to distinguish the two (when using the request-target origin-form at least).

My inclination is to leave it as in the current state of the PR, but I don't feel strongly and can change it if you prefer.

I don't know. If it's truly arbitrary, I'd lean a little towards the implementation's previous behavior?

I took a look at httpx out of curiosity:

>>> r = httpx.get('https://www.example.org')
>>> r.url.path
'/'
bluetech commented 3 years ago

OK, updated so that the path attribute is / as well.