Closed the-mikedavis closed 3 years ago
I suspect that there's a natural solution to both of these with the following changes...
Add an option to `Mint.WebSocket.decode/2` (`decode/3`), `emit_fragments?: boolean()`, default `false`, which controls whether fragments are buffered in the `%Mint.WebSocket{}` or emitted to the caller in some format such as `{:fragment, Mint.WebSocket.Frame.t()}`. The caller can then control the combination and error detection of fragments.
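To make the caller-side half of that idea concrete, here is a minimal sketch of fragment reassembly. Nothing in it is existing Mint.WebSocket API: the `FragmentCollector` module is hypothetical, and each fragment is modelled as a plain `{opcode, fin?, payload}` tuple standing in for whatever `Mint.WebSocket.Frame.t()` would expose. Control frames are left out for brevity.

```elixir
defmodule FragmentCollector do
  # Hypothetical caller-side reassembly for the proposed `emit_fragments?: true`
  # mode. A fragment is modelled as `{opcode, fin?, payload}`; this is an
  # assumption, not the real shape of `Mint.WebSocket.Frame.t()`.
  #
  # State is `nil` when no message is in progress, otherwise
  # `{opcode, reversed_payload_chunks}`.
  def new, do: nil

  # First fragment of a message: must be a :text or :binary frame.
  def push(nil, {opcode, fin?, payload}) when opcode in [:text, :binary] do
    finish_or_continue({opcode, [payload]}, fin?)
  end

  # A continuation with nothing to continue is a protocol error.
  def push(nil, {:continuation, _fin?, _payload}), do: {:error, :unexpected_continuation}

  # Later fragments of a message must be continuation frames.
  def push({opcode, chunks}, {:continuation, fin?, payload}) do
    finish_or_continue({opcode, [payload | chunks]}, fin?)
  end

  # A new data frame while a message is still in progress is a protocol error.
  def push({_opcode, _chunks}, {_other, _fin?, _payload}), do: {:error, :expected_continuation}

  # Not the final fragment: keep accumulating.
  defp finish_or_continue(state, false), do: {:cont, state}

  # Final fragment: join the chunks and validate text messages as UTF-8.
  defp finish_or_continue({opcode, chunks}, true) do
    message = chunks |> Enum.reverse() |> IO.iodata_to_binary()

    if opcode == :text and not String.valid?(message) do
      {:error, :invalid_utf8}
    else
      {:ok, {opcode, message}, nil}
    end
  end
end
```

A caller would thread the state through each emitted fragment: `{:cont, state}` means keep going, `{:ok, message, state}` yields a complete message, and `{:error, reason}` lets the caller decide how to fail.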
There is some prior art for this being a desire (albeit this issue pertains to a websocket server): https://github.com/ninenines/cowboy/issues/1106
Instead of flunking a whole batch of messages when an invalid frame is detected, we should emit an error tuple in the messages, e.g.
{:ok, websocket, [{:error, reason}]} = Mint.WebSocket.decode(websocket, data)
There's some prior art for this in how Mint handles badly formed frames with `Mint.HTTP.stream/2`.
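As a rough illustration of how a caller might consume such a list, the sketch below splits the decoded frames at the first error entry. The `PartialDecode` module is hypothetical, and the list shape (ordinary frames mixed with `{:error, reason}`) follows the proposal above, not what `Mint.WebSocket.decode/2` returns today.

```elixir
defmodule PartialDecode do
  # Splits a decoded frame list at the first `{:error, reason}` entry so the
  # frames that arrived before the bad one can still be handled.
  # Assumes the proposed list shape, not current Mint.WebSocket behavior.
  def split_at_error(frames) do
    case Enum.split_while(frames, &(not match?({:error, _}, &1))) do
      # No error entry: every frame is usable.
      {good, []} -> {:ok, good}
      # Error entry found: return the reason and the frames that preceded it.
      {good, [{:error, reason} | _rest]} -> {:error, reason, good}
    end
  end
end
```

With this shape the caller can process the good frames first and then close the connection with an appropriate close code, instead of losing the whole batch.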
This can probably be improved by upgrading `Mint.WebSocket.decode/2` to allow partial failure as described above, but I have doubts that all of the strict cases can be resolved while Mint.WebSocket also supports extensions abstractly.
For example, Mint.WebSocket cannot emit fragments for frames which are compressed because then the extension could not decompress them.
I suspect that there is some trivial validation (such as correct RSV bits) which is possible without backing ourselves into a corner, however.
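One possible shape for that RSV check is sketched below. It relies only on the frame header layout from RFC 6455 (FIN, then RSV1 to RSV3, then a 4-bit opcode in the first byte). The `RsvCheck` module and the `allowed_mask` parameter are hypothetical; the mask is an assumed way for negotiated extensions to declare which bits they have claimed (for example RSV1 for permessage-deflate).

```elixir
defmodule RsvCheck do
  import Bitwise

  # Validates the RSV bits in the first byte of a raw frame.
  # `allowed_mask` is a hypothetical 3-bit mask (RSV1 = 0b100, RSV2 = 0b010,
  # RSV3 = 0b001) of bits claimed by negotiated extensions.
  # Expects at least one byte of frame data.
  def validate(<<_fin::1, rsv::3, _opcode::4, _rest::binary>>, allowed_mask \\ 0) do
    # Any set bit outside the allowed mask is a protocol violation.
    if (rsv &&& bnot(allowed_mask)) == 0 do
      :ok
    else
      {:error, {:unexpected_rsv_bits, rsv}}
    end
  end
end
```

Because the check only looks at header bits, it could run before any extension sees the payload, so it should not interfere with decompression.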
The remaining cases not covered by #15 have to do with catching invalid UTF-8 and failing fast on it.
One of the cases sends a valid UTF-8 message but chops it on octet boundaries instead of codepoint boundaries.
This defeats any check we can do to detect invalid UTF-8 greedily, because we only use `String.valid?/1`, which (to my knowledge) cannot detect whether a binary is "potentially valid if there were more codepoints". I'm sure that with fine-grained control over the UTF-8 checking engine we could make this work, but for now I'm happy just buffering the messages and performing the UTF-8 check when the text frame is fully constructed.
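For reference, one avenue that might give the needed distinction is Erlang's `:unicode.characters_to_binary/1`, which returns `{:incomplete, encoded, rest}` for input that ends in a truncated codepoint and `{:error, encoded, rest}` for input that is invalid outright. The sketch below is an untested idea, not what Mint.WebSocket does; the `Utf8Stream` module is hypothetical, and its behavior on edge cases such as overlong encodings would need checking against the Autobahn suite.

```elixir
defmodule Utf8Stream do
  # Feeds one chunk of a text message through an incremental UTF-8 check.
  # `pending` holds the bytes of a codepoint that was cut off at the end of
  # the previous chunk (start with an empty binary).
  def check(pending, chunk) when is_binary(pending) and is_binary(chunk) do
    case :unicode.characters_to_binary(pending <> chunk) do
      # Everything seen so far is complete, valid UTF-8.
      valid when is_binary(valid) -> {:ok, <<>>}
      # Valid so far, but the tail is a truncated codepoint: carry it over.
      {:incomplete, _valid, rest} -> {:ok, rest}
      # A byte sequence that cannot become valid: fail fast.
      {:error, _valid, _rest} -> {:error, :invalid_utf8}
    end
  end

  # Call once the final fragment has arrived; leftover bytes mean the message
  # ended in the middle of a codepoint.
  def finish(<<>>), do: :ok
  def finish(_pending), do: {:error, :invalid_utf8}
end
```

Only the carried-over tail is re-checked with each chunk, so the cost stays proportional to the new data while the full-message buffering remains available as a fallback.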
A handful of the autobahn case results come out as non-strict.
I believe the NON-STRICT results come from two behaviors of Mint.WebSocket: