cbor / cbor.github.io

cbor.io web site
Decoding invalid utf-8 gives Internal server error #83

Open Cody119 opened 2 years ago

Cody119 commented 2 years ago

Trying to decode the following byte sequence returns an internal server error: 82 04 68 4b 43 01 20 03 ac 02 08

I think it's because it encodes an invalid UTF-8 string. I'm pretty sure this sequence represents [4, "KC<0x01> <0x03>"] (where <0x01> and <0x03> are ASCII SOH and ETX).
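As a rough check of that reading, here is a minimal Python sketch that walks the three CBOR initial bytes by hand and then tries to decode the text string payload as UTF-8. The `head` helper is a hypothetical name introduced only for this illustration, and it only handles the short forms that occur in this input (additional information below 24). It is not how cbor.me or cbor-diag decode the data.

```python
# Raw bytes from the report.
data = bytes.fromhex("82 04 68 4b 43 01 20 03 ac 02 08")


def head(byte):
    """Split a CBOR initial byte into (major type, additional info).

    Hypothetical helper for this sketch; only valid for additional
    info values below 24, which is all this input uses.
    """
    return byte >> 5, byte & 0x1F


# 0x82: major type 4 (array) with 2 items.
array_major, item_count = head(data[0])

# 0x04: major type 0 (unsigned integer) with value 4.
int_major, int_value = head(data[1])

# 0x68: major type 3 (text string) with a length of 8 bytes.
text_major, text_length = head(data[2])

# The text string payload is everything after the third byte.
payload = data[3:3 + text_length]

# A CBOR text string must be valid UTF-8, so try to decode it.
try:
    payload.decode("utf-8")
    valid = True
    bad_offset = None
except UnicodeDecodeError as exc:
    valid = False
    bad_offset = exc.start  # index of the first offending byte

print(valid, bad_offset, hex(payload[bad_offset]) if bad_offset is not None else None)
```

Under these assumptions the payload spans all eight remaining bytes (4b 43 01 20 03 ac 02 08), and the decode fails at the 0xac byte rather than at the control characters.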

You don't need any special settings enabled, just load up cbor.me, copy the sequence into the right hand

chrysn commented 2 years ago

I can't move the issue around, but it belongs to cbor-diag. I've copied it over to https://github.com/cabo/cbor-diag/issues/21 where it is tracked better.

cabo commented 2 years ago

The fix for erroring out so unceremoniously can be discussed over there, but I note that there is a bare 0xac in the input, which indeed is not valid UTF-8 unless following certain other bytes. See below what happens when I fix this to 0x4c ("L").

$ echo 82 04 68 4b 43 01 20 03 ac 02 08 | pretty2diag.rb
/Volumes/nar/Users/cabo-rescue/lib/ruby/gems/3.1.0/gems/cbor-diag-0.7.6/lib/cbor-diagnostic.rb:73:in `to_json': source sequence is illegal/malformed utf-8 (JSON::GeneratorError)
$ echo 82 04 68 4b 43 01 20 03 4c 02 08 | pretty2diag.rb
[4, "KC\u0001 \u0003L\u0002\b"]
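The point about the bare 0xac can be illustrated with a small Python sketch. It uses Python's standard `bytes.decode` and `json` modules as stand-ins, not cbor-diag itself, so it only shows the UTF-8 rule involved: 0xac has the bit pattern 10xxxxxx, which marks a continuation byte, so it is only valid after a suitable lead byte (for example 0xc2 0xac, which encodes U+00AC).

```python
import json

# Text string payloads from the two commands above (the 8 bytes after 0x68).
broken = bytes.fromhex("4b 43 01 20 03 ac 02 08")
fixed = bytes.fromhex("4b 43 01 20 03 4c 02 08")

# A UTF-8 continuation byte has its top two bits set to 10.
is_continuation = (0xAC & 0xC0) == 0x80

# After a matching lead byte the same 0xac is fine: c2 ac is U+00AC.
with_lead = bytes([0xC2, 0xAC]).decode("utf-8")

# On its own it cannot start a character, so decoding fails.
try:
    broken.decode("utf-8")
    broken_ok = True
except UnicodeDecodeError:
    broken_ok = False

# With 0x4c ("L") in its place the string is plain ASCII, and a JSON-style
# rendering escapes the control characters.
rendered = json.dumps([4, fixed.decode("utf-8")])
print(rendered)
```

Here the JSON rendering of the fixed input happens to match the diagnostic output shown above, because both escape the control characters the same way for this particular string.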