sile / jsone

Erlang JSON library
MIT License

jsone fails to encode after decoding JSON #37

Closed d1m4gh closed 5 years ago

d1m4gh commented 6 years ago

B = jsone:decode(<<" {\"id\":\"¾H^Zžy^GID[zӈ ^VC£“æ\"} ">>),
jsone:encode(B),

{badarg,
 [{jsone_encode,escape_string,
   [<<190,72,94,90,158,121,94,71,73,68,91,122,211,136,32,32,
      94,86,67,163,147,230>>,
    [{object_members,[]}],
    <<"{\"id\":\"">>,
    {encode_opt_v2,false,false,[{scientific,20}],{iso8601,0},string,0,0,false}],
   [{line,262}]},

pichi commented 6 years ago

¾, i.e. byte 190 (0xBE), is not valid UTF-8 here: 0xBE is a continuation byte, so it cannot start a character.
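For anyone who wants to check an input before passing it to jsone, here is a minimal sketch of a UTF-8 validity test using Erlang's `utf8` binary segment type. The module name `utf8_check` and the function `is_valid_utf8/1` are made up for illustration and are not part of jsone.

```erlang
-module(utf8_check).
-export([is_valid_utf8/1]).

%% Hypothetical helper (not part of jsone).
%% Returns true only if Bin consists entirely of well-formed UTF-8 sequences.
-spec is_valid_utf8(binary()) -> boolean().
is_valid_utf8(<<>>) ->
    true;
is_valid_utf8(<<_/utf8, Rest/binary>>) ->
    %% The utf8 segment matches exactly one well-formed code point.
    is_valid_utf8(Rest);
is_valid_utf8(_) ->
    %% The next byte(s) do not form a valid UTF-8 sequence.
    false.
```

With this helper, `utf8_check:is_valid_utf8(<<190,72,94>>)` returns `false` because the binary starts with the stray continuation byte 190, while `utf8_check:is_valid_utf8(<<211,136>>)` returns `true` (the two bytes encode ӈ).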

sile commented 5 years ago

The latest JSON RFC (RFC 8259) says:

JSON text exchanged between systems that are not part of a closed ecosystem MUST be encoded using UTF-8 [RFC3629]. Previous specifications of JSON have not required the use of UTF-8 when transmitting JSON text. However, the vast majority of JSON-based software implementations have chosen to use the UTF-8 encoding, to the extent that it is the only encoding that achieves interoperability.

So jsone only supports UTF-8 binaries when encoding. Ideally, decoding should also check that the input binary is valid UTF-8. However, because that check adds overhead, we are not going to do so for now.
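Until such a check exists in the decoder, a caller can do it on their side. The sketch below is an assumption-laden illustration, not jsone's implementation: the module `checked_decode` and its `decode/2` are hypothetical names, and the decoder is passed in as a fun so that the example does not depend on jsone being on the code path. It relies on `unicode:characters_to_binary/3` returning the input unchanged for valid UTF-8 and an `error` or `incomplete` tuple otherwise.

```erlang
-module(checked_decode).
-export([decode/2]).

%% Hypothetical caller-side wrapper (not part of jsone).
%% Validates Bin as UTF-8 first, then hands it to DecodeFun.
-spec decode(binary(), fun((binary()) -> term())) ->
          {ok, term()} | {error, invalid_utf8}.
decode(Bin, DecodeFun) when is_binary(Bin), is_function(DecodeFun, 1) ->
    case unicode:characters_to_binary(Bin, utf8, utf8) of
        Bin ->
            %% Input is well-formed UTF-8, so decoding is safe to attempt.
            {ok, DecodeFun(Bin)};
        _ErrorOrIncomplete ->
            %% {error, _, _} or {incomplete, _, _}: reject before decoding.
            {error, invalid_utf8}
    end.
```

With jsone available, this could be called as `checked_decode:decode(Input, fun jsone:decode/1)`. The input from this issue would then be rejected with `{error, invalid_utf8}` before decoding, instead of failing later in `jsone:encode/1`. The extra pass over the input is the additional cost mentioned above.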

benoitc commented 5 years ago

Additional cost shouldn't be a reason to skip it. Why not add an option for it?

sile commented 5 years ago

@benoitc You're right. I created https://github.com/sile/jsone/pull/44 to add the option.