sile / jsone

Erlang JSON library
MIT License

Encoding crashes #46

Closed: pankajsoni19 closed this issue 5 years ago

pankajsoni19 commented 5 years ago

Command executed (using the latest jsone master branch):

 jsone:encode( [{severity,<<"debug">>},
             {datetime,<<"2019-07-25 16:26:19.581">>},
             {message,<<"p1_mysql_auth send packet 3: <<\"5ç\">>">>},

Error log:

** exception error: bad argument
     in function  jsone_encode:escape_string/4
        called as jsone_encode:escape_string(<<"ç\">>">>,
                                             <<"{\"severity\":\"debug\",\"datetime\":\"2019-07-25 16:26:19.581\",\"timestamp\":1564052179581,\"message\":\"p1_mysql_a"...>>,
     in call from jsone:encode/2 (src/jsone.erl, line 360)

sile commented 5 years ago

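The error occurs because the term contains a binary that is not valid UTF-8. Without the /utf8 type specifier, the ç in the literal is stored as the single byte 231 rather than as the UTF-8 sequence 195,167 (see below):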

% Error
> io:format("~w\n", [<<"5ç">>]).
<<53,231>>  % non UTF-8
> jsone:encode(<<"5ç">>).
** exception error: bad argument
     in function  jsone_encode:escape_string/4
        called as jsone_encode:escape_string(<<"ç">>,[],<<"\"5">>,
     in call from jsone:encode/2 (/home/ohta/dev/erlang/jsone/src/jsone.erl, line 360)

% OK
> io:format("~w\n", [<<"5ç"/utf8>>]).
<<53,195,167>>  % UTF-8

> jsone:encode(<<"5ç"/utf8>>).
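
The difference between the two binaries can also be checked programmatically before calling jsone. The following is a minimal sketch, not part of jsone: the module name `utf8_check` and the function `is_valid_utf8/1` are hypothetical, and the check relies only on the standard `unicode:characters_to_binary/3`, which returns a binary on success and an `{error, ...}` or `{incomplete, ...}` tuple when the input is not well-formed.

```erlang
-module(utf8_check).
-export([is_valid_utf8/1]).

%% Hypothetical helper (not part of jsone).
%% Returns true when Bin is a well-formed UTF-8 byte sequence.
%% Converting from utf8 to utf8 forces the unicode module to validate
%% the input; any failure is reported as a tuple instead of a binary.
is_valid_utf8(Bin) when is_binary(Bin) ->
    is_binary(unicode:characters_to_binary(Bin, utf8, utf8)).
```

With this sketch, `utf8_check:is_valid_utf8(<<53,231>>)` should return `false` and `utf8_check:is_valid_utf8(<<53,195,167>>)` should return `true`.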

pankajsoni19 commented 5 years ago

Yes, I know that part. Can the library handle this internally, or will I need to validate the input myself?

sile commented 5 years ago

Because RFC 8259 (JSON) says "JSON text exchanged between systems that are not part of a closed ecosystem MUST be encoded using UTF-8", I think that it's better to handle this case outside of jsone.

For example, you can convert the above string to a UTF-8 binary as follows:

> unicode:characters_to_binary(binary_to_list(<<"5ç">>)). 
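
If the input may be a mix of valid UTF-8 and legacy single-byte text, the conversion could be wrapped in a small helper that runs before encoding. This is only a sketch under stated assumptions: the module `utf8_fix` and the function `ensure_utf8/1` are hypothetical names, and any binary that is not valid UTF-8 is assumed to be Latin-1 (ISO 8859-1), which may not hold for your data.

```erlang
-module(utf8_fix).
-export([ensure_utf8/1]).

%% Hypothetical helper (not part of jsone).
%% Returns Bin unchanged when it is already well-formed UTF-8.
%% Otherwise ASSUMES the bytes are Latin-1 and re-encodes them as UTF-8.
ensure_utf8(Bin) when is_binary(Bin) ->
    case unicode:characters_to_binary(Bin, utf8, utf8) of
        Utf8 when is_binary(Utf8) ->
            Utf8;
        _ErrorOrIncomplete ->
            %% Every byte value is a valid Latin-1 character,
            %% so this conversion always yields a binary.
            unicode:characters_to_binary(Bin, latin1, utf8)
    end.
```

A call such as `jsone:encode(utf8_fix:ensure_utf8(<<"5ç">>))` would then receive the bytes `<<53,195,167>>`. Note that a Latin-1 byte sequence that happens to be valid UTF-8 is left untouched, so this heuristic cannot replace knowing the real encoding of the source data.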