sile / jsone

Erlang JSON library
MIT License
291 stars, 71 forks

Encoding crashes #46

Closed pankajsoni19 closed 5 years ago

pankajsoni19 commented 5 years ago

Command executed (using the latest jsone master branch):

 jsone:encode( [{severity,<<"debug">>},
             {datetime,<<"2019-07-25 16:26:19.581">>},
             {timestamp,1564052179581},
             {message,<<"p1_mysql_auth send packet 3: <<\"5ç\">>">>},
             {node,<<"ejabberd@pankaj-macbook">>},
             {pid,<<"<0.1398.0>">>},
             {file,<<"src/ejabberd_sql.erl">>},
             {module,<<"ejabberd_sql">>},
             {function,<<"log">>},
             {line,675}]).

Error log:

** exception error: bad argument
     in function  jsone_encode:escape_string/4
        called as jsone_encode:escape_string(<<"ç\">>">>,
                                             [{object_members,[{node,<<"ejabberd@pankaj-macbook">>},
                                                               {pid,<<"<0.1398.0>">>},
                                                               {file,<<"src/ejabberd_sql.erl">>},
                                                               {module,<<"ejabberd_sql">>},
                                                               {function,<<"log">>},
                                                               {line,675}]}],
                                             <<"{\"severity\":\"debug\",\"datetime\":\"2019-07-25 16:26:19.581\",\"timestamp\":1564052179581,\"message\":\"p1_mysql_a"...>>,
                                             {encode_opt_v2,false,false,false,
                                                            [{scientific,20}],
                                                            {iso8601,0},
                                                            string,0,0,false})
     in call from jsone:encode/2 (src/jsone.erl, line 360)

sile commented 5 years ago

Hi

The error occurs because the term contains a binary that is not valid UTF-8. Without the /utf8 type specifier, the literal <<"5ç">> stores ç as the single byte 231 (see below):

% Error
> io:format("~w\n", [<<"5ç">>]).
<<53,231>>  % non UTF-8
ok
> jsone:encode(<<"5ç">>).
** exception error: bad argument
     in function  jsone_encode:escape_string/4
        called as jsone_encode:escape_string(<<"ç">>,[],<<"\"5">>,
                                             {encode_opt_v2,false,false,false,
                                                            [{scientific,20}],
                                                            {iso8601,0},
                                                            string,0,0,false})
     in call from jsone:encode/2 (/home/ohta/dev/erlang/jsone/src/jsone.erl, line 360)

% OK
> io:format("~w\n", [<<"5ç"/utf8>>]).
<<53,195,167>>  % UTF-8
ok

> jsone:encode(<<"5ç"/utf8>>).
<<"\"5\\u00e7\"">>

pankajsoni19 commented 5 years ago

Yes, I know that part. Can the lib handle this internally, or will I need to validate the input myself?

sile commented 5 years ago

Because RFC 8259 (JSON) says "JSON text exchanged between systems that are not part of a closed ecosystem MUST be encoded using UTF-8", I think that it's better to handle this case outside of jsone.

For example, you can convert the above string to a UTF-8 binary as follows:

> unicode:characters_to_binary(binary_to_list(<<"5ç">>)). 
<<"5ç"/utf8>>
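
If you would rather guard the input before calling jsone:encode/1, something like the following might work. This is only a sketch: the module name utf8_guard and the function ensure_utf8/1 are hypothetical (they are not part of jsone), and it assumes that any binary that is not valid UTF-8 contains Latin-1 bytes, which may not hold for your data.

```erlang
-module(utf8_guard).
-export([ensure_utf8/1]).

%% Hypothetical helper, not part of jsone.
%% Returns Bin unchanged when it is already valid UTF-8.
%% Otherwise ASSUMES the bytes are Latin-1 and transcodes them to UTF-8.
ensure_utf8(Bin) when is_binary(Bin) ->
    case unicode:characters_to_binary(Bin, utf8, utf8) of
        Valid when is_binary(Valid) ->
            %% Input decoded cleanly as UTF-8.
            Valid;
        _ErrorOrIncomplete ->
            %% {error, _, _} or {incomplete, _, _}: fall back to Latin-1.
            unicode:characters_to_binary(Bin, latin1, utf8)
    end.
```

You would then call jsone:encode(utf8_guard:ensure_utf8(Value)) for each binary value. Binaries that are already UTF-8 pass through untouched, so the helper is safe to apply more than once.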