isaiah / transit-erlang

transit format for erlang
MIT License

String encoding fails for <<239,191,190>> #11

Closed: jlouis closed this issue 10 years ago

jlouis commented 10 years ago

The string <<239, 191, 190>> is a counterexample to the round-trip isomorphism, i.e. reading back what was written does not return the original:

18> A = <<"￾">>.
<<239,191,190>>
19> transit:write(A, [{format, json}]).
<<91,34,126,35,39,34,44,34,239,191,189,34,93>>
20> transit:read(v(19), [{format, json}]).
<<239,191,189>>

isaiah commented 10 years ago

6> jsx:decode(jsx:encode(<<239,191,190>>)).
<<239,191,189>>

Looks like it's jsx; should we file a bug there?

isaiah commented 10 years ago

This duplicates #5

jlouis commented 10 years ago

Oh, hehe, we definitely want to understand what is going on here! It seems odd that it flips a bit or two between the input and the output.

jlouis commented 10 years ago

Also, I say we keep this one rather than #5 as it is a simpler case.

jlouis commented 10 years ago

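Actually, this is probably okay. <<239,191,190>> is U+FFFE, a noncharacter (like the range U+FDD0..U+FDEF): a valid code point which is never supposed to appear in interchanged text, but might be used internally. The <<239,191,189>> that comes back is U+FFFD, the replacement character. So it is just a question of fixing the utf8 generator and we should be fine.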
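
To see which code points the two binaries in this thread actually are, they can be decoded with the bit syntax. This is only a minimal sketch: the module name cp_inspect and the function code_points/1 are made up for illustration, and it assumes a recent OTP release (older releases reportedly rejected U+FFFE in utf8 segments).

```erlang
-module(cp_inspect).
-export([code_points/1]).

%% Decode a UTF-8 binary into a list of Unicode code points.
%% Malformed input makes the second clause fail to match, so the call
%% crashes with function_clause; that is acceptable for a sketch.
code_points(<<>>) ->
    [];
code_points(<<CP/utf8, Rest/binary>>) ->
    [CP | code_points(Rest)].
```

With this, cp_inspect:code_points(<<239,191,190>>) gives [16#FFFE] and cp_inspect:code_points(<<239,191,189>>) gives [16#FFFD], so the input is a noncharacter and the output is the replacement character.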

jlouis commented 10 years ago

The generator now avoids these characters.
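
The actual generator change is not shown in this thread. As a rough illustration of the idea only, a generator can reject surrogates and noncharacters before encoding. Everything below (the module utf8_gen and its functions) is hypothetical, uses plain rejection sampling with the stdlib rand module instead of a property-testing library, and is not the project's code.

```erlang
-module(utf8_gen).
-export([is_noncharacter/1, random_code_point/0, random_utf8/1]).

%% The 66 Unicode noncharacters: U+FDD0..U+FDEF plus the last two
%% code points of every plane (U+FFFE, U+FFFF, U+1FFFE, ...).
is_noncharacter(CP) when CP >= 16#FDD0, CP =< 16#FDEF -> true;
is_noncharacter(CP) when CP band 16#FFFE =:= 16#FFFE -> true;
is_noncharacter(_) -> false.

%% Surrogates cannot be encoded as UTF-8 at all.
is_surrogate(CP) -> CP >= 16#D800 andalso CP =< 16#DFFF.

%% Rejection sampling: draw from 0..16#10FFFF and retry on code points
%% that should not appear in interchanged text.
random_code_point() ->
    CP = rand:uniform(16#110000) - 1,
    case is_surrogate(CP) orelse is_noncharacter(CP) of
        true -> random_code_point();
        false -> CP
    end.

%% A UTF-8 binary of N random code points, none of them noncharacters.
random_utf8(N) when is_integer(N), N >= 0 ->
    << <<(random_code_point())/utf8>> || _ <- lists:seq(1, N) >>.
```

Binaries produced this way never contain U+FFFE, so a JSON library that maps noncharacters to U+FFFD no longer breaks the round-trip property on generated input.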