michalmuskala / jason

A blazing fast JSON parser and generator in pure Elixir.

Support for escape: :binary_safe #174

Closed (josevalim closed this 9 months ago)

josevalim commented 9 months ago

Would you support an escaping mode that escapes anything that is not valid Unicode as \x?

Our use case is listed here: https://github.com/livebook-dev/livebook/issues/2158 - the idea is that we need to send any user result or error messages to the client, and if the user gives or fetches malformed data, encoding it incorrectly and sending some feedback is better than crashing.

michalmuskala commented 9 months ago

I'm not sure I understand how this would be different from the escape: :unicode option. Could you elaborate a bit more? I'm not sure escaping as \x would be useful, since it would produce invalid JSON that can't be decoded by anything, but perhaps a Unicode replacement character could be used in such cases.

josevalim commented 9 months ago

I wanted to escape \xBD, which currently fails:

iex(2)> Jason.encode! "\xBD", escape: :unicode
** (Jason.EncodeError) invalid byte 0xBD in <<189>>
    (jason 1.4.1) lib/jason.ex:164: Jason.encode!/2
    iex:2: (file)

But if you tell me this is invalid JSON, then please go ahead and close this, as it wouldn't make sense :)

michalmuskala commented 9 months ago

Yes, JSON only accepts UTF-8, and its only numeric escapes are of the form \uXXXX, denoting Unicode code points, so there's generally no compliant way of representing something like \xBD. One option would be to encode such bytes as the Unicode replacement character (U+FFFD), so we still produce something, though the conversion would be lossy.
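
For illustration only, here is a minimal sketch of the replacement-character idea done before the data reaches the encoder. The `BinarySafe` module and its `scrub/1` function are hypothetical names, not part of Jason, and the sketch assumes that swapping each undecodable byte for one U+FFFD is acceptable:

```elixir
defmodule BinarySafe do
  # Hypothetical helper, not part of Jason: turns any binary into valid
  # UTF-8 by replacing each byte that cannot be decoded with U+FFFD.
  @replacement "\uFFFD"

  def scrub(binary) when is_binary(binary), do: do_scrub(binary, "")

  # End of input: return what has been accumulated.
  defp do_scrub(<<>>, acc), do: acc

  # The next bytes decode as a UTF-8 code point: keep it unchanged.
  defp do_scrub(<<char::utf8, rest::binary>>, acc),
    do: do_scrub(rest, <<acc::binary, char::utf8>>)

  # Otherwise drop a single byte and emit the replacement character.
  # This is lossy: the original byte cannot be recovered afterwards.
  defp do_scrub(<<_invalid, rest::binary>>, acc),
    do: do_scrub(rest, acc <> @replacement)
end

BinarySafe.scrub("\xBD")
# => "\uFFFD"
BinarySafe.scrub("ok \xBD ok")
# => "ok \uFFFD ok"
```

The scrubbed string is valid UTF-8, so it should then pass through `Jason.encode!/1` without raising. Recent Elixir versions (1.16 and later, as far as I know) also ship `String.replace_invalid/2`, which may make a hand-rolled helper like this unnecessary.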

josevalim commented 9 months ago

Well, the strings are indeed invalid, so it is either showing gibberish or a replacement char. We might as well go for a replacement char, so we can close this as a duplicate of #12. :) Thank you!