Closed: at88mph closed this issue 2 years ago
UTF-8 is handled correctly by Jason, as JSON is defined by standards to basically work only on UTF-8 text.
Your description, however, makes it sound like the returned encoding is latin1 (judging from the 0xE9 byte; UTF-8 does not contain standalone bytes like that), which is not handled by this library (and, I'd assume, by many other libraries).
In particular, this works just fine, when input is indeed UTF-8:
iex(1)> Jason.decode!(~S|{"name": "abcdé"}|)
%{"name" => "abcdé"}
You can always try transcoding the string with :unicode.characters_to_binary(data, :latin1, :utf8), if you know the data is indeed in latin1.
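To make the transcoding step concrete, here is a minimal sketch using only the standard library. The module name Latin1Example and the helper to_utf8/1 are made up for illustration and are not part of Jason or Phoenix. It also assumes that any body which is not valid UTF-8 is latin1, which only holds if you know the upstream service really sends latin1.

```elixir
defmodule Latin1Example do
  # Hypothetical helper: returns the binary unchanged when it is already
  # valid UTF-8, otherwise transcodes it on the ASSUMPTION that it is latin1.
  def to_utf8(data) when is_binary(data) do
    if String.valid?(data) do
      data
    else
      :unicode.characters_to_binary(data, :latin1, :utf8)
    end
  end
end

# "abcdé" as latin1 bytes: é is the single byte 0xE9.
latin1 = <<"abcd", 0xE9>>

# After transcoding, é becomes the two-byte UTF-8 sequence 0xC3 0xA9.
utf8 = Latin1Example.to_utf8(latin1)
```

The transcoded binary is ordinary UTF-8, so it can then be handed to Jason.decode!/1 as in the iex session above. The String.valid?/1 check is only a heuristic: it avoids double-encoding input that is already UTF-8, but it cannot tell which single-byte encoding a non-UTF-8 body actually uses.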
That was exactly what I needed. Many thanks.
On Sep 30, 2021, at 8:02 AM, Michał Muskała wrote the reply above.
I have a Phoenix 1.6 app that is proxying JSON requests from a different domain. Some of the JSON coming in contains extended ASCII (French) characters:
And my controller is simply pulling it in and sending it back out again:
My Phoenix app is using Jason by default, but the JSON cannot be encoded on the way out:
Can Jason be told to re-encode extended characters?