Closed by dnikolovv 4 years ago
I have a similar problem, but with Japanese characters.
Using API Gateway the behaviour can be classified as follows:
Marking the body type X, the handler's response type is: IO (Either (ApiGatewayResponse Text) (ApiGatewayResponse X))
When X is aeson's Value, and I put String "ネ" (or any JSON object containing Japanese text), a different character is returned from lambda: ã
However, when X is String or Text and I simply put "ネ" (OverloadedStrings is used), the correct character is returned from lambda: ネ
The problem seems to be with the utility function toJSONText:
λ> import RIO.Text
λ> import Data.Text.IO
λ> writeFile "/tmp/wrong" $ toJSONText $ A.String "ネ"
λ> writeFile "/tmp/correct" "ネ"
$ cat /tmp/wrong
"ã"
$ cat /tmp/correct
ネ
The definition of toJSONText:
toJSONText :: ToJSON a => a -> Text
toJSONText = T.pack . LazyByteString.unpack . encode
I think there is a bug here: Aeson's encode produces a UTF-8 encoded ByteString, but the unpack then treats every byte as a separate character, as if the text were ASCII. Any character outside the ASCII range is encoded as several bytes, so it comes out as different characters than the original.
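To illustrate the diagnosis, here is a minimal sketch that needs only the bytestring and text packages. It assumes the unpack in question is the one from Data.ByteString.Lazy.Char8; the names utf8Bytes, brokenDecode and correctDecode are made up for this example and are not part of the library.

```haskell
import qualified Data.ByteString.Lazy as LBS
import qualified Data.ByteString.Lazy.Char8 as LBS8
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- The UTF-8 encoding of "ネ" (U+30CD) is the three bytes E3 83 8D.
utf8Bytes :: LBS.ByteString
utf8Bytes = LBS.fromStrict (TE.encodeUtf8 (T.pack "ネ"))

-- Hypothetical name: mirrors the buggy pipeline. Char8's unpack turns every
-- byte into its own Char (code points 0-255), so one character becomes three.
brokenDecode :: LBS.ByteString -> T.Text
brokenDecode = T.pack . LBS8.unpack

-- Hypothetical name: decodes the bytes as UTF-8, recovering the original text.
correctDecode :: LBS.ByteString -> T.Text
correctDecode = TE.decodeUtf8 . LBS.toStrict
```

Loaded in GHCi, T.unpack (brokenDecode utf8Bytes) evaluates to "\227\131\141", i.e. ã followed by two invisible control characters, while T.length (correctDecode utf8Bytes) is 1.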
It is fixed by replacing the unpack with a decodeUtf8 call, removing the T.pack and changing the ByteString's strictness.
toJSONText :: ToJSON a => a -> Text
toJSONText = decodeUtf8 . LBS.toStrict . A.encode
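For anyone who wants to try the fix outside the library, a self-contained sketch with explicit imports could look like the following. The module qualifiers are my guesses at what the snippet above refers to, and toJSONTextLazy is a hypothetical variant (not in the library) that decodes lazily and only then converts to strict Text.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Aeson as A
import qualified Data.ByteString.Lazy as LBS
import Data.Text (Text)
import qualified Data.Text.Encoding as TE
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Encoding as TLE

-- The fix from this thread: decode aeson's UTF-8 output as UTF-8.
-- decodeUtf8 throws on invalid input, but encode is expected to emit valid UTF-8.
toJSONText :: A.ToJSON a => a -> Text
toJSONText = TE.decodeUtf8 . LBS.toStrict . A.encode

-- Hypothetical alternative: decode the lazy ByteString directly,
-- then convert the lazy Text to a strict one.
toJSONTextLazy :: A.ToJSON a => a -> Text
toJSONTextLazy = TL.toStrict . TLE.decodeUtf8 . A.encode
```

Both versions should give the same result, e.g. toJSONText (A.String "ネ") is the text "ネ" wrapped in JSON double quotes.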
That's great @talw! Could you submit a PR?
@dnikolovv
Absolutely. Submitted.
Using this commit fixes the problem for me.
If you use UseWithApiGateway and return something that contains Swedish characters (latin1 encoded), they will be turned into gibberish. E.g. Tack för din beställning instead of Tack för din beställning.