theam / aws-lambda-haskell-runtime

⚡Haskell runtime for AWS Lambda
https://theam.github.io/aws-lambda-haskell-runtime/

Swedish characters in response turn into gibberish when using API gateway #79

Closed dnikolovv closed 4 years ago

dnikolovv commented 4 years ago

If you use UseWithApiGateway and return a response that contains Swedish characters (Latin-1 encoded), they are turned into gibberish.

E.g.

Tack fÃ¶r din bestÃ¤llning instead of Tack för din beställning.

talw commented 4 years ago

I have a similar problem, but with Japanese characters.

Using API Gateway the behaviour can be classified as follows:

Marking the body type X, the handler's response type is: IO (Either (ApiGatewayResponse Text) (ApiGatewayResponse X))

When X is aeson's Value and I put String "ネ" (or any JSON object containing Japanese text), a different character is returned from Lambda: ã

However, when X is String or Text and I simply put "ネ" (OverloadedStrings is used), the correct character is returned from Lambda: ネ

talw commented 4 years ago

The problem seems to be with the utility function toJSONText:

λ> import RIO.Text
λ> import Data.Text.IO
λ> writeFile "/tmp/wrong" $ toJSONText $ A.String "ネ"
λ> writeFile "/tmp/correct" "ネ"
$ cat /tmp/wrong
"ã"
$ cat /tmp/correct
ネ

The definition of toJSONText:

toJSONText :: ToJSON a => a -> Text
toJSONText = T.pack . LazyByteString.unpack . encode

I think there is a bug here: Aeson's encode produces a UTF-8 encoded ByteString, but unpack then turns every byte into its own character, as if the data were single-byte (ASCII/Latin-1) text. Any character outside the ASCII range is encoded as several bytes, so it comes out as several unrelated characters instead of the original one.
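To make the byte-versus-character mismatch concrete, here is a minimal standalone sketch. It assumes the unpack in the quoted definition is the one from Data.ByteString.Lazy.Char8 and that the aeson, bytestring and text packages are available; the names bytewise and utf8Decoded are made up for illustration and are not part of the library.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Aeson as A
import qualified Data.ByteString.Lazy as LBS
import qualified Data.ByteString.Lazy.Char8 as LBS8
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical name: one Char per byte, mirroring the quoted definition
-- (assumption: its unpack is the Char8 variant).
bytewise :: A.Value -> T.Text
bytewise = T.pack . LBS8.unpack . A.encode

-- Hypothetical name: decode the UTF-8 bytes that encode produces.
utf8Decoded :: A.Value -> T.Text
utf8Decoded = TE.decodeUtf8 . LBS.toStrict . A.encode

main :: IO ()
main = do
  let v = A.String "ö"
  -- Raw bytes of the JSON string: quote, 0xC3, 0xB6, quote.
  print (LBS.unpack (A.encode v))   -- [34,195,182,34]
  -- Byte-wise unpacking yields four characters: the two UTF-8 bytes
  -- of "ö" become two separate characters.
  print (T.length (bytewise v))     -- 4
  -- Proper decoding yields three characters: quote, ö, quote.
  print (T.length (utf8Decoded v))  -- 3
```

The lengths are printed rather than the text itself so that the result does not depend on the terminal's locale.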

talw commented 4 years ago

It is fixed by replacing the unpack with a decodeUtf8 call, removing the T.pack and changing the ByteString's strictness.

toJSONText :: ToJSON a => a -> Text
toJSONText = decodeUtf8 . LBS.toStrict . A.encode
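As a quick sanity check of the fixed definition, the sketch below round-trips text through it. It assumes decodeUtf8 comes from Data.Text.Encoding (the thread does not show the actual imports) and that the aeson, bytestring and text packages are available; roundTrip is a hypothetical helper used only for this check.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Aeson as A
import qualified Data.ByteString.Lazy as LBS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Same shape as the fixed definition above.
-- Assumption: decodeUtf8 is the one from Data.Text.Encoding.
toJSONText :: A.ToJSON a => a -> T.Text
toJSONText = TE.decodeUtf8 . LBS.toStrict . A.encode

-- Hypothetical helper: render a Text as JSON text, then parse the
-- UTF-8 bytes of that rendering back into a Text.
roundTrip :: T.Text -> Maybe T.Text
roundTrip = A.decodeStrict . TE.encodeUtf8 . toJSONText

main :: IO ()
main = do
  -- Both checks print True if non-ASCII text survives unchanged.
  print (roundTrip "Tack för din beställning" == Just "Tack för din beställning")
  print (roundTrip "ネ" == Just "ネ")
```

Note that Data.Text.Encoding.decodeUtf8 throws on invalid UTF-8. That should not matter here because encode always emits valid UTF-8, but a total variant such as decodeUtf8' could be used if that guarantee is ever in doubt.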

dnikolovv commented 4 years ago

That's great @talw! Could you submit a PR?

talw commented 4 years ago

@dnikolovv

That's great @talw! Could you submit a PR?

Absolutely. Submitted.

Using this commit fixes the problem for me.