turquoiseowl / i18n

Smart internationalization for ASP.NET

Texts with ampersands cannot be localized #355

Closed zingix closed 5 years ago

zingix commented 6 years ago

I have the following text localized in my PO file.

Qualität &amp; Quantität

The text is requested by the web page using HTTP GET and returned by the web server as JSON. When I dig into the code, I can see that the lookup in the concurrent dictionary is done with the following string:

Qualität \u0026amp; Quantität

This is also true for other special HTML characters like &copy;.

turquoiseowl commented 6 years ago

I've just done a quick test here adding your message string to an HTML file and all works as expected.

Did you say the source file is a .json file? If so, I'm not clear what the correct behaviour should be, or how it differs from what you are experiencing (?), given that &amp; is an HTML thing.

zingix commented 6 years ago

I think the error comes from a combination of the JSON response and the localization. The response header looks like this:

Cache-Control:private
Content-Length:1376
Content-Type:application/json; charset=utf-8

Whereas the response itself is the following:

[
    {
        "Text": "Qualität \u0026amp; Quantität",
        "Id": 1000919
    },
    {
        "Text": "Self-organisation",
        "Id": 1000914
    }
]

I think the ASP.NET JSON encoding escapes the & in &amp; as \u0026, which yields \u0026amp;, and because the request handler kicks in after the encoding, the value cannot be looked up.
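
To illustrate the suspected mismatch, here is a minimal Python sketch. It is not i18n's or ASP.NET's actual code: the `translations` dictionary, the translated value, and `escape_like_encoder` are hypothetical stand-ins for the PO lookup table and the JSON encoder, and it assumes the PO entry contains the HTML entity &amp;, as the \u0026amp; in the response suggests.

```python
import json

# Hypothetical stand-in for the dictionary built from the PO file.
translations = {"Qualität &amp; Quantität": "Quality &amp; Quantity"}


def escape_like_encoder(text: str) -> str:
    # Assumption: the encoder writes '&' as the six-character JSON escape \u0026.
    body = json.dumps(text, ensure_ascii=False)[1:-1]  # strip surrounding quotes
    return body.replace("&", "\\u0026")


msgid = "Qualität &amp; Quantität"
raw = escape_like_encoder(msgid)  # what a handler running after encoding sees

print(raw)                    # Qualität \u0026amp; Quantität
print(raw in translations)    # False: the escaped text is not a dictionary key
print(msgid in translations)  # True: the unescaped text is
```

The raw response text and the PO key describe the same string, but they differ character by character, so a plain dictionary lookup on the raw text misses.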

I tried changing the value in my source file to \u0026amp; but this does not update the dictionary accordingly. This seems to be a common issue; Go's encoding/json package, for example, handles it by always converting such characters to Unicode-escaped literals (https://golang.org/pkg/encoding/json/).

An example of converting the Unicode escape back to the original character before looking it up in the localization dictionary is given in the following Stack Overflow answer: https://stackoverflow.com/questions/1615559/convert-a-unicode-string-to-an-escaped-ascii-string#1615860
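
The idea of converting escapes back before the lookup could look roughly like this in Python. This is only a sketch under assumptions, not the code from the linked answer or from i18n: `unescape_json_unicode`, `lookup`, and the sample dictionary are hypothetical names and data, and surrogate pairs and escaped backslashes (such as \\u0026) are deliberately not handled.

```python
import re

# Matches a literal backslash, 'u', and exactly four hex digits, e.g. \u0026.
_UNICODE_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def unescape_json_unicode(text: str) -> str:
    # Replace each \uXXXX escape with the character it encodes.
    return _UNICODE_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def lookup(translations: dict, raw: str) -> str:
    # Normalize the raw response text first; fall back to the input if no entry exists.
    key = unescape_json_unicode(raw)
    return translations.get(key, raw)


# Hypothetical dictionary entry, assuming the PO file stores the HTML entity.
translations = {"Qualität &amp; Quantität": "Quality &amp; Quantity"}

print(lookup(translations, "Qualität \\u0026amp; Quantität"))  # Quality &amp; Quantity
print(lookup(translations, "Self-organisation"))               # Self-organisation
```

A real fix would also have to decide whether to re-escape the translated text before writing it back into the JSON response.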

turquoiseowl commented 6 years ago

The following comes up as valid JSON using jsonlint.com:

{
    "Text": "Qualität & Quantität"
}

Not wanting to pass the buck, but it looks to me like the problem is with ASP.NET's handling of JSON. How is i18n supposed to know you don't actually want \u0026amp; in the response?

zingix commented 6 years ago

You are right, i18n has no way of knowing what you want. But since the JSON standard allows an ampersand to be represented as \u0026, that form should be handled correctly when looking up values in the dictionary.

I created a pull request that fixes the issue. What do you think? #356

turquoiseowl commented 5 years ago

Closing for now. Please re-open if appropriate. Thanks.