t2ym / i18n-element

I18N Base Element for Lit and Polymer

[lit-html] HTML entities in a part are shown as they are without decoding #64

Open t2ym opened 5 years ago

t2ym commented 5 years ago

Principles

- No HTML entities in JSON

- XML entities (subset of HTML entities) in XLIFF
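
As a minimal sketch of the first principle, entities can be decoded to plain characters before a string is stored in JSON. The helper below is hypothetical (`decodeEntities` and `NAMED_ENTITIES` are illustrative names, not part of i18n-element or lit-html) and only knows the named entities that appear in this issue.

```javascript
// Hypothetical helper; not part of i18n-element or lit-html.
// Only the named entities that appear in this issue are listed.
const NAMED_ENTITIES = {
  lt: '<', gt: '>', amp: '&', quot: '"', apos: "'",
  nbsp: '\u00A0', cent: '\u00A2', pound: '\u00A3', yen: '\u00A5',
  euro: '\u20AC', copy: '\u00A9', reg: '\u00AE',
};

function decodeEntities(text) {
  // A single pass: '&amp;lt;' becomes '&lt;' and is not decoded a second time.
  return text.replace(/&(#x[0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);/g, (match, body) => {
    if (body[0] === '#') {
      const isHex = body[1] === 'x';
      const codePoint = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references untouched rather than throwing.
      return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : match;
    }
    // Unknown names are left as they are.
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match;
  });
}

console.log(decodeEntities('&lt;&#128521;&gt;&quot;&amp;'));
```

A string decoded this way contains no entities, so it can be passed to a lit-html part as-is.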

Notes

Root Cause

Reproducible Code

html`HTML entities follow ${'&lt;&gt;&#128521;'}`; // shown as &lt;&gt;&#128521;

Considerations on potential fixes

Workaround for translated strings

Workaround for parameters

t2ym commented 5 years ago

More examples

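| Name | Source | Preprocess | JSON | XLIFF | HTML | UI |
| --- | --- | --- | --- | --- | --- | --- |
| non-breaking space |  |  |  |  |  |  |
| non-breaking space |  |  |  |  |  |  |
| non-breaking space |  |  |  |  |  |  |
| non-breaking space | N/A | N/A |  |  |  |  |
| non-breaking space (broken) | N/A | N/A |  | N/A |  |  |
| less than | `<` | `&lt;` | `<` | `&lt;` | `&lt;` | `<` |
| less than | `&lt;` | `&lt;` | `<` | `&lt;` | `&lt;` | `<` |
| less than | `&#60;` | `&lt;` | `<` | `&lt;` | `&lt;` | `<` |
| less than (broken) | N/A | N/A | `&lt;` | N/A | `&amp;lt;` | `&lt;` |
| greater than | `>` | `&gt;` | `>` | `>` | `&gt;` | `>` |
| greater than | `&gt;` | `&gt;` | `>` | `>` | `&gt;` | `>` |
| greater than | `&#62;` | `&gt;` | `>` | `>` | `&gt;` | `>` |
| greater than (broken) | N/A | N/A | `&gt;` | N/A | `&amp;gt;` | `&gt;` |
| ampersand | `&` | `&amp;` | `&` | `&amp;` | `&amp;` | `&` |
| ampersand | `&amp;` | `&amp;` | `&` | `&amp;` | `&amp;` | `&` |
| ampersand | `&#38;` | `&amp;` | `&` | `&amp;` | `&amp;` | `&` |
| ampersand (broken) | N/A | N/A | `&amp;` | N/A | `&amp;` | `&amp;` |
| double quotation mark | `"` | `"` | `\"` | `"` | `"` | `"` |
| double quotation mark | `&quot;` | `"` | `\"` | `"` | `"` | `"` |
| double quotation mark | `&#34;` | `"` | `\"` | `"` | `"` | `"` |
| double quotation mark | N/A | N/A | `\"` | `&quot;` | `"` | `"` |
| double quotation mark (broken) | N/A | N/A | `&quot;` | N/A | `&amp;quot;` | `&quot;` |
| single quotation mark | `'` | `\'` | `'` | `'` | `'` | `'` |
| single quotation mark | `&apos;` | `\'` | `'` | `'` | `'` | `'` |
| single quotation mark | `&#39;` | `\'` | `'` | `'` | `'` | `'` |
| single quotation mark | N/A | N/A | `'` | `&apos;` | `'` | `'` |
| single quotation mark (broken) | N/A | N/A | `&apos;` | N/A | `&amp;apos;` | `&apos;` |
| cent | `¢` | `\xA2` | `¢` | `¢` | `¢` | `¢` |
| cent | `&cent;` | `\xA2` | `¢` | `¢` | `¢` | `¢` |
| cent | `&#162;` | `\xA2` | `¢` | `¢` | `¢` | `¢` |
| cent (broken) | N/A | N/A | `&cent;` | N/A | `&amp;cent;` | `¢` |
| pound | `£` | `\xA3` | `£` | `£` | `£` | `£` |
| pound | `&pound;` | `\xA3` | `£` | `£` | `£` | `£` |
| pound | `&#163;` | `\xA3` | `£` | `£` | `£` | `£` |
| pound (broken) | N/A | N/A | `&pound;` | N/A | `&amp;pound;` | `&pound;` |
| yen | `¥` | `\xA5` | `¥` | `¥` | `¥` | `¥` |
| yen | `&yen;` | `\xA5` | `¥` | `¥` | `¥` | `¥` |
| yen | `&#165;` | `\xA5` | `¥` | `¥` | `¥` | `¥` |
| yen (broken) | N/A | N/A | `&yen;` | N/A | `&amp;yen;` | `&yen;` |
| euro | `€` | `\u20AC` | `€` | `€` | `€` | `€` |
| euro | `&euro;` | `\u20AC` | `€` | `€` | `€` | `€` |
| euro | `&#8364;` | `\u20AC` | `€` | `€` | `€` | `€` |
| euro (broken) | N/A | N/A | `&euro;` | N/A | `&amp;euro;` | `&euro;` |
| euro (broken) | N/A | N/A | `\\u20AC` | `\u20AC` | `\u20AC` | `\u20AC` |
| copyright | `©` | `\xA9` | `©` | `©` | `©` | `©` |
| copyright | `&copy;` | `\xA9` | `©` | `©` | `©` | `©` |
| copyright | `&#169;` | `\xA9` | `©` | `©` | `©` | `©` |
| copyright (broken) | N/A | N/A | `&copy;` | N/A | `&amp;copy;` | `&copy;` |
| registered trademark | `®` | `\xAE` | `®` | `®` | `®` | `®` |
| registered trademark | `&reg;` | `\xAE` | `®` | `®` | `®` | `®` |
| registered trademark | `&#174;` | `\xAE` | `®` | `®` | `®` | `®` |
| registered trademark (broken) | N/A | N/A | `&reg;` | N/A | `&amp;reg;` | `&reg;` |
| winking face emoji | `😉` | `\uD83D\uDE09` | `😉` | `😉` | `😉` | `😉` |
| winking face emoji | `&#128521;` | `\uD83D\uDE09` | `😉` | `😉` | `😉` | `😉` |
| winking face emoji | N/A | N/A | `\uD83D\uDE09` | N/A | `😉` | `😉` |
| winking face emoji | N/A | N/A | `😉` | `&#128521;` | `😉` | `😉` |
| winking face emoji (broken) | N/A | N/A | `&#128521;` | N/A | `&#128521;` | `&#128521;` |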
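
The relation between the JSON and HTML columns can be modelled with a short sketch. It assumes that a string passed to a part ends up as the data of a text node, and it reproduces only the standard HTML escaping of text node data; `serializeText` is a hypothetical name, not a lit-html API.

```javascript
// Simplified model of HTML serialization for text node data:
// '&', U+00A0, '<' and '>' are escaped; quotes are left alone.
function serializeText(data) {
  return data
    .replace(/&/g, '&amp;') // must run first so later escapes are not re-escaped
    .replace(/\u00A0/g, '&nbsp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Each pair is [string held in JSON, value in the HTML column of the table].
const rows = [
  ['<', '&lt;'],            // less than
  ['&lt;', '&amp;lt;'],     // less than (broken)
  ['"', '"'],               // double quotation mark
  ['&quot;', '&amp;quot;'], // double quotation mark (broken)
  ['\u00A9', '\u00A9'],     // copyright
];

for (const [json, html] of rows) {
  console.log(JSON.stringify(json), '->', serializeText(json), serializeText(json) === html);
}
```

Under this model an entity stored in JSON has its ampersand escaped, which is why the UI shows the entity text instead of the character.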

Raw Source Code

    return html`${bind(this)}
      <div>message with HTML entities &lt;&#128521;&gt;&quot;&amp; <br>
  non-breaking space  &nbsp;  &#160; <br>
< less than &lt;  &#60; <br>
> greater than  &gt;  &#62; <br>
& ampersand &amp; &#38; <br>
" double quotation mark &quot;  &#34; <br>
' single quotation mark (apostrophe)  &apos;  &#39; <br>
¢ cent  &cent;  &#162; <br>
£ pound &pound; &#163; <br>
¥ yen &yen; &#165; <br>
€ euro  &euro;  &#8364; <br>
© copyright &copy;  &#169; <br>
® registered trademark  &reg; &#174; <br>
      </div>
      <div>message with newlines
      next line
      last line
      </div>
    `;

Preprocessed Code

{
      'div_4': [
        'message with HTML entities &lt;\uD83D\uDE09&gt;"&amp; {1}\n  non-breaking space  &nbsp;  &nbsp; {2}\n&lt; less than &lt;  &lt; {3}\n&gt; greater than  &gt;  &gt; {4}\n&amp; ampersand &amp; &amp; {5}\n" double quotation mark "  " {6}\n\' single quotation mark (apostrophe)  \'  \' {7}\n\xA2 cent  \xA2  \xA2 {8}\n\xA3 pound \xA3 \xA3 {9}\n\xA5 yen \xA5 \xA5 {10}\n\u20AC euro  \u20AC  \u20AC {11}\n\xA9 copyright \xA9  \xA9 {12}\n\xAE registered trademark  \xAE \xAE {13} ',
        '&lt;br&gt;', // placeholder; no need to translate
        '&lt;br&gt;',
        '&lt;br&gt;',
        '&lt;br&gt;',
        '&lt;br&gt;',
        '&lt;br&gt;',
        '&lt;br&gt;',
        '&lt;br&gt;',
        '&lt;br&gt;',
        '&lt;br&gt;',
        '&lt;br&gt;',
        '&lt;br&gt;',
        '&lt;br&gt;'
      ],
      'div_5': 'message with newlines\n      next line\n      last line '
    }
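
The shape of this output, one message string with numbered `{n}` parameters followed by the extracted tags, can be illustrated with a toy function. This is only an illustration under the assumption that each `<br>` becomes one parameter; `extractPlaceholders` is a hypothetical name and the actual preprocessor is not shown in this issue.

```javascript
// Toy illustration only; not the i18n-element preprocessor.
// Replaces every <br> tag with a numbered {n} parameter and collects the tags.
function extractPlaceholders(markup) {
  const placeholders = [];
  const message = markup.replace(/<br\s*\/?>/gi, (tag) => {
    placeholders.push(tag);
    return `{${placeholders.length}}`;
  });
  // Same layout as the array above: message first, then one entry per parameter.
  return [message, ...placeholders];
}

console.log(extractPlaceholders('less than <br>\ngreater than <br>'));
```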

JSON

{
    "div_4": [
      "message with HTML entities <😉>\"& {1}\n  non-breaking space       {2}\n< less than <  < {3}\n> greater than  >  > {4}\n& ampersand & & {5}\n\" double quotation mark \"  \" {6}\n' single quotation mark (apostrophe)  '  ' {7}\n¢ cent  ¢  ¢ {8}\n£ pound £ £ {9}\n¥ yen ¥ ¥ {10}\n€ euro  €  € {11}\n© copyright ©  © {12}\n® registered trademark  ® ® {13} ",
      "<br>",
      "<br>",
      "<br>",
      "<br>",
      "<br>",
      "<br>",
      "<br>",
      "<br>",
      "<br>",
      "<br>",
      "<br>",
      "<br>",
      "<br>"
    ]
}
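
A short check of the JSON escaping rules, using only the built-in `JSON` object: literal characters such as `<`, `&`, `€`, and `😉` need no escaping in JSON, while `"` and newlines take a backslash escape. The last two lines show the difference between `\u20AC` and the doubled `\\u20AC` from the "euro (broken)" row. Variable names are illustrative.

```javascript
// Literal characters survive as they are; only '"', '\' and control
// characters such as newlines are written with a backslash escape.
const message = '<\u{1F609}>"& {1}\n\u20AC';
const json = JSON.stringify(message);
console.log(json); // "<😉>\"& {1}\n€"

// In JSON text, \u20AC is an escape for the euro sign...
const parsedEscape = JSON.parse(String.raw`"\u20AC"`);
// ...but \\u20AC is an escaped backslash followed by the text u20AC.
const parsedBroken = JSON.parse(String.raw`"\\u20AC"`);

console.log(parsedEscape.length, parsedBroken.length); // 1 6
```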

XLIFF

        <source>message with HTML entities &lt;😉>"&amp; {1}
  non-breaking space       {2}
&lt; less than &lt;  &lt; {3}
> greater than  >  > {4}
&amp; ampersand &amp; &amp; {5}
" double quotation mark "  " {6}
' single quotation mark (apostrophe)  '  ' {7}
¢ cent  ¢  ¢ {8}
£ pound £ £ {9}
¥ yen ¥ ¥ {10}
€ euro  €  € {11}
© copyright ©  © {12}
® registered trademark  ® ® {13} </source>
        <target state="needs-translation">message with HTML entities &lt;😉>"&amp; {1}
  non-breaking space       {2}
&lt; less than &lt;  &lt; {3}
> greater than  >  > {4}
&amp; ampersand &amp; &amp; {5}
" double quotation mark "  " {6}
' single quotation mark (apostrophe)  '  ' {7}
¢ cent  ¢  ¢ {8}
£ pound £ £ {9}
¥ yen ¥ ¥ {10}
€ euro  €  € {11}
© copyright ©  © {12}
® registered trademark  ® ® {13} </target>
      </trans-unit>
      <trans-unit id="world-clock-container.div_4.1">
        <source>&lt;br></source>
        <target state="needs-translation">&lt;br></target><!-- placeholder; no need to translate -->
      </trans-unit>
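
The escaping seen in the XLIFF sample above can be sketched as follows. Only `&` and `<` are escaped, matching the sample; `>` and quotation marks are left as they are. `escapeXliffText` is a hypothetical name, not a function from i18n-element or any XLIFF tool.

```javascript
// Hypothetical helper: escapes element text the way the XLIFF sample does.
function escapeXliffText(text) {
  // '&' must be replaced first so the '&' in '&lt;' is not escaped again.
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;');
}

console.log(escapeXliffText('<\u{1F609}>"& {1}')); // &lt;😉>"&amp; {1}
console.log(escapeXliffText('<br>'));              // &lt;br>
```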