snoyberg / xml

Various XML utility packages for Haskell

Some nodes' text should be rendered unescaped #185

Closed sophicshift closed 1 year ago

sophicshift commented 1 year ago

Consider this:

ghci> renderText def $ parseLT "<script>foo = 3 < 5</script>"
"<?xml version=\"1.0\" encoding=\"UTF-8\"?><script>foo = 3 &lt; 5</script>"

Semantically, the rendered JavaScript differs from the source, because the content of a script element is assumed to be unescaped. The output should instead be: <script>foo = 3 < 5</script>.

I understand that all text in Text.XML.Document is assumed to be unescaped (contrary to Data.XML.Types.Document). So a render-side solution would be to check whether text is inside certain elements (script included) and, in such cases, not escape it. This could be done during the Text.XML.Document -> Data.XML.Types.Document conversion by putting such text inside ContentEntity.
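To make the render-side idea concrete, here is a minimal sketch using only base. The Node type, rawTextElements, escapeText, renderNode and render are all hypothetical names made up for illustration. They are not xml-conduit's types or API, and the list of raw-text elements is an assumption.

```haskell
-- Hypothetical miniature node type; NOT xml-conduit's Text.XML.Node.
data Node
  = Element String [Node]  -- element name and children (attributes omitted)
  | Content String         -- unescaped character data

-- Elements whose text children are emitted verbatim (assumed list).
rawTextElements :: [String]
rawTextElements = ["script", "style"]

-- Escape the characters that are special in XML character data.
escapeText :: String -> String
escapeText = concatMap escapeChar
  where
    escapeChar '<' = "&lt;"
    escapeChar '>' = "&gt;"
    escapeChar '&' = "&amp;"
    escapeChar c   = [c]

-- Render a node. The Bool records whether the direct parent
-- is one of the raw-text elements.
renderNode :: Bool -> Node -> String
renderNode raw (Content t)
  | raw       = t             -- inside script/style: leave as-is
  | otherwise = escapeText t  -- everywhere else: escape
renderNode _ (Element name children) =
  "<" ++ name ++ ">"
    ++ concatMap (renderNode (name `elem` rawTextElements)) children
    ++ "</" ++ name ++ ">"

-- Entry point: the root has no raw-text parent.
render :: Node -> String
render = renderNode False
```

With this sketch, render (Element "script" [Content "foo = 3 < 5"]) leaves the comparison untouched, while the same text inside any other element, e.g. Element "p", is still escaped to 3 &lt; 5.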

sophicshift commented 1 year ago

I just noticed that Document has a ToMarkup instance that handles this case properly when converting to blaze's Markup. Since Document is also used for arbitrary XML, it makes sense that renderText behaves this way by design, so I'm closing this.