alogic0 / lucid-from-html

Generate Lucid code from html page
BSD 3-Clause "New" or "Revised" License

problem with html-entities (partially solved) #12

Open alogic0 opened 6 years ago

alogic0 commented 6 years ago

Example of HTML for which generation crashes:

<a>&#xe802;</a>

If we convert this with the following code

lucidFromHtml (Options False True) "template" "<a>&#xe802;</a>"

we get

"template :: Html ()\ntemplate = do\n    a_ \"\59394\"\n"

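Then we putStrLn it and get incorrect Haskell code, because the string literal now contains the raw character U+E802 (a private-use code point, which may not be displayed here):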

template :: Html ()
template = do
    a_ ""

with an error from GHC, something like this:

error:
    lexical error in string/character literal at character '\59394'

So we lose the entity representation and get incorrect code. Should we filter the output to restore entities or, at least, escape such symbols?
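The two options in that question could look roughly like this. It is only a sketch using `base`, not the code from this repository; the names `haskellLiteral`, `toEntity` and `restoreEntities` are made up for illustration.

```haskell
import Data.Char (isAscii, ord)
import Numeric (showHex)

-- Option 1 (escape): emit an ordinary Haskell string literal.
-- `show` on a String writes every non-ASCII character as a numeric
-- escape such as \59394, which GHC accepts.
haskellLiteral :: String -> String
haskellLiteral = show

-- Option 2 (restore): turn each non-ASCII character back into a
-- hexadecimal HTML entity. ASCII characters are left untouched.
toEntity :: Char -> String
toEntity c
  | isAscii c = [c]
  | otherwise = "&#x" ++ showHex (ord c) ";"

-- Hypothetical helper: apply toEntity to a whole text node.
restoreEntities :: String -> String
restoreEntities = concatMap toEntity
```

Here `restoreEntities "\59394"` gives `&#xe802;`. Note that option 2 cannot tell characters that were entities in the source from characters that were typed literally, and the generated Lucid code would probably have to pass the result through `toHtmlRaw`, since `toHtml` escapes the `&`.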

alogic0 commented 6 years ago

84f53c6 partially solves this. Symbols are now escaped and the generated code no longer fails, but entities are still not preserved.

I'm looking for a way to tell TagSoup not to convert hexadecimal entities into symbols.
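One possible workaround, independent of any TagSoup option, would be to protect numeric entities before the input reaches the parser. The sketch below is an assumption, not something the library provides: the hypothetical `protectNumericEntities` rewrites each `&#` as `&amp;#`, so that a parser decoding `&amp;` in the usual way would hand back the literal text `&#xe802;` instead of the symbol.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical pre-processing step: escape the ampersand of every
-- numeric entity ("&#..." or "&#x...") so that it survives parsing
-- as plain text. Named entities such as &amp; are left alone.
protectNumericEntities :: String -> String
protectNumericEntities [] = []
protectNumericEntities s@(c:cs)
  | "&#" `isPrefixOf` s = "&amp;#" ++ protectNumericEntities (drop 2 s)
  | otherwise           = c : protectNumericEntities cs
```

For the example above, `protectNumericEntities "<a>&#xe802;</a>"` gives `<a>&amp;#xe802;</a>`. This is a blunt rewrite: it also touches attribute values and script contents, so it may need narrowing before real use.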