donpark / html2jade

Converts HTML to Jade template. Not perfect but useful enough for non-daily conversions.
MIT License
1.18k stars 156 forks

It turns text into something wrong #70

Closed jzthekeeper closed 10 years ago

jzthekeeper commented 10 years ago

Like this: title | &#22823;&#25805;&#30424;&#25163;

jzthekeeper commented 10 years ago

[image]

donpark commented 10 years ago

It could be a surrogate-pair handling issue. I'll take a look if you can provide the source HTML file.
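
For reference, a quick way to check whether a character would involve surrogate pairs at all is to test whether its code point lies outside the Basic Multilingual Plane. This is only an illustrative Python sketch, not html2jade code; the helper names `needs_surrogate_pair` and `utf16_units` are hypothetical.

```python
def needs_surrogate_pair(ch: str) -> bool:
    """Return True if the character lies outside the Basic Multilingual Plane."""
    return ord(ch) > 0xFFFF


def utf16_units(ch: str) -> int:
    """Count the UTF-16 code units needed to represent the character."""
    return len(ch.encode("utf-16-le")) // 2


# The four characters from the report, plus one astral-plane character for contrast.
for ch in "大操盘手" + "\U0001F600":
    print(hex(ord(ch)), needs_surrogate_pair(ch), utf16_units(ch))
```

Any character for which `needs_surrogate_pair` returns False is stored as a single UTF-16 code unit, so surrogate handling would not apply to it.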

donpark commented 10 years ago

FYI, html2jade currently has limited character set support. If your original HTML file contains Chinese characters as-is, the output Jade file will contain the HTML character entity versions of those characters instead. When that Jade is converted back to HTML, the output HTML will also contain character entities.
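
To illustrate what "character entity versions" means here, the following Python sketch replaces each non-ASCII character with a decimal numeric character reference and then decodes it again. It is an assumption-laden illustration of the general technique, not html2jade's actual implementation; `encode_non_ascii` is a hypothetical helper.

```python
import html


def encode_non_ascii(text: str) -> str:
    """Replace every non-ASCII character with a decimal numeric character reference."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


title = "大操盘手"
encoded = encode_non_ascii(title)
print(encoded)                 # &#22823;&#25805;&#30424;&#25163;
print(html.unescape(encoded))  # 大操盘手
```

A browser renders the entity form and the literal form identically, so the encoded output is not corrupted, only harder to read in the template source.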

weirongxu commented 10 years ago

I think his source file looks like this:

<!DOCTYPE html>
<html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>大操盘手</title>
    </head>
    <body>
    </body>
</html>

You can try running it with the --donotencode option.

donpark commented 10 years ago

Thanks @weirongxu. I forgot about the --donotencode option.

With the --donotencode option, the output is:

doctype html
html(lang='en')
  head
    meta(charset='UTF-8')
    title 大操盘手
  body

Is this not the expected output?

jzthekeeper commented 10 years ago

Yeah, in the end I used a shell command like: html2jade some.html --donotencode. Thanks!