atom-community / markdown-preview-plus

Markdown Preview + Community Features
https://atom.io/packages/markdown-preview-plus

Keep html encoding #510

Closed harshakokel closed 3 years ago

harshakokel commented 3 years ago

Hello,

Is there a way we can maintain the HTML encoding of the characters written in markdown when we save the markdown file as HTML?

For example, I have an HTML entity for Ł (such as &Lstrok;) in my markdown file, which is rendered as Ł in the preview. When I save the preview as HTML, I want the HTML file to contain the entity rather than the literal character Ł.

I do not mind customizing the javascript if required. But, I will need help :|

harshakokel commented 3 years ago

One possible fix is to use the HTML entities package (he) when saving the HTML.

Replaced the following line in utils.ts: ${html.body.innerHTML} with ${he.encode(html.body.innerHTML,{'useNamedReferences': true,'allowUnsafeSymbols': true})}

and added import * as he from 'he' at top.
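For readers who would rather not add a dependency, the same post-processing idea can be sketched without he. This is only an illustration under stated assumptions: encodeNonAscii is a hypothetical helper (it is not part of markdown-preview-plus or he), and it emits hexadecimal numeric references such as &#x141; rather than the named references that useNamedReferences produces. Like allowUnsafeSymbols, it leaves markup characters such as <, > and & untouched, so it can be applied to a whole HTML string.

```typescript
// Hypothetical helper, not part of markdown-preview-plus or the `he` package.
// Replaces every non-ASCII character with a hexadecimal numeric character
// reference and leaves ASCII (including <, >, & and quotes) untouched.
function encodeNonAscii(html: string): string {
  // The surrogate-pair alternative comes first so that characters outside
  // the Basic Multilingual Plane become a single reference.
  return html.replace(/[\uD800-\uDBFF][\uDC00-\uDFFF]|[^\x00-\x7F]/g, (m: string) => {
    const codePoint =
      m.length === 2
        ? (m.charCodeAt(0) - 0xd800) * 0x400 + (m.charCodeAt(1) - 0xdc00) + 0x10000
        : m.charCodeAt(0)
    return '&#x' + codePoint.toString(16).toUpperCase() + ';'
  })
}

// Usage: post-process the HTML string just before it is written to disk.
const savedHtml = encodeNonAscii('<p>Ł</p>')
```

The tags pass through unchanged and only the Ł is replaced. A lone (unpaired) surrogate in the input would still be turned into a reference to a surrogate code point, which is not valid HTML, so this sketch assumes well-formed text.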

lierdakil commented 3 years ago

Hi. CommonMark spec requires that HTML entities are converted to their Unicode representation during Markdown parsing: https://spec.commonmark.org/0.29/#entity-references. I'm pretty sure in the majority of cases it's a good thing, but for the fringe cases there is https://github.com/Manvel/markdown-it-html-entities, which could be included in MPP. As a side note, I really need to think of a way to add support for user plug-ins.

lierdakil commented 3 years ago

Hmm. No, apparently I'm wrong: due to the internal processing done on the HTML, entities get replaced anyway.

I'll be honest, I don't see how replacing Unicode with HTML entities would be useful in general, outside very specific use cases. Feel free to try to convince me otherwise, but for now I'm thinking that the simplest approach is post-processing the HTML.

harshakokel commented 3 years ago

@lierdakil, thank you for following up on my issue. I needed HTML entities instead of Unicode characters because I want to push the HTML to a website, and Unicode characters were breaking when the pages were rendered over the network. I am sure there are other ways of resolving the rendering issue; I just prefer HTML entities. So, I agree it is not a general use-case.

The post-processing step mentioned in my previous comment is working fine for me.

A side note, it would be great if the HTML template used in the mkHtml function while saving the HTML file could be made configurable. This ties back to my use-case of pushing the HTML file to the website (as blogs for instance). The custom template would allow for the website essentials.

You may close this issue.

lierdakil commented 3 years ago

> A side note, it would be great if the HTML template used in the mkHtml function while saving the HTML file could be made configurable. This ties back to my use-case of pushing the HTML file to the website (as blogs for instance). The custom template would allow for the website essentials.

It's not conceptually hard to do this; for example, MPP could read a file from the filesystem and do some straightforward replacements on it. But I have to say that MPP isn't really intended as a publishing tool (although I've used it as such on occasion). Better results with a simpler workflow can be achieved by using static site generators, like Jekyll, Hugo, Hakyll, etc. I've personally had some success with Hugo and I have a couple of sites built with Hakyll (that said, I wouldn't recommend Hakyll because the initial setup is a huge chore, but on the flip side it's very flexible). In a pinch, pandoc plus some scripting can be used as a simple site generator as well, and it does have reasonable support for templates. All these options will be way more flexible and likely way less awkward for publishing than using MPP, at least in my opinion.
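To make the "straightforward replacements" idea concrete, here is a rough sketch of what such a step might look like. Everything in it is an assumption for illustration: fillTemplate is a hypothetical helper, the {{name}} placeholder syntax is invented, and nothing like this exists in MPP. Reading the template from disk is left out; the function only takes the template text and the values to substitute.

```typescript
// Hypothetical helper; the {{name}} placeholder syntax is an assumption,
// not something markdown-preview-plus supports.
function fillTemplate(template: string, values: { [key: string]: string }): string {
  // A replacer function is used instead of a replacement string so that
  // sequences like "$&" in the values are inserted literally.
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(values, key) ? values[key] : match
  )
}

// Usage: a user-supplied template with placeholders for the title and body.
const pageTemplate =
  '<!DOCTYPE html><html><head><title>{{title}}</title></head><body>{{ body }}</body></html>'
const page = fillTemplate(pageTemplate, { title: 'Post', body: '<p>costs $1</p>' })
```

Unknown placeholders are left in place rather than replaced with an empty string, which makes typos in a template easier to spot in the output.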

harshakokel commented 3 years ago

Thank you @lierdakil, Hugo looks interesting. I agree static site generators are better-suited tools for my use-case. I did start building a Jekyll website at one point; shall revisit that soon.