Closed by marijnh 7 years ago
I think the issue here is that `<` and `>` escaping (plus a couple of others) should be optional, and `&` should only be escaped when it is not part of an entity (something similar to `/&([a-z]+|#[0-9]+);/i`).
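For illustration, here is a minimal sketch of the entity-aware ampersand escaping suggested above. The helper name `escapeAmpersands` is hypothetical and not part of this library; it simply wraps the regex from the comment in a negative lookahead, so `&` is left alone only when it starts something shaped like a named or decimal entity.

```javascript
// Hypothetical helper, not part of the library's API.
// Escapes "&" unless it begins something shaped like an entity,
// e.g. "&copy;" (named) or "&#169;" (decimal numeric).
function escapeAmpersands(text) {
  return text.replace(/&(?!([a-z]+|#[0-9]+);)/gi, "&amp;");
}

console.log(escapeAmpersands("Tom & Jerry &copy; 2012 &#169; &bogus"));
// prints: Tom &amp; Jerry &copy; 2012 &#169; &amp;bogus
```

Because the pattern only covers `#[0-9]+`, a hexadecimal reference such as `&#x1F;` would still be escaped; handling it would mean extending the pattern.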
At the same time, extending JsonML with something similar to __RAW would be useful in some cases -- like, if you're running Markdown before passing things off to a templater.
+1 for this. Really need this feature.
It is just what I need
+1, this is a blocker for us as well.
+a lot
I based my blog engine off of this, and now every time I want to use a real <i></i>, or add a table, or embed HTML5 video, or, or... I get tripped up because my Markdown library doesn't support all of Markdown :-(
This is the first Markdown instance that I've worked with that doesn't allow inline HTML. +1
+1
I'm sure you're aware of it, but I'd like to reiterate that Markdown is designed to support inline HTML. http://daringfireball.net/projects/markdown/syntax#html
I have no issue with supporting inline HTML. This pull request needs some tests before it's ready to be merged. It's on my list but it's several items down, so it could be a while until I get to it. If anyone else wants to contribute some tests that'd be super.
Since we haven't previously supported inline HTML, this is a potentially breaking/insecure change (with this change it is possible to add malicious `<script>` tags that were safe before), so I think this feature also needs to be put behind a feature/option switch that is off by default.
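As a rough sketch of what an off-by-default switch could look like: the function names and the `allowInlineHtml` option below are assumptions made for illustration, not this library's actual API or the approach taken in the pull request.

```javascript
// Hypothetical sketch: the names below are illustrative, not this library's API.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Inline HTML passes through only when the caller opts in explicitly;
// with no options given, everything is escaped as before.
function renderInline(text, options) {
  const allowInlineHtml = Boolean(options && options.allowInlineHtml);
  return allowInlineHtml ? text : escapeHtml(text);
}

console.log(renderInline("<script>alert(1)</script>"));
// prints: &lt;script&gt;alert(1)&lt;/script&gt;
console.log(renderInline("<i>real italics</i>", { allowInlineHtml: true }));
// prints: <i>real italics</i>
```

Keeping the default at "escape everything" means existing callers see no change in output unless they pass the option.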
I think it makes sense to have inline HTML off by default. For most use cases, people probably don't expect inline HTML to be passed through.
If I get approval to do it (from my employer), I'll see about taking a look at the commit and adding some tests.
Sweet.
For added fun this approach seems to cause problems with HTML in code blocks: https://github.com/semu/node-blog/issues/3#issuecomment-8949110 which will need looking at too.
If you do it on company time we are happy to have a "Some development sponsored by company.com" in the Readme etc.
That makes sense, but it's not immediately clear to me whether HTML should always remain unescaped within code blocks. We can make that assumption, if that's what seems right, though.
Identifying HTML blocks with a regex is bound to expose some edge cases, too. After all: http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html
If there's doubt about how a piece of markup should be interpreted, we use Babelmark to see what the general consensus is.
In this case, code blocks take priority over HTML.
I've been thinking about this. I agree that code blocks should be escaped, but I don't think anything outside code blocks should be escaped by default. Turning on global escaping may still be an option, but I have changed my position on the default.
If we're concerned about the security of escaping user strings, that seems out of scope for Markdown. User strings should be handled as appropriate by the implementing application, not by this library. That seems more in keeping with the spirit of Markdown, and more consistent with other Markdown implementations.
I don't think there is any reason to try to identify HTML with a regex, so this pull request should probably be closed (and not merged).
I submitted pull request #98 with tests for this change, as @evilstreak requested.
+1
+1 for this. This is a blocker for us.