evilstreak / markdown-js

A Markdown parser for JavaScript

Add support for HTML blocks. #34

Closed marijnh closed 7 years ago

bentruyman commented 12 years ago

+1 for this. This is a blocker for us.

rummik commented 12 years ago

I think the issue here is that escaping < and > (plus a couple of others) should be optional, and & should only be escaped when it is not part of an entity (something similar to /&([a-z]+|#[0-9]+);/i).
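A minimal sketch of the ampersand rule, assuming the entity pattern above is reused as a negative lookahead. The function name `escapeLooseAmpersands` is made up for illustration and is not part of markdown-js.

```javascript
// Escape "&" only when it does not start something that looks like an entity.
// The lookahead reuses the pattern suggested above: a named entity made of
// letters ("&amp;") or a decimal numeric reference ("&#38;").
// Hypothetical helper, not an existing markdown-js function.
function escapeLooseAmpersands(text) {
  return text.replace(/&(?!([a-z]+|#[0-9]+);)/gi, "&amp;");
}

console.log(escapeLooseAmpersands("a & b &amp; c &#38; d"));
// prints "a &amp; b &amp; c &#38; d"
```

The pattern is deliberately loose: a named entity that contains digits (for example `&frac12;`) or a hex reference (`&#x26;`) would not be recognised and would still be escaped.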

At the same time, extending JsonML with something like __RAW would be useful in some cases -- for example, if you're running Markdown before passing things off to a templater.
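Roughly what I have in mind, as a sketch only: a tiny JsonML-to-HTML renderer in which a node tagged `__RAW` is emitted without escaping. Both the `__RAW` tag and the `renderTree` function are hypothetical here; neither JsonML nor markdown-js defines them, and attributes are skipped for brevity.

```javascript
function escapeHtml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Render a JsonML node: either a string or ["tag", {attrs}?, ...children].
// "__RAW" is a hypothetical tag whose string children are emitted verbatim.
function renderTree(node) {
  if (typeof node === "string") {
    return escapeHtml(node);
  }
  var tag = node[0];
  var rest = node.slice(1);
  // Skip an attributes object if present (attributes are not rendered in this sketch).
  if (rest.length > 0 && typeof rest[0] === "object" && !Array.isArray(rest[0])) {
    rest = rest.slice(1);
  }
  if (tag === "__RAW") {
    return rest.join("");
  }
  return "<" + tag + ">" + rest.map(renderTree).join("") + "</" + tag + ">";
}

console.log(renderTree(["p", "a < b ", ["__RAW", "<%= user.name %>"]]));
// prints "<p>a &lt; b <%= user.name %></p>"
```

That way the templater placeholder survives the Markdown pass while ordinary text is still escaped.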

jchouse commented 12 years ago

+1 for this. Really need this feature.

footya commented 11 years ago

It is just what I need

mattly commented 11 years ago

+1, this is a blocker for us as well.

natevw commented 11 years ago

+a lot

I based my blog engine off of this, and now every time I want to use a real <i></i>, or add a table, or embed HTML5 video, or, or... I get tripped up because my Markdown library doesn't support all of Markdown :-(

spacez320 commented 11 years ago

This is the first Markdown instance that I've worked with that doesn't allow inline HTML. +1

thisandagain commented 11 years ago

+1

michaek commented 11 years ago

I'm sure you're aware of it, but I'd like to reiterate that Markdown is designed to support inline HTML. http://daringfireball.net/projects/markdown/syntax#html

evilstreak commented 11 years ago

I have no issue with supporting inline HTML. This pull request needs some tests before it's ready to be merged. It's on my list but it's several items down, so it could be a while until I get to it. If anyone else wants to contribute some tests that'd be super.

ashb commented 11 years ago

Since we haven't previously supported inline HTML, this is a potentially breaking and insecure change (with it, it becomes possible to add malicious <script> tags that were harmless before), so I think this feature also needs to be put behind an option switch that is off by default.
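Something along these lines, as a sketch: the option name `allowHtml` and the function `renderInline` are placeholders for illustration, not existing markdown-js API.

```javascript
function escapeHtml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Hypothetical option switch: HTML passes through only when the caller
// explicitly opts in; with no options the old escaping behaviour is kept.
function renderInline(text, options) {
  var allowHtml = Boolean(options && options.allowHtml);
  return allowHtml ? text : escapeHtml(text);
}

console.log(renderInline("<script>alert(1)</script>"));
// prints "&lt;script&gt;alert(1)&lt;/script&gt;"
console.log(renderInline("<i>real</i>", { allowHtml: true }));
// prints "<i>real</i>"
```

Existing callers that pass no options would see no change in output, which keeps the upgrade non-breaking.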

michaek commented 11 years ago

I think it makes sense to have inline HTML off by default. For most use cases, people probably expect HTML not to be passed through.

If I get approval to do it (from my employer), I'll see about taking a look at the commit and adding some tests.

ashb commented 11 years ago

Sweet.

For added fun, this approach seems to cause problems with HTML in code blocks (https://github.com/semu/node-blog/issues/3#issuecomment-8949110), which will need looking at too.

If you do it on company time we are happy to have a "Some development sponsored by company.com" in the Readme etc.

michaek commented 11 years ago

That makes sense, but it's not immediately clear to me whether HTML should always remain unescaped within code blocks. We can make that assumption, if that's what seems right, though.

Identifying HTML blocks with a regex is bound to expose some edge cases, too. After all: http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html

evilstreak commented 11 years ago

If there's doubt about how a piece of markup should be interpreted, we use Babelmark to see what the general consensus is.

In this case, code blocks take priority over HTML.
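To illustrate the priority rule, here is a rough sketch that handles only single-backtick inline code spans (not indented code blocks): anything inside a code span is escaped, and everything outside is passed through untouched. The function name `renderWithCodePriority` is made up for this example and is not how the library is structured.

```javascript
function escapeHtml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Code wins over HTML: tags inside a backtick span are escaped and wrapped
// in <code>, while HTML outside code spans is left as written.
function renderWithCodePriority(text) {
  return text.replace(/`([^`]+)`/g, function (match, code) {
    return "<code>" + escapeHtml(code) + "</code>";
  });
}

console.log(renderWithCodePriority("Use <i>real</i> tags, e.g. `<i>`"));
// prints "Use <i>real</i> tags, e.g. <code>&lt;i&gt;</code>"
```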

michaek commented 11 years ago

I've been thinking about this. I agree that code blocks should be escaped, but I don't think anything outside code blocks should be escaped by default. Turning on global escaping may still be an option, but I have changed my position on the default.

If we're concerned about the security of escaping user strings, that seems out of scope for markdown. User strings should be handled as appropriate by the implementing application, not by this library. That seems more in keeping with the spirit of Markdown, and more consistent with other Markdown implementations.

I don't think there is any reason to try to identify HTML with a regex, so this pull request should probably be closed (and not merged).

dtao commented 11 years ago

I submitted pull request #98 with tests for this change, as @evilstreak requested.

ghost commented 11 years ago

+1