wooorm / markdown-rs

CommonMark compliant markdown parser in Rust with ASTs and extensions
https://docs.rs/markdown/1.0.0-alpha.21/markdown/
MIT License
906 stars 50 forks

Whitelist anchor HTML tags? #99

Closed: hrydgard closed this issue 8 months ago

hrydgard commented 9 months ago

In order to allow anchors in markdown, such as <a name="my_anchor"></a>, I have to turn on allow_dangerous_html.

Unfortunately markdown still lacks a syntax for this, although you can link to them: [jump to my_anchor](#my_anchor)

I'd like to disallow all HTML except this very particular usage, since this is still a hole in the markdown language, AFAIK. Is that possible?

ChristianMurphy commented 9 months ago

Welcome @hrydgard! 👋

It is not currently a feature. It sounds more broadly like you are looking for a configurable sanitizer. This should be handled through https://github.com/wooorm/markdown-rs/issues/32

So this project can have a plugin roughly equivalent to https://github.com/rehypejs/rehype-sanitize on the JavaScript side.
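Until such a sanitizer exists, one possible workaround is to leave allow_dangerous_html off and restore only the empty named anchors after rendering. The sketch below is an illustration, not part of markdown-rs: `restore_named_anchors` and `is_safe_name` are hypothetical helpers, and it assumes the renderer escapes raw HTML as `&lt;`, `&gt;` and `&quot;`. It has a known limitation: an identical escaped anchor inside a code span or code block would be restored too.

```rust
// Escaped forms of `<a name="` and `"></a>` as they are assumed to appear
// in the rendered output when raw HTML is not allowed.
const OPEN: &str = "&lt;a name=&quot;";
const CLOSE: &str = "&quot;&gt;&lt;/a&gt;";

/// Accept only conservative anchor names: ASCII letters, digits, `_` and `-`.
fn is_safe_name(name: &str) -> bool {
    !name.is_empty()
        && name
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}

/// Turn escaped `<a name="..."></a>` back into real tags, leaving every
/// other piece of escaped HTML untouched.
fn restore_named_anchors(escaped_html: &str) -> String {
    let mut out = String::with_capacity(escaped_html.len());
    let mut rest = escaped_html;
    while let Some(start) = rest.find(OPEN) {
        // Copy everything before the candidate anchor unchanged.
        out.push_str(&rest[..start]);
        let after_open = &rest[start + OPEN.len()..];
        match after_open.find(CLOSE) {
            Some(end) if is_safe_name(&after_open[..end]) => {
                out.push_str("<a name=\"");
                out.push_str(&after_open[..end]);
                out.push_str("\"></a>");
                rest = &after_open[end + CLOSE.len()..];
            }
            _ => {
                // Not an exact match (extra attributes, unsafe name, no close):
                // keep the escaped text and continue scanning after it.
                out.push_str(OPEN);
                rest = after_open;
            }
        }
    }
    out.push_str(rest);
    out
}
```

The function would be applied to the string returned by `markdown::to_html`. Anything that is not exactly an empty anchor with a safe name, for example one carrying an extra `onclick` attribute, stays escaped.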

hrydgard commented 9 months ago

Hi, yeah, I think something like that would help. I have a related issue though that even if I allow dangerous html, the following tag is not passed through:

<iframe src="https://discordapp.com/widget?id=293316141479362560&theme=dark" width="350" height="500" allowtransparency="true" frameborder="0"></iframe>

That seems unexpected?

wooorm commented 9 months ago

Please post the code you use. I’m pretty sure that doesn’t happen by default. It happens when you turn GFM features on, which includes the GFM tag filter that filters out iframes.

hrydgard commented 9 months ago

    let mut markdown_options = markdown::Options::gfm();
    markdown_options.compile.allow_dangerous_html = true;

Yes, GFM, but with allow_dangerous_html forced to true. I guess that's not enough, though it sounds like it should be :)

wooorm commented 9 months ago

Right, allow_dangerous_html doesn’t turn off the GFM tag filter that gfm() turns on. See gfm_tagfilter in CompileOptions: https://docs.rs/markdown/1.0.0-alpha.16/markdown/struct.CompileOptions.html.

I understand that you were not expecting that, but I don’t see a better way, other than adding a note about this to the docs for allow_dangerous_html: https://docs.rs/markdown/1.0.0-alpha.16/markdown/struct.CompileOptions.html#structfield.allow_dangerous_html
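To make the interaction between the two options concrete, here is a minimal sketch of the configuration that would let the iframe through. It assumes the `markdown` crate at a 1.0.0-alpha release, and the function name `render_gfm_with_raw_html` is only illustrative.

```rust
use markdown::{to_html_with_options, Options};

/// Render GFM with raw HTML passed through and the GFM tag filter disabled.
fn render_gfm_with_raw_html(source: &str) -> String {
    // The GFM preset enables the GFM constructs and also gfm_tagfilter.
    let mut options = Options::gfm();
    // Emit raw HTML instead of escaping it.
    options.compile.allow_dangerous_html = true;
    // Without this, filtered tags such as <iframe> still get their
    // leading `<` escaped, even though dangerous HTML is allowed.
    options.compile.gfm_tagfilter = false;
    to_html_with_options(source, &options)
        .expect("non-MDX markdown is not expected to fail to compile")
}
```

With both flags set this way nothing in the input is sanitized, so it should only be used on trusted markdown.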

hrydgard commented 9 months ago

Oh, didn't realize gfm_tagfilter was even a thing. Yes, I think a comment in the docs of allow_dangerous_html is a good way to go.

The name allow_dangerous_html really feels like it should automatically allow all html tags since it doesn't make much sense to allow some dangerous but not some more benign ones! So documenting that it doesn't do that makes sense.

wooorm commented 8 months ago

Added a note! But: “I didn’t realize gfm_tagfilter was a thing” sounds like you should also read up on what happens when you use gfm()!

hrydgard commented 8 months ago

Yeah, you're absolutely right about that :)