syntax-tree / mdast-util-from-markdown

mdast utility to parse markdown
MIT License
212 stars 20 forks source link

Fix handling of reference links with inline code #24

Closed Trott closed 3 years ago

Trott commented 3 years ago

Initial checklist

Description of changes

This fixes the AST so that certain kinds of reference links are not broken when converted back to markdown.

Fixes: https://github.com/remarkjs/remark/issues/850

github-actions[bot] commented 3 years ago

Hi! It seems some of the things asked in the template are missing? Please edit your post to fill out everything.

You won’t get any more notifications from me, but I’ll keep on updating this comment, and remove it when done!

If you need it, here’s the original template ```markdown ### Initial checklist * [ ] I read the support docs * [ ] I read the contributing guide * [ ] I agree to follow the code of conduct * [ ] I searched issues and couldn’t find anything (or linked relevant results below) * [ ] If applicable, I’ve added docs and tests ### Description of changes TODO ```

Thanks, — bb

Trott commented 3 years ago

I don't know if this is the right fix for https://github.com/remarkjs/remark/issues/850, but it fixes that issue without breaking any existing tests, so I'm hopeful....

wooorm commented 3 years ago

Nice catch! It looks like definitions do get a proper identifier and label, but for link (and image?) references it’s broken.

There is probably a problem with character references and character escapes though. Take this example:

[`f`][]
[;][]
[\;][]
[;][]
[`f`;][]
[`f`\;][]
[`f`;][]

[`f`]: alpha
[;]: bravo
[\;]: charlie
[;]: delta
[`f`;]: echo
[`f`\;]: foxtrot
[`f`;]: golf

Note that one might expect that character references and -escapes and the actual value would match, but CommonMark does not:

<p><a href="alpha"><code>f</code></a>
<a href="bravo">;</a>
<a href="charlie">;</a>
<a href="delta">;</a>
<a href="echo"><code>f</code>;</a>
<a href="foxtrot"><code>f</code>;</a>
<a href="golf"><code>f</code>;</a></p>

Looking at the AST that remark produces, the definitions work properly: the character references and -escapes are handled properly in label/identifier. The same is true for link (and image?) references, but there the code (and probably other things such as emphasis), as you found out, don’t work.

github-actions[bot] commented 3 years ago

Hi! This is accepted and can go somewhere!

Team: please review this PR and use the area/* (to describe the scope of the change), platform/* (if this is related to a specific one), and semver/* and type/* labels to annotate this.

wooorm commented 3 years ago

I was able to verify that micromark does handle this correctly: https://github.com/micromark/micromark/commit/5be7549a35a5804568c017b17a5f3e43847b97b2

github-actions[bot] commented 3 years ago

Hi! This was closed. Team: If this was merged, please describe when this is likely to be released. Otherwise, please add one of the no/* labels.

wooorm commented 3 years ago

Thanks @Trott, this helped a lot. Released!