mity / md4c

C Markdown parser. Fast. SAX-like interface. Compliant to CommonMark specification.
MIT License

Question about heading anchor links #143

Closed. drwpow closed this issue 3 years ago.

drwpow commented 3 years ago

I’m using this library within markdown-wasm; apologies if I’m opening an issue in the wrong spot. We are using this for Skypack README generation, and someone raised an issue about the heading anchors differing between this and GitHub.

Example:

<!-- md4c -->
<h2><a id="default-named-export" href="#default-named-export"></a>Default &amp; named export: <code>writableDerived()</code></h2>

<!-- GitHub -->
<h2><a id="user-content-default--named-export-writablederived" href="#default--named-export-writablederived"></a>Default &amp; named export: <code>writableDerived()</code></h2>

For GitHub users who compose Markdown there, this means the anchors they see while composing, and write into their README, don’t match what this library generates. Self-referencing READMEs are therefore problematic: either the links work with this library but not on GitHub, or vice versa.
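To make the difference concrete, here is a minimal sketch of a slug rule that reproduces the GitHub anchor in the example above. The rule itself (lowercase, drop punctuation, turn each space into a hyphen without collapsing runs) is an assumption inferred from the observed output, not a documented GitHub algorithm, and `github_style_slug` is a hypothetical name.

```python
import re


def github_style_slug(heading_text: str) -> str:
    """Approximate a GitHub-style heading anchor (assumed rule, not an official one)."""
    slug = heading_text.strip().lower()
    # Drop everything except word characters, spaces, and hyphens.
    slug = re.sub(r"[^\w\- ]", "", slug)
    # Each space becomes one hyphen; runs are not collapsed, which is
    # where the double hyphen in "default--named" comes from.
    return slug.replace(" ", "-")


print(github_style_slug("Default & named export: writableDerived()"))
```

Removing the `&` leaves two adjacent spaces, which is why the result contains `--` where the md4c-side output has a single hyphen and drops the code span text entirely.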

Just curious if there’s a chance that the anchor ID generation for this library and GitHub’s could more closely match. And if you’d be open to a PR for this. Thanks!

mity commented 3 years ago

MD4C (this project) does not generate them; the HTML generator we provide generates no IDs at all. I therefore assume markdown-wasm reuses only our parser and produces the HTML with its own custom generator, so the anchors are a feature it adds on top of what MD4C provides. You should report the bug there.

But FYI, the whole situation in the Markdown world about this is quite unsatisfactory right now:

The CommonMark specification does not specify how IDs of anything should be generated. Many implementations therefore do not support them at all, or follow their own incompatible ideas of how they should look.

In particular, the GFM specification, which should describe the GFM extensions on top of CommonMark, does not cover it either. AFAIK, cmark-gfm (GitHub's Markdown parser) does not implement it, so I can only assume github.com adds it in some extra post-processing.
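As an illustration of what such a post-processing pass could look like, here is a rough sketch that adds anchors to headings in already-rendered HTML. It is not GitHub's actual pipeline: the slug rule, the function names `slugify` and `add_heading_anchors`, and the regex-based heading match are all assumptions, and the `user-content-` prefix is taken only from the GitHub sample earlier in this thread.

```python
import html
import re

# Matches attribute-free headings only, e.g. <h2>...</h2> (a simplification).
HEADING = re.compile(r"<(h[1-6])>(.*?)</\1>", re.DOTALL)


def slugify(text: str) -> str:
    """Assumed slug rule: lowercase, drop punctuation, spaces to hyphens."""
    text = re.sub(r"[^\w\- ]", "", text.strip().lower())
    return text.replace(" ", "-")


def add_heading_anchors(rendered_html: str, id_prefix: str = "user-content-") -> str:
    """Insert an empty <a id=... href=...> at the start of each heading."""
    def repl(match: re.Match) -> str:
        tag, inner = match.group(1), match.group(2)
        # Slug is computed from the visible text: strip tags, decode entities.
        plain = html.unescape(re.sub(r"<[^>]+>", "", inner))
        slug = slugify(plain)
        return f'<{tag}><a id="{id_prefix}{slug}" href="#{slug}"></a>{inner}</{tag}>'

    return HEADING.sub(repl, rendered_html)


sample = "<h2>Default &amp; named export: <code>writableDerived()</code></h2>"
print(add_heading_anchors(sample))
```

The sketch ignores duplicate headings and headings that carry attributes; a real pass would need an HTML parser rather than a regular expression.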

There are some long-running discussions about adding it to the CommonMark specification, e.g. here: https://talk.commonmark.org/t/feature-request-automatically-generated-ids-for-headers/115/31, but the CommonMark spec is very slow in absorbing new features.

drwpow commented 3 years ago

Thanks for the info. Yes, hopefully at least some detail is added to CommonMark eventually. Thank you for this great project!