Python-Markdown / markdown

A Python implementation of John Gruber’s Markdown with Extension support.
https://python-markdown.github.io/
BSD 3-Clause "New" or "Revised" License

IDs in headings behave differently compared to other markdown renderers #1422

Closed mowies closed 7 months ago

mowies commented 8 months ago

While switching static site generators in Keptn from Hugo to MkDocs, we noticed that python-markdown (with the toc extension) renders certain heading IDs differently from other renderers. Take note of the double -- and single - in the examples below:

Input

# Heading with pre- and post-deployment things

Output of Goldmark

<h1 id="heading-with-pre--and-post-deployment-things">Heading with pre- and post-deployment things</h1>

Output of Python-Markdown (with toc extension)

<h1 id="heading-with-pre-and-post-deployment-things">Heading with pre- and post-deployment things</h1>

As you can see, the hyphen followed by a space in the heading ("pre- and") is rendered as either one or two - in the HTML ID. GitHub also uses the version with the double -, as you can see in this minimal example I created.

Minimal examples of the Go and Python tests above can be found in the above GitHub Gist as well. Python: link Go: link

facelessuser commented 8 months ago

Python Markdown has been like this for a long, long time. Changing it so that consecutive - are no longer collapsed into a single - would not be backward compatible and would likely break many users' expectations.

With that said, Python Markdown provides a way to override the slug behavior if you so choose. You can craft your own slugify function and pass it to the toc extension through its slugify option.
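A minimal sketch of what such an override could look like, assuming the toc extension calls the function as slugify(value, separator). The name slugify_keep_hyphens is hypothetical, and the rule it applies (replace each whitespace character on its own instead of collapsing runs) is only one way to get the double -; it is not a port of Goldmark's or GitHub's algorithm. The Python-Markdown part is skipped if the package is not installed.

```python
import re
import unicodedata


def slugify_keep_hyphens(value, separator):
    """Hypothetical slugify that does not collapse runs of the separator."""
    # Fold accented characters to plain ASCII.
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    # Drop everything except word characters, whitespace and hyphens.
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # Replace each whitespace character individually, so "pre- and"
    # becomes "pre--and" rather than "pre-and".
    return re.sub(r"\s", separator, value)


heading = "Heading with pre- and post-deployment things"
print(slugify_keep_hyphens(heading, "-"))
# -> heading-with-pre--and-post-deployment-things

try:
    import markdown  # third-party package: pip install markdown
except ImportError:
    markdown = None

if markdown is not None:
    # Pass the custom function through the toc extension's slugify option.
    html = markdown.markdown(
        "# " + heading,
        extensions=["toc"],
        extension_configs={"toc": {"slugify": slugify_keep_hyphens}},
    )
    print(html)
```

One consequence of this sketch is that several spaces in a row would also produce several separators, so it may need tightening before being used on real content.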

waylan commented 8 months ago

> Python Markdown has been like this for a long, long time.

This was my thought as well. In fact, it is very likely that we implemented our behavior first and other implementations added their behavior later. They failed to follow our lead. In fact, I would argue that our behavior (collapsing multiple consecutive hyphens into 1) is the correct behavior and implementations which do not do that should be changed.

Regardless, we now have many thousands of existing documents which have many thousands of links pointing to them using the existing format. A change in behavior would break many of the existing links. Therefore, we will not be making a change to the default behavior.