Python-Markdown / markdown

A Python implementation of John Gruber’s Markdown with Extension support.
https://python-markdown.github.io/
BSD 3-Clause "New" or "Revised" License

Unordered list indentation - discussion about John Gruber’s Markdown vs. CommonMark #1204

Closed thernstig closed 2 years ago

thernstig commented 2 years ago

https://python-markdown.github.io/#differences states:

The syntax rules clearly state that when a list item consists of multiple paragraphs, "each subsequent paragraph in a list item must be indented by either 4 spaces or one tab" (emphasis added). However, many implementations do not enforce this rule and allow less than 4 spaces of indentation. The implementers of Python-Markdown consider it a bug to not enforce this rule. This applies to any block level elements nested in a list, including paragraphs, sub-lists, blockquotes, code blocks, etc. They must always be indented by at least four spaces (or one tab) for each level of nesting. In the event that one would prefer different behavior, tab_length can be set to whatever length is desired. Be warned however, as this will affect indentation for all aspects of the syntax (including root level code blocks).
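The effect of the rule quoted above, and of the tab_length escape hatch it mentions, can be sketched as follows. This is a minimal illustration rather than anything from the project's documentation: it assumes the `markdown` package is installed, and the sample text and variable names are made up for the example.

```python
import markdown

# A sublist indented by only 2 spaces (the style many formatters emit).
two_space_source = "- parent\n  - child\n"

# Default behavior (tab_length=4): the 2-space item is treated as a
# sibling of "parent", so only one <ul> is produced.
default_html = markdown.markdown(two_space_source)

# With tab_length=2 the same source nests. As the quoted docs warn, this
# also changes the indentation expected everywhere else in the syntax
# (for example root-level code blocks).
nested_html = markdown.markdown(two_space_source, tab_length=2)

# A 4-space sublist nests under the default settings.
four_space_html = markdown.markdown("- parent\n    - child\n")

print(default_html)
print(nested_html)
print(four_space_html)
```

Counting the `<ul>` tags in each result is a quick way to see whether the sublist was nested or flattened.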

These syntax rules come from John Gruber's original spec, but that canonical description of Markdown's syntax does not specify the syntax unambiguously. In fact, the CommonMark spec explains this in a way that directly contradicts the section above:

1.2 Why is a spec needed? John Gruber’s canonical description of Markdown’s syntax does not specify the syntax unambiguously. Here are some examples of questions it does not answer:

How much indentation is needed for a sublist? The spec says that continuation paragraphs need to be indented four spaces, but is not fully explicit about sublists. It is natural to think that they, too, must be indented four spaces, but Markdown.pl does not require that. This is hardly a “corner case,” and divergences between implementations on this issue often lead to surprises for users in real documents. (See this comment by John Gruber.)

Then what is Markdown.pl? We can read this quote from CommonMark:

It was developed by John Gruber (with help from Aaron Swartz) and released in 2004 in the form of a syntax description and a Perl script (Markdown.pl) for converting Markdown to HTML.

With all these facts in place, it is clear that Python-Markdown's claim that the rule is clear is in fact incorrect, and that it contradicts the spec even as defined by John Gruber, which Python-Markdown says it follows. John Gruber's own implementation does allow 2 spaces of indentation for sublists.

Why is all this important?

It is important for a plethora of reasons. The main reason is that some of the biggest Markdown formatters in the world follow CommonMark. Examples are VS Code's markdownlint extension and Prettier (the world's biggest code formatter), which uses remark. Both use 2-space indentation for sublists by default.

Now, Python-Markdown clearly states that it does not follow CommonMark, even though I believe that would have been the best option. But since it follows John Gruber's spec, and John Gruber both states and implements 2 spaces as indentation, I think this plugin should at least honor that.

Caveat

I have read the previous discussions about this in other issues, such as https://github.com/Python-Markdown/markdown/issues/3#issuecomment-63399983. But I find the explanation incorrect, as shown above. The explanation at https://github.com/Python-Markdown/markdown/issues/3#issuecomment-63399983 says "all other implementations contain a bug", which I find untrue; it is the other way around.

waylan commented 2 years ago

Sorry, but the quote from Gruber's rules which is contained in our explanation...

each subsequent paragraph in a list item must be indented by either 4 spaces or one tab

... can only be interpreted one way by me. Nothing will persuade me otherwise. Gruber simply failed to follow his own rule in his implementation. I have not made that mistake and never will. End of discussion.

That said, you are free to implement your own behavior in an extension/fork/clone/whatever.

waylan commented 2 years ago

It is important for a plethora of reasons. The main reason is that some of the biggest Markdown formatters in the world follow CommonMark.

Sorry, didn't see this the first time. That is a valid concern. However, I always ensure any documents I author are well-formed (they pass when fed to a strict Markdown linter). A well-formed document should always be indented using 4 spaces per level. As such, a well-formed document will always parse correctly in any Markdown or CommonMark parser. In other words, it's a non-issue for me. It is also the only way to ensure consistent parsing across all non-CommonMark parsers, as every one is slightly different in varying ways.
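The "4 spaces per level" convention described above can be checked mechanically. The following is only a rough sketch of such a check, not the linter referred to in the comment: the function name and the regular expression are hypothetical, it uses nothing beyond the standard library, and it deliberately ignores tabs, fenced code blocks, and other contexts a real linter would handle.

```python
import re

# Hypothetical pattern: optional leading spaces, a bullet or "1." style
# marker, at least one space, then some content.
LIST_ITEM = re.compile(r"^( *)(?:[*+-]|\d+\.) +\S")


def badly_indented_items(text, indent=4):
    """Return (line_number, leading_spaces) for every list item whose
    indentation is not a multiple of `indent`."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = LIST_ITEM.match(line)
        if match and len(match.group(1)) % indent != 0:
            problems.append((number, len(match.group(1))))
    return problems


two_space_report = badly_indented_items("- parent\n  - child\n")
four_space_report = badly_indented_items("- parent\n    - child\n")
print(two_space_report)
print(four_space_report)
```

Run over a document before publishing, a check along these lines would flag the 2-space sublists that formatters such as Prettier produce by default.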