Open wooorm opened 5 years ago
The reference parser does construct these from paragraphs (similarly setext headers). That's an implementation detail, though. If we didn't care about efficiency, we could simply have a separate block parser for these and backtrack.
Implementation details should indeed be in the appendix, agreed. But my issue is more that there’s nothing in the spec explaining why, to take a perhaps clearer example:
[
# alpha
]: https://example.com
[# alpha][]
Yields a heading.
Yes, I agree that more needs to be said about reference link definitions. I'm just not sure talking about "paragraphs" is the best way to do it.
I can’t see an easy solution.
One way would be to use “interrupting content” instead of “interrupting paragraphs”:
An indented code block cannot interrupt a content line (was: “a paragraph”). (This allows hanging indents and the like.)
ATX headings need not be separated from surrounding content by blank lines, and they can interrupt content lines (was: “paragraphs”):
...and then both definition “lines” and paragraphs fall into that category? 🤔
Another alternative would be just to say "interrupt a paragraph or a link reference definition."
Yeah, maybe that’s good! I’m not so sure about the word “paragraph”, though: setext headings are made from that construct, but as they are headings, they aren’t really paragraphs.
setext headings are made from that construct
That's just how they're handled in the reference implementation (for parsing efficiency). As far as the spec goes, they have nothing to do with paragraphs.
Another point of confusion for me: I don’t understand the interplay between paragraphs, setext headings, and definitions.
E.g.:
[a]: b
content?
a
=
content?
Yields:
content?
content?
Why can there be code after a setext heading, but not after a definition? I was expecting both content? lines to be paragraphs.
It seems to me the discussion above assumes that the CommonMark.js / Dingus behavior is the spec, and thus that the spec needs to be updated to conform to that behavior. I would suggest that this is the wrong way to look at it (with the one exception of backward compatibility, which should be maintained since it is a CommonMark spec goal).
For example, I'm working on an implementation of the CommonMark spec. It passes all the tests, yet does NOT treat the # alpha in @wooorm's example as a heading. It interprets it as the label of a link reference definition.
As far as the spec goes, they have nothing to do with paragraphs.
The reference parser does construct these from paragraphs (similarly setext headers). That's an implementation detail, though. If we didn't care about efficiency, we could simply have a separate block parser for these and backtrack.
This is what my implementation does.
Given Markdown's principles (reader oriented), to me the way one decides is by asking: What does the following look like to most readers?
[
# alpha
]: https://example.com
[# alpha][]
Though at the end of the day, it's an unimportant corner case. If the author of the above Markdown cared about the reader, they would not write something so unnecessary! The line breaks serve no purpose.
But also, by the same token, any inefficiency resulting from rules that would require backtracking (is look-ahead considered backtracking?) would only affect such corner cases.
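The two readings debated in this thread can be sketched in a few lines of Python. This is a toy under heavy assumptions, not the reference implementation or the spec's grammar: the `ATX` and `DEFINITION` patterns and the `parse` helper are hypothetical, and titles, `<...>` destinations, and label validation are all ignored. With `definition_first=False`, blocks are split first (so an ATX heading can interrupt the would-be label) and definitions are only looked for at the start of finished paragraphs. With `definition_first=True`, a definition is attempted across lines before any other block start is considered.

```python
import re

# Toy patterns (assumptions for illustration): no titles, no <...> destinations.
ATX = re.compile(r"^ {0,3}#{1,6}(?:[ \t]|$)")
DEFINITION = re.compile(
    r"\A {0,3}\[((?:[^\[\]\\]|\\.)+?)\]:"   # label, may span lines
    r"[ \t]*\n?[ \t]*(\S+)[ \t]*(?=\n|\Z)"  # destination, optionally on the next line
)


def _flush(para, blocks, definitions_from_paragraphs):
    """Close the open paragraph, optionally peeling definitions off its start."""
    text = "\n".join(para)
    para.clear()
    while definitions_from_paragraphs:
        m = DEFINITION.match(text)
        if not m:
            break
        blocks.append(("definition", " ".join(m.group(1).split())))
        text = text[m.end():].lstrip("\n")
    if text:
        blocks.append(("paragraph", text))


def parse(source, definition_first):
    lines = source.split("\n")
    blocks, para, i = [], [], 0
    while i < len(lines):
        line = lines[i]
        # A definition cannot interrupt a paragraph, so only try when none is open.
        if definition_first and not para:
            m = DEFINITION.match("\n".join(lines[i:]))
            if m:
                blocks.append(("definition", " ".join(m.group(1).split())))
                i += m.group(0).count("\n") + 1  # skip every line the match used
                continue
        if not line.strip():
            _flush(para, blocks, not definition_first)
        elif ATX.match(line):
            _flush(para, blocks, not definition_first)
            blocks.append(("heading", line.strip()))
        else:
            para.append(line)
        i += 1
    _flush(para, blocks, not definition_first)
    return blocks


EXAMPLE = "[\n# alpha\n]: https://example.com"
print(parse(EXAMPLE, definition_first=False))
print(parse(EXAMPLE, definition_first=True))
```

On this input the paragraph-first reading produces a paragraph, a heading, and another paragraph, while the definition-first reading produces a single definition labelled `# alpha`. Note that the definition-first variant has to look ahead over the remaining lines, which is the efficiency cost mentioned above.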
Problem
According to the spec text:
...is fine: it’s a proper link reference definition. This led me to believe that true streaming, as noted in § Appendix ¶ Phase 1, wouldn’t work: if the last apostrophe weren’t there, we’d need to backtrack. (We’d backtrack to the start, because the opening apostrophe is on the same line as the destination; if it were on its own line, the definition would be valid, but we’d still need to backtrack to parse the title again.)
To my surprise, the following is not a link reference definition (note one less space before a)...
# a is now a heading! Only then did I see that the Appendix contains:
Solution
I think it’s good to mention in the main text that link reference definitions are created from paragraphs, and to include a test for it. I’m not entirely sure how to describe this, though. This will also help prevent the blank lines that are currently possible in labels (GH-586).
Extra
As paragraph lines are made into actual paragraphs and definitions, setext heading lines come into play, so relating to GH-395, I think the following may also be interesting to expand upon:
Dingus: