jgm / pandoc

Universal markup converter
https://pandoc.org

Allow multiple-line setext headings like commonmark #4110

Open andrewacashner opened 7 years ago

andrewacashner commented 7 years ago

If you hard-wrap your source code at, say, 80 columns, it is not possible to write long section headers. If we only use ATX headers, the syntax requires a blank line before the header already; it seems like you would only need to require a blank line after the header to allow this behavior.

End of first section.

## Very Long Section Header that Needs to Be Written on
   Two Lines

Beginning of next section. 

I know there is a similar closed issue, but that issue covers both setext and ATX headers, and table headers as well, which is too complicated. I'm only requesting that, for ATX headers, everything between the hash marks and the next blank line be considered part of the header. Presently pandoc does capture the whole header above, but it inserts a line break where there is one in the source. I would like the whole string to be treated the same way as if it were on a single line.
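To make the request concrete, here is a minimal Python sketch of the behavior being asked for. It is not pandoc's implementation; `join_atx_heading` is a hypothetical helper that assumes the input is one non-empty block already delimited by blank lines, and it joins continuation lines with single spaces.

```python
import re

# An ATX opener: one to six '#' marks followed by whitespace and the heading text.
ATX_OPEN = re.compile(r"^(#{1,6})[ \t]+(.*)$")


def join_atx_heading(block_lines):
    """Treat a blank-line-delimited block that opens with '#' marks as one heading.

    Hypothetical helper: returns (level, text), or None if the block does not
    start with an ATX opener. Continuation lines are stripped and joined with
    single spaces, so a newline in the source acts like a space.
    """
    match = ATX_OPEN.match(block_lines[0])
    if match is None:
        return None
    level = len(match.group(1))
    parts = [match.group(2)] + list(block_lines[1:])
    text = " ".join(part.strip() for part in parts if part.strip())
    return level, text


block = [
    "## Very Long Section Header that Needs to Be Written on",
    "   Two Lines",
]
print(join_atx_heading(block))
```

The sketch deliberately ignores closing hash marks and inline markup; it only shows the line-joining rule.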

Another way to put this is that I think pandoc should ignore single newlines in ATX headers, just as it does for "lazy" block quotes with >:

End of first section.

> Very Long Block Quote that Can Be Written on
  Two Lines

Beginning of next section.

jgm commented 7 years ago

CommonMark does not allow this, and so we're not going to either. I don't want to further increase fragmentation.

CommonMark does now allow multiline Setext headers, however:

Line one
and two
-------

and I think we should support that.
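As a rough illustration of the multiline setext rule, the following Python sketch recognizes a block of text lines closed by an underline of `=` or `-`. It is a simplification, not the CommonMark reference algorithm or pandoc's parser: `parse_setext` is a hypothetical name, the input is assumed to be a single block with blank lines already removed, and precedence rules involving lists, thematic breaks, and other block types are ignored.

```python
import re

# A setext underline: up to three spaces of indent, then only '=' or only '-'.
SETEXT_UNDERLINE = re.compile(r"^ {0,3}(=+|-+)[ \t]*$")


def parse_setext(block_lines):
    """Return (level, text) if the block is text lines closed by an underline.

    Hypothetical helper: '=' underlines give level 1, '-' underlines level 2.
    The text lines are joined with spaces, which is a simplification of how
    a soft line break is usually rendered.
    """
    if len(block_lines) < 2:
        return None
    match = SETEXT_UNDERLINE.match(block_lines[-1])
    if match is None:
        return None
    level = 1 if match.group(1).startswith("=") else 2
    text = " ".join(line.strip() for line in block_lines[:-1])
    return level, text


print(parse_setext(["Line one", "and two", "-------"]))
```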

halloleo commented 5 years ago

Not as a nudge, just to be in the know: Any movement on this?

tarleb commented 5 years ago

For the time being, you can use a workaround like this: https://stackoverflow.com/a/52193443/2425163.

halloleo commented 5 years ago

I don't understand. The Stack Overflow answer talks about adding line breaks to the output, while what we want here is to break section headers across lines in the input without getting line breaks in the output, right?

tarleb commented 5 years ago

Ah yes, I got that backwards.

rose00 commented 4 years ago

I got here thinking of filing a bug, which I suppose would have been titled

Allow multiple-line atx headings unlike commonmark

I have noticed that pandoc (not in -fcommonmark mode) will sometimes gather multiple lines into one header, if there are text spans that make the lines "stick together". For example, this code creates a header by combining the two lines:

### _line one
 and line two_

If you leave out the underscores, the header is just line one.

This accidental behavior is useful to me, because I sometimes favor 80-column formats for the same reasons as other requesters.

Here are some more test cases showing the workarounds:

$ pandoc -v | head -1
pandoc 2.7.3
$ echo $'### line one\nand line two' | pandoc -tnative
[Header 3 ("line-one",[],[]) [Str "line",Space,Str "one"]
,Para [Str "and",Space,Str "line",Space,Str "two"]]
$ echo $'### _line one\nand line two_' | pandoc -tnative
[Header 3 ("line-one-and-line-two",[],[]) [Emph [Str "line",Space,Str "one",SoftBreak,Str "and",Space,Str "line",Space,Str "two"]]]
$ echo $'### [line one\nand line two]{}' | pandoc -tnative
[Header 3 ("line-one-and-line-two",[],[]) [Span ("",[],[]) [Str "line",Space,Str "one",SoftBreak,Str "and",Space,Str "line",Space,Str "two"]]]

I wonder, do these workarounds exploit bugs in pandoc, or features?

Also I wonder, moving forward, how many committee-driven compromises from commonmark, such as (apparently) no line continuation for ATX headers, will circle back around and confront us pandoc users?

jgm commented 4 years ago

> I wonder, do these workarounds exploit bugs in pandoc, or features?

Neither bugs nor features, but accidental and unplanned consequences of the way pandoc parses Markdown.

Commonmark has made a principled decision to prioritize block-level parsing: you can discern the block-level structure of the document before parsing any of it as inline content. That has a number of advantages, and I think it's the right decision.
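The block-first strategy can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the CommonMark reference implementation: `parse_blocks` is a hypothetical function that handles only ATX headings and paragraphs, and it decides every block boundary without ever inspecting inline markup such as `_..._`.

```python
import re

# An ATX heading occupies exactly one line: 1-6 '#' marks, whitespace, text.
ATX_LINE = re.compile(r"^(#{1,6})[ \t]+(.*)$")


def parse_blocks(source):
    """Phase 1: fix the block structure line by line, ignoring inline markup."""
    blocks = []
    paragraph = []

    def flush():
        # Close the paragraph collected so far, if there is one.
        if paragraph:
            blocks.append(("para", " ".join(paragraph)))
            paragraph.clear()

    for line in source.splitlines():
        match = ATX_LINE.match(line)
        if match:
            flush()
            blocks.append((f"h{len(match.group(1))}", match.group(2).strip()))
        elif line.strip() == "":
            flush()
        else:
            paragraph.append(line.strip())
    flush()
    return blocks


print(parse_blocks("### _line one\nand line two_"))
```

In this model the emphasis markers from the example above end up in two different blocks, so a later inline pass that works on one block at a time cannot pair them, and the heading stays on a single line.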