jgm / pandoc

Universal markup converter
https://pandoc.org

Allow list item indent level to be configured? #2210

Closed alex closed 7 years ago

alex commented 9 years ago

Given an input of

* This
* list
  * is
* nested

pandoc generates:

<ul>
<li>This</li>
<li>list</li>
<li>is</li>
<li>nested</li>
</ul>

Whereas the <li>is</li> should have its own ul.

ousia commented 9 years ago

@alex, you need a tab or four spaces before the indented item:

* This
* list
    * is
* nested

Once added, sample works as expected.
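To illustrate why the two-space version collapses into a flat list, here is a toy Python model of a strict four-space rule. It is only a sketch: `nesting_level` and `indent_unit` are names made up for this example, and pandoc's real parser is considerably more involved.

```python
def nesting_level(line: str, indent_unit: int = 4) -> int:
    """Toy model: count how many full indent units precede the list marker."""
    stripped = line.lstrip(" ")
    leading_spaces = len(line) - len(stripped)
    return leading_spaces // indent_unit


two_space = "  * is"
four_space = "    * is"

print(nesting_level(two_space))     # 0: treated as a sibling of the items above
print(nesting_level(four_space))    # 1: treated as a nested item
print(nesting_level(two_space, 2))  # 1: what a two-space rule would give
```

Under this simplified model the two-space item never reaches the first nesting level, which matches the flat <ul> shown in the original report.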

alex commented 9 years ago

Interesting -- 2 spaces works fine on github (not sure which markdown flavor pandoc is trying to implement).

Thanks!

ousia commented 9 years ago

@alex, then this may be a bug in Github Markdown.

But pandoc implements its own extension to Markdown.

jgm commented 9 years ago

http://pandoc.org/README.html#the-four-space-rule

sils commented 9 years ago

I do think many people use two-space list indentation. Every tool I have used so far supports it (GitHub, mkdocs, and some others).

I don't see any drawback for pandoc in supporting this as a feature, so my vote would go for this. It would make things a lot more comfortable for me and probably for a lot of other people too.

ousia commented 9 years ago

> http://pandoc.org/README.html#the-four-space-rule

@jgm, shouldn’t it be fixed in markdown_github, if GitHub uses a two-space rule for indentation in lists?

> I don't see any drawback for pandoc in supporting this as a feature, so my vote would go for this. It would make things a lot more comfortable for me and probably for a lot of other people too.

@sils1297, this would break compatibility if implemented in markdown as extended in pandoc.

andrewthad commented 9 years ago

I just had this spacing behavior bite me too. I also find it intuitive, given that the other markdown tools I use are happy with two spaces. I would be in favor of a change to an approach that allowed two spaces (hopefully remaining backwards compatible).

jgm commented 9 years ago

The problem is that you can't simply change four-space to two-space without affecting much else. (For example, how far do you need to indent a continuation paragraph for it to be included in the list item? How far to make a code block that's in a list item?)

In CommonMark we've tried to come up with a set of rules that make sense together and are more forgiving than the four-space rule. Eventually I'd like to make pandoc's Markdown parser compliant with CommonMark. For now, you could use the commonmark input format with pandoc -- as long as you don't need many pandoc extensions.
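For anyone who wants to try the commonmark reader on the sample from the top of this thread, the following is a minimal sketch. It assumes a `pandoc` binary on the PATH; the helper name `pandoc_command` is invented for this example, and no particular output is claimed here.

```python
import shutil
import subprocess
from typing import List


def pandoc_command(input_format: str = "commonmark", output_format: str = "html") -> List[str]:
    """Build the argument list for a pandoc run that reads stdin."""
    return ["pandoc", "--from", input_format, "--to", output_format]


# The two-space nested list from the original report.
SAMPLE = "* This\n* list\n  * is\n* nested\n"

if __name__ == "__main__":
    cmd = pandoc_command()
    if shutil.which("pandoc"):
        # Feed the sample on stdin and show whatever HTML pandoc produces.
        result = subprocess.run(cmd, input=SAMPLE, capture_output=True, text=True)
        print(result.stdout)
    else:
        print("pandoc not found; would run:", " ".join(cmd))
```

Running the same sample with `pandoc_command("markdown")` makes it easy to compare the two readers side by side.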

mikkosuonio commented 8 years ago

GitHub's help pages tell users to indent nested lists by two spaces. It would help first-time users if tools that handle GitHub Markdown supported this by default.

https://help.github.com/articles/basic-writing-and-formatting-syntax/#lists

jgm commented 8 years ago

See my comments in the commonmark spec, including comments on a "two-space rule." Of course, whatever the merits of the system GitHub is currently using, it would be good if pandoc supported it in markdown_github parsing. But it isn't easy to change this without changing much else, as noted above. (Nor is it clear how the 2-space rule is supposed to interact with rules for indentation of continuation paragraphs or indented code under list items.)

mikkosuonio commented 8 years ago

I see. It seems to me that the current functionality in pandoc is reasonable, although unfortunate as GitHub instructs otherwise. Would a warning for lists or a recommendation to use commonmark for GitHub markup be good enough for first-time users?

jgm commented 8 years ago

+++ Mikko Suonio [Feb 10 16 02:22 ]:

> I see. It seems to me that the current functionality in pandoc is reasonable, although unfortunate as GitHub instructs otherwise. Would a warning for lists or a recommendation to use commonmark for GitHub markup be good enough for first-time users?

The README already recommends using the 4-space indentation, which should work fine with GitHub and virtually every other Markdown implementation.

Using -f commonmark with GitHub is not a solution, since pandoc's commonmark reader doesn't parse most of the github extensions. (And GitHub doesn't use commonmark at this point.)

mikkosuonio commented 8 years ago

README indeed makes this perfectly clear. I did not notice that, since I was looking for 'github' in the docs. This is fine. Thank you for the clarification.

dattasid commented 8 years ago

Sorry about the necro post.

Using --tab-stop 2 on the command line will allow you to use 2 spaces for nesting lists. Not sure if this is a deprecated or undocumented feature; it works in 1.17.0.2-win.

jgm commented 8 years ago

+++ dattasid [Apr 28 16 13:12 ]:

> Using --tab-stop 2 on the command line will allow you to use 2 spaces for nesting lists. Not sure if this is a deprecated or undocumented feature; it works in 1.17.0.2-win.

Good point! However, this also means that indented code blocks will be indented 2 spaces.
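The coupling described here can be sketched with a toy classifier in which one `tab_stop` value drives both thresholds. This is an assumption-laden illustration, not pandoc's parser: the function `classify` and its parameters are hypothetical, and it only considers an indented line that follows a blank line.

```python
def classify(indent: int, tab_stop: int = 4, inside_list_item: bool = False) -> str:
    """Toy classification of an indented line that follows a blank line."""
    # Spaces needed just to remain inside the enclosing list item (0 at top level).
    base = tab_stop if inside_list_item else 0
    if indent >= base + tab_stop:
        return "indented code block"
    if inside_list_item and indent >= base:
        return "list item content"
    return "ordinary paragraph"


print(classify(2, tab_stop=4))                         # ordinary paragraph
print(classify(2, tab_stop=2))                         # indented code block
print(classify(2, tab_stop=2, inside_list_item=True))  # list item content
print(classify(4, tab_stop=2, inside_list_item=True))  # indented code block
```

In this model, lowering `tab_stop` to 2 makes two-space nesting work, but it also turns any top-level text indented by two spaces into a code block, which is the side effect noted above.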

ZelphirKaltstahl commented 7 years ago

As it stands now, there is still a non-unified way of writing paragraphs inside sublists. I'll show what I mean by that.

First of all my pandoc command is the following:

pandoc \
--read=markdown+startnum \
--preserve-tabs \
--top-level-division=chapter \
--standalone \
--template=template-book-de.latex \
--latex-engine=xelatex \
grobgliederung.md \
-o grobgliederung.latex

latexmk -xelatex grobgliederung.latex

Now to the list examples, working ones and not working ones:

1. something

    text

2. something

    text

This example works, so I am assuming that the needed indentation is four spaces, as in the four-space rule. A vague question forms in my mind about how I would put indented code blocks into lists, but I ignore it for now. I want to have paragraphs in sublists, so I go ahead and try the following.

1. something

    1.1. something

        text

2. something

    text

Oops. It does not work. The text in the 1.1. list item becomes monospaced. I have a guess and remove two spaces of the indentation of that paragraph as in the following:

1. something

    1.1. something

      text

2. something

    text

And now it works. I created a paragraph in a sublist. Yay!

So it is possible to get paragraphs where they belong, but right now it is not intuitive why I need 4 spaces of additional indentation on the first level and 2 additional spaces on the second level. I can also only guess at how many I'd need for a paragraph on the third level of sublist items. My guess would be "also 2", but it could also be 0.

It's not a huge issue, but it would be great to have a unified way of doing this. Maybe one could add a character to tell pandoc "This is a paragraph in a list item.". Maybe commonmark already handles these things well. I don't know commonmark, tbh.

EDIT

I take it back. The paragraph indented 6 spaces in total in the sublist is not at the right indentation level in the produced pdf. It is indented as if it was a paragraph on the first level. Screenshot:

screenshot

Which is produced from the following markdown source code:

# 4. Version {.unnumbered}

1. Abstract (Seiten: 1)

    Im Abstract werden Motivation, Problemstellung, Herangehensweise und Resultate der Arbeit kurz beschrieben.

2. Einleitung (Seiten 4)

    2.1. Motivation (Seiten: 1)

      (Ich verweise an dieser Stelle auf das Exposé der Masterarbeit.)

    2.2. Problemstellung (Seiten: 2)

More information

I am using an up-to-date pandoc version, which I compiled today from github sources using stack. The output of pandoc --version is the following:

pandoc 1.18
Compiled with pandoc-types 1.17.0.4, texmath 0.8.6.6, highlighting-kate 0.6.3
Default user data directory: /home/xiaolong/.pandoc
Copyright (C) 2006-2016 John MacFarlane
Web:  http://pandoc.org
This is free software; see the source for copying conditions.
There is no warranty, not even for merchantability or fitness
for a particular purpose.

I am using the following latexmk version:

Latexmk, John Collins, 24 February 2016. Version 4.44

Stack is the following version:

Version 1.1.2, Git revision 0d143ff492819dd4e5b5b680b3e2ad4dc17957ff (3671 commits) x86_64 hpack-0.14.0

jgm commented 7 years ago

> This example works, so I am assuming that the needed indentation is four spaces, as in the four-space rule. A vague question forms in my mind about how I would put indented code blocks into lists, but I ignore it for now.

You'd need four spaces indent to get into the list item, then four more for the code block, for a total of eight. This is explicit in Gruber's original Markdown syntax description.
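The arithmetic (four spaces to enter the list item, four more for the code block) can be written out as a small generator. This is a sketch under the four-space rule as stated in this comment; `code_block_in_list_item`, `depth`, and `unit` are names chosen for the example only.

```python
def code_block_in_list_item(item: str, code: str, depth: int = 1, unit: int = 4) -> str:
    """Build Markdown for a list item followed by an indented code block."""
    item_indent = " " * (unit * (depth - 1))   # indentation of the list marker itself
    code_indent = " " * (unit * depth + unit)  # into the item, then one more unit for code
    code_lines = [code_indent + line for line in code.splitlines()]
    return "\n".join([item_indent + item, "", *code_lines]) + "\n"


snippet = code_block_in_list_item("1. something", "print('hello')")
print(snippet)
```

For a top-level item (`depth=1`) the code line ends up indented by eight spaces; for an item nested one level deeper (`depth=2`) the same arithmetic gives twelve.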

> I want to have paragraphs in sublists, so I go ahead and try the following.

1. something

    1.1. something

        text

2. something

    text

> Oops. It does not work. The text in the 1.1. list item becomes monospaced.

The reason that doesn't work is that 1.1 can't start an ordered list item. See the manual for the supported ways of starting an ordered list. Since this isn't treated as a list item, text is indented relative to the containing list item and becomes a code block.

> And now it works. I created a paragraph in a sublist. Yay!

No you didn't. If you look at the HTML, you'll see that you just have a regular

<p>1.1 something</p>

and not an actual list item. See above for why.
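The point about `1.1.` not starting a list item can be illustrated with a rough regular expression. It is loosely modelled on the marker styles the manual describes (numbers, letters, roman numerals, or `#`, followed by a period or parenthesis) and deliberately ignores the manual's finer rules, so treat `MARKER` and `starts_ordered_item` as a simplified, hypothetical approximation rather than pandoc's actual grammar.

```python
import re

# One enumerator: decimal number, single letter, or a run of roman-numeral letters.
ENUM = r"(?:\d+|[A-Za-z]|[ivxlcdm]+|[IVXLCDM]+)"

# "#.", "(x)", "x." or "x)" followed by whitespace.
MARKER = re.compile(rf"^\s*(?:#\.|\({ENUM}\)|{ENUM}[.)])\s+")


def starts_ordered_item(line: str) -> bool:
    """Return True if the line begins with something shaped like an ordered list marker."""
    return MARKER.match(line) is not None


print(starts_ordered_item("1. something"))        # True
print(starts_ordered_item("    1.1. something"))  # False
print(starts_ordered_item("(2) something"))       # True
print(starts_ordered_item("iv. something"))       # True
```

Because `1.1.` is not followed by whitespace after the first period, the line falls through to plain paragraph text, which is consistent with the `<p>` element shown above.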

ZelphirKaltstahl commented 7 years ago

@jgm I had read the part in the original Markdown syntax description. I thought Pandoc markdown behaved differently in that way. Is there an extension / filter allowing such "continued enumerations"?

jgm commented 7 years ago

No - there's an issue somewhere in this tracker asking for the feature, but it's not currently a feature, nor is it possible to add it with a filter.

jgm commented 7 years ago

Trying to figure out why this issue is still open. I think the one possible action item would be seeing if list indentation can be cleanly separated from indented code indentation in the parser. If so, we might be able to make markdown_github behave a bit more like actual GFM in this respect, though I could give you lots of examples of GFM behavior that is inconsistent with the 2-space rule (or any other sensible rule).
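As a purely hypothetical sketch of what "separating list indentation from indented-code indentation" could mean, the two thresholds can be modelled as independent settings. Nothing here reflects pandoc's internals or actual GFM behavior; `IndentRules` and its fields are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndentRules:
    list_indent: int = 4  # spaces needed to nest under or continue a list item
    code_indent: int = 4  # spaces that turn a block into indented code

    def code_threshold_in_item(self) -> int:
        """Indentation at which a block inside a list item becomes code."""
        return self.list_indent + self.code_indent


four_space = IndentRules()
two_space_lists = IndentRules(list_indent=2)

print(four_space.code_threshold_in_item())       # 8
print(two_space_lists.code_threshold_in_item())  # 6
print(two_space_lists.code_indent)               # 4: top-level code blocks are unaffected
```

With a single shared setting (as with `--tab-stop`), changing the list indentation also moves the code-block threshold; keeping two fields is one way to avoid that coupling.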

ZelphirKaltstahl commented 7 years ago

@jgm I have an idea how it could be solved. I admit, it's a simple idea, but hopefully not too simple:

What if there were a special character placed in front of a paragraph inside a list, marking it as such? I don't know whether this would clash with other syntax, but here is one example of what it could look like:

1. text

    ~paragraph

    blockquote text

    1.1. text

        ~paragraph

2. text

wookayin commented 7 years ago

It would be nice if there is a configuration point for 4-space sublists!

jgm commented 7 years ago

Closing this with fix for #3511