asciidoctor / asciidoctor

:gem: A fast, open source text processor and publishing toolchain, written in Ruby, for converting AsciiDoc content to HTML 5, DocBook 5, and other formats.
https://asciidoctor.org

Indentation (leading spaces) should not be significant because it's often used for readability #686

Open ge0ffrey opened 11 years ago

ge0ffrey commented 11 years ago

In .java and .xml files, indentation (= leading spaces) has no effect on the output (XML has one unusual exception). In *.md files, it does have an effect on the output, but the rules do not prevent typical indentation for readability.

The AsciiDoc quick reference states "Lists can be indented. Leading whitespace is not significant." Unfortunately, that's not the case for *.adoc files:

Try indenting a complex list that contains nested lists, code fragments, and extra paragraphs in the list items. Something like this should do the trick (untested):

The following types of animals exist:

* Birds

* Mammals

    ** Land-Mammals
    +
    Here's how to describe a land-mammal of these in java:
    +
    [source,java]
    ----
    String s = "Gorilla";
    ----
    +
    Subtypes:

        *** Apes
        +
        WARNING: humans are apes.

        *** Rodents

    ** Sea-Mammals
    +
    NOTE: Sea-mammals breath air, not water

* Insects

In complex lists like these, indentation is crucial for a reader to be able to recognize the structure and avoid getting lost.

PS: trailing spaces should be insignificant too.

mojavelinux commented 11 years ago

List layout is an area of AsciiDoc I am certainly open to revisiting. As I mentioned in issue #623, which has a similar theme to this one, I feel tension in the syntax when attaching blocks to a list item.

The following statement from the AsciiDoc User Guide, which you cite in this issue, is simply inaccurate.

Lists can be indented. Leading whitespace is not significant.

It's more accurate to say:

Whitespace preceding the list marker on the first line of a list item is not significant.

For the remaining blocks associated with a list item in an outline list, leading whitespace is significant (normal AsciiDoc rules). In definition lists, whitespace that precedes the definition when it's on a line below the term is not significant. Again, for the remaining blocks associated with a list item in a definition list, leading whitespace is significant.
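The distinction drawn above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this comment, not Asciidoctor's actual parser: the `LIST_ITEM` pattern and both function names are hypothetical, and only `*` markers are handled.

```python
import re

# Hypothetical pattern: optional indent, one or more asterisks, whitespace, then text.
LIST_ITEM = re.compile(r"^(?P<indent>[ \t]*)(?P<marker>\*+)[ \t]+(?P<text>\S.*)$")


def parse_list_item(line):
    """Return (depth, text) for a list item line, or None if the line is not one."""
    m = LIST_ITEM.match(line)
    if m is None:
        return None
    # Depth comes from the marker length; the indent before the marker is ignored.
    return len(m.group("marker")), m.group("text")


def is_literal_paragraph_start(line):
    """A non-blank, non-list line that starts with whitespace is treated as literal."""
    return (
        line[:1] in (" ", "\t")
        and line.strip() != ""
        and parse_list_item(line) is None
    )
```

Under this sketch, `"    ** Land-Mammals"` and `"** Land-Mammals"` parse identically, whereas an indented `"    Subtypes:"` is flagged as the start of a literal paragraph, which is why the fully indented example in the issue does not behave as intended.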

With that clarified, I'd like to move on to a possible compromise that works within the core constraints.

The AsciiDoc processors make a deep-rooted assumption that blocks are defined on the left-hand margin. This is about more than just ease of processing and speed. It's an intentional departure from indentation in XML. Fixing indentation is a silent time killer. Taking the option away removes the temptation for a writer to waste time on it.

Thus, allowing blocks to be indented would have a ripple effect through the processor logic and break a lot of existing documents. I'm not saying we will never do it, but I'd like to explore and/or rule out alternatives before we go there. For now, let's work with the constraint that blocks have to be at the left margin.

What makes a list item with complex content hard to read is the lack of a vertical guide to indicate association. Here's how your example looks today in AsciiDoc:

The following types of animals exist:

* Birds

* Mammals

    ** Land-Mammals
+
Here's how to describe a land-mammal of these in java:
+
[source,java]
----
String s = "Gorilla";
----
+
Subtypes:

        *** Apes
+
WARNING: humans are apes.

        *** Rodents

    ** Sea-Mammals
+
NOTE: Sea-mammals breath air, not water

* Insects

Once your eye leaves "Land-Mammals", it's hard to keep track of the list's depth. What might help is to allow the list continuation character (+) to align with the list marker. Here's how that looks:

The following types of animals exist:

* Birds

* Mammals

    ** Land-Mammals
    +
Here's how to describe a land-mammal of these in java:
    +
[source,java]
----
String s = "Gorilla";
----
    +
Subtypes:

        *** Apes
        +
WARNING: humans are apes.

        *** Rodents

    ** Sea-Mammals
    +
NOTE: Sea-mammals breath air, not water

* Insects

With this change, your eye can at least follow the list continuation characters vertically to keep track of where you are.
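One way to picture this change is as a normalization pass that moves indented continuation lines back to the left margin before normal parsing. The following Python sketch assumes that approach; `left_align_continuations` is a hypothetical helper, not part of any AsciiDoc processor, and it only recognizes `----` and `....` as verbatim block delimiters.

```python
def left_align_continuations(lines):
    """Move indented '+' continuation lines to the left margin.

    Lines inside ---- or .... delimited blocks are left untouched, since a
    lone '+' there is block content, not a list continuation.
    """
    out = []
    fence = None  # delimiter of the verbatim block we are currently inside, if any
    for line in lines:
        bare = line.rstrip()
        if fence is not None:
            if bare == fence:
                fence = None  # closing delimiter ends the verbatim block
            out.append(line)
        elif bare in ("----", "...."):
            fence = bare  # opening delimiter starts a verbatim block
            out.append(line)
        elif line.strip() == "+":
            out.append("+")  # indented continuation becomes a flush-left one
        else:
            out.append(line)
    return out
```

Because every other line is passed through unchanged, a document that already keeps its `+` at the left margin would be unaffected by such a pass.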

Leading whitespace in paragraphs is much easier to play with since it's already allowed in the syntax. We could recognize the conceptual shift of the left margin and make whitespace equal to that offset insignificant. Here's how that looks:

The following types of animals exist:

* Birds

* Mammals

    ** Land-Mammals
    +
    Here's how to describe a land-mammal of these in java:
    +
[source,java]
----
String s = "Gorilla";
----
    +
    Subtypes:

        *** Apes
        +
        WARNING: humans are apes.

        *** Rodents

    ** Sea-Mammals
    +
    NOTE: Sea-mammals breath air, not water

* Insects

With the exception of the listing block, that matches your original proposal. When I see this syntax, I don't get an uneasy feeling like I do when I look at the first example in this comment. I think we've come a long way.
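The "offset" idea can be sketched as a preprocessing step: remember how far the most recent list marker was indented, and strip at most that many leading spaces from the lines that follow. The Python below is a rough sketch under that assumption; `strip_list_offset` and `LIST_ITEM` are hypothetical names, only `*` markers and space indentation are handled, and delimited blocks are not treated specially.

```python
import re

# Hypothetical pattern: leading spaces, asterisks, spaces, then the item text.
LIST_ITEM = re.compile(r"^( *)\*+ +\S")


def strip_list_offset(lines):
    """Remove up to the current list marker's indent from following lines."""
    out = []
    offset = 0  # indent of the most recent list marker
    for line in lines:
        m = LIST_ITEM.match(line)
        if m:
            offset = len(m.group(1))  # a new item moves the conceptual left margin
            out.append(line)
            continue
        indent = len(line) - len(line.lstrip(" "))
        cut = min(indent, offset)  # never strip more than the offset
        out.append(line[cut:])
    return out
```

Any indentation beyond the offset survives the pass, so a line indented one space past the marker would still start with a space afterward and could still be read as a literal paragraph.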

wdyt?

lordofthejars commented 11 years ago

I think the last example is the most readable list layout; at a quick glance, the reader knows exactly what the lists are meant to express.

(Sent by email in reply to Dan Allen's comment of Saturday, 12 October 2013.)

ge0ffrey commented 11 years ago

I think the last example, and therefore the improvement "allow the list continuation character (+) to align with the list marker", is progress. But the listing block still seems broken and weird... Why doesn't the offset rule apply to it? Is there a way to submit a change request for the AsciiDoc spec, or is it set in stone?

"Fixing indentation is a silent time killer. Taking the option away removes the temptation for a writer to waste time on it." But removing all indentation kills readability I think (try it on a java file ;-). I wouldn't mind forced indentation (like markdown does) though, to avoid the time killer problem.

mojavelinux commented 9 years ago

@peff based on recent changes to the git source code, I think you'll be interested in this change (and the related discussion in #623). I think we can do better than the current left-aligned list continuation line and find something that feels more natural...but we do have to be careful to stay as true to the benefits of AsciiDoc as possible.

peff commented 9 years ago

@mojavelinux IMHO this is a huge benefit to readability of the source. Unfortunately, I think git is committed for the time being to generating primarily with the original asciidoc, so we can't take advantage of any improvements yet (I don't think switching tools is out of the realm of possibility, but it hasn't really been discussed seriously).

mojavelinux commented 9 years ago

IMHO this is a huge benefit to readability of the source.

I keep getting the same sense when writing AsciiDoc. I'll definitely be interested to hear your thoughts as I put forth some previews to test out (not sure when yet).

I don't think switching tools is out of the realm of possibility, but it hasn't really been discussed seriously

There's no rush, but it's probably inevitable as AsciiDoc Python continues to fall behind (and has such poor performance in comparison). Fortunately, we have lots of alternative choices (Asciidoctor, Asciidoctor.js, AsciidoctorJ), and hopefully new independent ones as we get serious about a spec.

ggrossetie commented 4 years ago

With the exception of the listing block, that matches your original proposal.

If we cannot lift this exception, I think it would be detrimental. Does it mean that admonition blocks should not be indented? In other words, under this proposal, the following is valid:

* Birds
* Mammals
  ** Sea-Mammals
  +
  NOTE: Sea-mammals breath air, not water
* Insects

But the following is not?

* Birds
* Mammals
  ** Sea-Mammals
  +
  [NOTE]
  ====
  Sea-mammals breath air, not water
  ====
* Insects

What about literal paragraphs (indented by one space)? Will the following no longer work?

* open a terminal and type:
+
 $ cat hello.txt

* enjoy

The AsciiDoc processors make a deep-rooted assumption that blocks are defined on the left-hand margin. This is about more than just ease of processing and speed. It's an intentional departure from indentation in XML. Fixing indentation is a silent time killer. Taking the option away removes the temptation for a writer to waste time on it.

💯

But removing all indentation kills readability I think (try it on a java file ;-)

I don't think that's true. When you write sections you do not indent the content below a section title:

== Section 1

content.

=== Section 1.1

content.

== Section 2

content.

Similarly, the depth of list items is explicit:

* First level item
** Second level item
*** Third level item

In the above examples, we do not indent the content, yet it's readable.

Having said that, I agree that indentation can improve readability for complex lists but as stated in https://github.com/asciidoctor/asciidoctor/issues/623#issuecomment-674062391 I think that complex lists are hard to read because we have to use the + continuation symbol.

Here's how your example looks with "implicit" list continuation:

The following types of animals exist:

* Birds
* Mammals
** Land-Mammals

Here's how to describe a land-mammal of these in java:

[source,java]
----
String s = "Gorilla";
----

Subtypes:

*** Apes

WARNING: humans are apes.

*** Rodents

** Sea-Mammals

NOTE: Sea-mammals breath air, not water

* Insects

gzagatti commented 3 years ago

I am wondering if there have been any updates on this front.

I find that allowing indentation of complex lists would be a game changer for AsciiDoc. To begin with, nested lists are often displayed indented in published material, be it on the web or in print. This is an important typesetting device to improve reading flow because it visually establishes the hierarchy between list items, which fully flush-left text is not able to convey. It is for this very reason that even code that does not require indentation (such as the Ruby source code in this repo) is usually indented.

In thread #623, I have seen some comparisons between lists and headings. However, I feel they serve completely different purposes. On the one hand, headings are usually short (no more than one line) and they serve to hierarchically divide a long text. Thus, we would not use chapters in a 1,000 word essay. Therefore, a reader can more easily locate herself in the text by anchoring herself to the headings, which are short. On the other hand, lists are used to list or itemize related concepts/ideas, from simple shopping lists to complex animal taxonomies.

The popularity of Python is testament to the fact that indentation is very easy to use in plain text, even for the least experienced programmer. It would thus be great to see it incorporated in the AsciiDoc spec as an alternative to the +. Like many in this thread, I feel that the + tends to clutter the text, especially when list items are already complex. Anchors like + are seldom used in published material. Also, allowing indentation would make it easier for others to transition from Markdown.

I understand that one of the most significant rules in AsciiDoc is that

AsciiDoc block content is flat, meaning it's aligned to the left of the page so as to make the content as portable and "stateless" as possible.

However, the + ligature is not stateless as it needs to check the previous line to determine whether the following paragraph is part of a list block. This will then determine if the next paragraph will continue the list.

Indentation plus alignment could serve a similar purpose to the +. If a paragraph is indented, the processor would look at the previous paragraph to check whether the two are aligned; if they are, it would then determine whether that paragraph is part of a list block. My understanding is that only literal blocks use indentation. Unless there is another case, my proposal would be either to drop the use of indentation for literal blocks in lists (since there are at least two other ways of adding them to the text), or to align them (and in fact everything else) with respect to the list block and account for this case when parsing the text.
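As a rough illustration of the alignment check described above (not a description of how any AsciiDoc processor works), the Python sketch below labels each line of a small sample. The function name `classify_by_alignment`, the labels, and the sample lines in the usage note are all made up for this example, and only `*` markers with space indentation are considered.

```python
import re

# Hypothetical pattern: captures everything up to where the item text starts.
LIST_ITEM = re.compile(r"^(?P<lead> *\*+ +)\S")


def classify_by_alignment(lines):
    """Label each line as 'item', 'blank', 'attached', or 'other'.

    A line counts as 'attached' when its indent equals the column at which
    the text of the most recent list item begins.
    """
    labels = []
    text_col = None  # column where the current list item's text starts
    for line in lines:
        m = LIST_ITEM.match(line)
        if m:
            text_col = len(m.group("lead"))
            labels.append("item")
        elif line.strip() == "":
            labels.append("blank")
        else:
            indent = len(line) - len(line.lstrip(" "))
            if text_col is not None and indent == text_col:
                labels.append("attached")
            else:
                labels.append("other")
            if indent == 0:
                text_col = None  # a flush-left paragraph ends the list
    return labels
```

For the sample `["* Mammals", "", "  Warm-blooded animals.", "", "   $ cat hello.txt", "", "Next paragraph."]`, the third line is labeled `attached`, while the line indented by three spaces is labeled `other`: the sketch cannot tell whether that line is a misaligned paragraph or an intended literal paragraph, which is the case the proposal would have to settle.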

Asciidoctor's speed is undoubtedly one of its selling points. However, I feel it is hard to justify the lack of indentation of nested lists simply on the grounds of development constraints. If that were the case, we would perhaps all be writing HTML or LaTeX, both of which are unambiguous with respect to list hierarchy and are also markup languages. Is it a question of updating the spec or the problem is really technical?

I have been maintaining my blog with Asciidoctor and even developed a plugin of my own. So a big thanks to the developers. However, this remains one of the biggest pain points in using the language.

mojavelinux commented 2 years ago

Is it a question of updating the spec or the problem is really technical?

First and foremost, it's a concern for the spec. We can't be making fundamental changes to the AsciiDoc language from Asciidoctor anymore. It has to go through the spec process (even if it's just a recommendation).

When it comes up there, what I can tell you is that indentation can be very ambiguous. It always seems easy to make a language change when you are thinking about the case you want. But once you think about the cases it shouldn't match but does, it gets much, much more complicated. Having worked on Asciidoctor for over a decade, I can tell you that changes like this bite you more often than not.