jgm / djot

A light markup language
https://djot.net
MIT License

Parsing inconsistency: lazy paragraph lines in lists #290

Closed by rauschma 3 months ago

rauschma commented 3 months ago

I found the following parsing inconsistency. It’s obvious why it happens but it feels like an argument in favor of requiring indentation – because without it you once again have to be careful about not starting paragraph lines with special characters (thematic breaks, list item bullets, etc.).

Inside a list

Input:

* abc
- - -

Output (the three hyphens become an <hr> after the list and not regular text inside the list item’s paragraph):

<ul>
<li>
abc
</li>
</ul>
<hr>

Normal paragraph

Input:

abc
- - -

Output:

<p>abc
- - -</p>

jgm commented 3 months ago

I wouldn't call it an inconsistency; it's just that the behavior of lazy lines isn't fully described in the (rather sketchy) documentation.

The basic rule the parser uses is this: If a line would not start any other kind of block (besides a paragraph), and the last open container is a paragraph, then it gets added to that paragraph.

Since this line would be a thematic break, it doesn't qualify as a lazy paragraph continuation.

Certainly it's a good practice not to use laziness if you want to avoid this sort of thing. But I don't think that's an argument for banishing laziness.
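
For readers who want to see the rule spelled out, here is a minimal Python sketch of the lazy-continuation check described above. It is not djot's actual implementation: the names `BLOCK_STARTS`, `starts_other_block` and `is_lazy_continuation` are hypothetical, and the patterns are a deliberately simplified stand-in for the real block-start rules. It only models the check for a line that has already failed to match its open containers (for example, a line that is not indented under its list item).

```python
import re
from typing import Optional

# Hypothetical, simplified block-start patterns (assumption: the real
# parser recognizes more block types and more precise syntax than this).
BLOCK_STARTS = {
    "thematic_break": re.compile(r"^\s*(?:[-*]\s*){3,}$"),
    "list_item": re.compile(r"^\s*[-*+]\s+\S"),
    "block_quote": re.compile(r"^\s*>(?:\s|$)"),
    "heading": re.compile(r"^\s*#+\s"),
}


def starts_other_block(line: str) -> Optional[str]:
    """Return the name of the non-paragraph block this line would start, if any."""
    for name, pattern in BLOCK_STARTS.items():
        if pattern.match(line):
            return name
    return None


def is_lazy_continuation(line: str, last_open_container: str) -> bool:
    """Apply the rule: the last open container is a paragraph and the
    line would not start any other kind of block."""
    if not line.strip():
        return False  # a blank line never continues a paragraph
    return last_open_container == "paragraph" and starts_other_block(line) is None


# The unindented second line from the list example in this issue:
print(is_lazy_continuation("- - -", "paragraph"))  # False: it would start a thematic break
print(is_lazy_continuation("def", "paragraph"))    # True: plain text joins the paragraph
```

Under these assumptions, `- - -` after `* abc` is rejected as a lazy line and falls out of the list item, which matches the `<hr>` output reported above.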

rauschma commented 3 months ago

Ah, I understand: For lazy lines, you have to be more careful w.r.t. how you start your lines.

rauschma commented 3 months ago

jgm wrote: “The basic rule the parser uses is this: If a line would not start any other kind of block (besides a paragraph), and the last open container is a paragraph, then it gets added to that paragraph.”

As an aside: It’s interesting that this rule applies at any level of nesting (via indentation or quoting) – each of the following pairs of lines produces the same output.

> > > abc
> > > def

> > > abc
> > def

> > > abc
> def

> > > abc
def
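
To make the equivalence of these four pairs concrete, here is a small Python sketch. It is an illustration under stated assumptions, not the parser's real logic: `strip_quote_markers` and `fold_lazy_pair` are hypothetical helpers, a quote marker is assumed to be `>` followed by at most one space, and the second line is assumed to be plain text that would not start any other block.

```python
import re
from typing import Tuple

# Assumption: a block-quote marker is ">" followed by at most one space.
QUOTE_MARKER = re.compile(r" *> ?")


def strip_quote_markers(line: str) -> Tuple[int, str]:
    """Return (number of leading '>' markers, remaining text)."""
    depth = 0
    while True:
        match = QUOTE_MARKER.match(line)
        if match is None:
            return depth, line
        depth += 1
        line = line[match.end():]


def fold_lazy_pair(first: str, second: str) -> Tuple[int, str]:
    """Join a paragraph's first line with a continuation line.

    The nesting depth is fixed by the first line. The second line may
    carry fewer markers; its remaining text is appended lazily.
    """
    open_depth, text = strip_quote_markers(first)
    _, continuation = strip_quote_markers(second)
    return open_depth, text + "\n" + continuation


pairs = [
    ("> > > abc", "> > > def"),
    ("> > > abc", "> > def"),
    ("> > > abc", "> def"),
    ("> > > abc", "def"),
]
# Every pair folds to the same depth-3 paragraph.
print({fold_lazy_pair(first, second) for first, second in pairs})
```

In this sketch the number of markers on the second line has no effect on the result, which is the behavior observed above.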

vassudanagunta commented 3 months ago

Yes, that behavior is very counterintuitive:

>> This is inside the nested block quote.
>And so is this.
>And this.
>
>But not this.

renders like this.

But that link also shows the dominance of that interpretation, so it can't really be changed. Gruber's original Markdown also does the same.