vmg / redcarpet

The safe Markdown parser, reloaded.
MIT License
4.99k stars 526 forks source link

Inconstant List Indentation #615

Open deckar01 opened 7 years ago

deckar01 commented 7 years ago

Input

 - Item 1
   - Item 2
     - Item 3
       - Item 4
         - Item 5

Output

<ul>
    <li>Item 1
        <ul>
            <li>Item 2
                <ul>
                    <li>Item 3</li>
                    <li>Item 4
                        <ul>
                            <li>Item 5</li>
                        </ul>
                    </li>
                </ul>
            </li>
        </ul>
    </li>
</ul>

screen shot 2017-04-06 at 1 24 30 pm

Expected Output

Each bullet with more indentation than the previous bullet should start a child list.

References

https://gitlab.com/gitlab-org/gitlab-ce/issues/3470

deckar01 commented 7 years ago

It looks like the list parsing recursion is happening in ext/redcarpet/markdown.c#L1886-L1887.

has_next_uli = prefix_uli(data + beg + i, end - beg - i);
has_next_oli = prefix_oli(data + beg + i, end - beg - i);

It looks like the logic for determining if the following line is a sibling list item is not properly matching the indentation. Both variables should be false, and it should parse the following lines as a nested block.

progmem commented 7 years ago

I believe I have a potential fix for this issue, and will open up a pull request. I noticed when peeking at the work buffers that indentation was being lost or not properly handled, so this fix focuses on the indentation handler at ext/redcarpet/markdown.c#L1873-L1876 .

while (i < 4 && beg + i < end && data[beg + i] == ' ')
  i++;

pre = i;

This fix works by initializing pre to a value of 4, and substituting the current fixed value of 4 for pre. This allows the initial indentation check to fire correctly, and additionally, allows for that indentation range to narrow down to the amount of indentation we need.

This fix does, however, expose a quirk that I've noted in the commit message for markdown.c; in short, max_nesting has an effect on lists, with each level from that point adding n to the current nest level (1 + 2 + ... + n). This means the input supplied in the original message will work, as it hits a nest level of 15. However, a list 6 levels deep will have issues due to the max_nesting being exceeded (21 vs. 16). This was verified by monitoring the nest level as the process was running. A temporary hard-coding of max_nesting to 64 allowed a test of 10 levels deep to pass without issue.

noraj commented 6 years ago

Related issue in other projects

github-markup solution: use commonmarker that respect CommonMark (article

Your real problem is CommonMark conformance, as long as this issue is not fixed there will be a lot of others like this one.

RaymondFallon commented 5 years ago

Has there been any progress on this issue?

midnight-wonderer commented 8 months ago

Wow, didn't know a bug related to simple nested lists, which I encountered today, has such a long history. I don't write C that much, so I don't know why it is so difficult to fix the issue.

Anyway, thank you for creating/maintaining Redcarpet.

But do people here have any alternative gem recommendations? I want something similar to Redcarpet in the way that it lets me customize output HTML.

For example:

# text

can be rendered to

<h1><span class='custom'>text</span></h1>

This bug is a deal breaker for me; any recommended alternatives are welcome.