sirthias / pegdown

A pure-Java Markdown processor based on a parboiled PEG parser supporting a number of extensions
http://pegdown.org
Apache License 2.0

Fix for compound nested list issues, passes regression tests. #171

Closed by vsch 9 years ago

vsch commented 9 years ago

Fix for compound nested list issues, passes regression tests.

License statement:

I am the original author of this fix and make it available under the same license as pegdown.

Fixes issues: #57, #123

This PR fixes list parsing, including the tight and loose sub-list handling that the previous fix did not address. It also leaves definition lists intact: they relied on the old list parsing rule, so the definition list rule is now a copy of that old rule and the regression tests pass.

Change to pegdown.spec: AstText.ast now reflects that the new list item parsing does not include trailing blank lines in the item. As a result, a blank line of 2 spaces at the end of the list is left out, which causes a discrepancy of 3 characters for the BulletListNode. The expectation file has been modified to match.

I'll create another PR in a day or two with new tests to validate compound nested lists.

sbt test result summary:

[info] Passed: Total 19, Failed 0, Errors 0, Passed 19
[success] Total time: 2 s, completed 14-Aug-2015 12:04:09 AM

Output of parsing a compound nested list follows. There are minor differences from GitHub, which makes the list items both above and below a blank line loose; pegdown makes only the item following a blank line loose. It's a toss-up, but I think pegdown's results are more intuitive:

For:

2. XYZ 1

    XYZ 1's paragraph

    1. A 1
    2. A 2

    3. A 3

GitHub produces:

  1. XYZ 1

    XYZ 1’s paragraph

    1. A 1
    2. A 2

    3. A 3

      A 3’s paragraph

while pegdown produces what you would expect by looking at the markdown source:

  1. XYZ 1

    XYZ 1’s paragraph

    1. A 1
    2. A 2
    3. A 3

      A 3’s paragraph
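
To make the difference concrete, here is a toy Java model of the two rules as worded above. It is not pegdown's parser code: the class and method names are made up for illustration, and it only looks at blank lines between sibling items, ignoring nested content and any other cases the real parser handles.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in pegdown.
class LooseItemRules {

    // blankBefore[i] is true when a blank line separates item i from item i-1.

    // Rule attributed to pegdown above: only the item following a blank line is loose.
    static boolean[] looseAfterBlankOnly(boolean[] blankBefore) {
        boolean[] loose = new boolean[blankBefore.length];
        for (int i = 0; i < blankBefore.length; i++) {
            loose[i] = blankBefore[i];
        }
        return loose;
    }

    // Rule attributed to GitHub above: items above and below a blank line are loose.
    static boolean[] looseAroundBlank(boolean[] blankBefore) {
        boolean[] loose = new boolean[blankBefore.length];
        for (int i = 0; i < blankBefore.length; i++) {
            boolean blankAfter = i + 1 < blankBefore.length && blankBefore[i + 1];
            loose[i] = blankBefore[i] || blankAfter;
        }
        return loose;
    }

    public static void main(String[] args) {
        // "1. A 1", "2. A 2", blank line, "3. A 3"
        boolean[] blankBefore = {false, false, true};
        System.out.println(Arrays.toString(looseAfterBlankOnly(blankBefore))); // [false, false, true]
        System.out.println(Arrays.toString(looseAroundBlank(blankBefore)));    // [false, true, true]
    }
}
```

For the A 1 / A 2 / A 3 sub-list, the first rule marks only A 3 as loose, while the second also marks A 2.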

Full output, in four sections: raw markdown; markdown, for GitHub to convert to HTML; HTML, converted by pegdown and rendered by GitHub for comparison; and raw HTML, for the curious.

-- raw markdown -------------------------------------------


1. B 1

    1. A 1

    2. A 2

<!-- end list -->

1. B 1
    1. A 1

    2. A 2

<!-- end list -->

1. B 1

    1. A 1
    2. A 2

<!-- end list -->

2. XYZ 1

    XYZ 1's paragraph

    1. A 1
    2. A 2

    3. A 3

        A 3's paragraph

        - XYZ
            1. abc
            1. abc
            1. abc
        - XYZ
            - abc
            fenced code block of sub-sub-list abc
            ```

        - abc
        - abc
    - XYZ

    Lorem ipsum...
  1. XYZ 2

    1. A 2
      • XYZ 2
  1. List 1 Item 1

    • sub list item 1 sub list item 1 continuation line

      sub list item 1 code block 1

      sub list item 1 paragraph

      sub list item 1 code block 2
    • sub list item 2

    List 1 Item 1 paragraph

    List 1 Item 1 code block

    List 1 Item 1 paragraph or main code block


-- markdown -----------------------------------------------
1. B 1
   1. A 1
   2. A 2

<!-- end list -->
1. B 1
   1. A 1
   2. A 2

<!-- end list -->
1. B 1
   1. A 1
   2. A 2

<!-- end list -->
1. XYZ 1

   XYZ 1's paragraph
   1. A 1
   2. A 2
   3. A 3

      A 3's paragraph
      - XYZ
        1. abc
        2. abc
        3. abc
      - XYZ
        - abc
      fenced code block of sub-sub-list abc
      ```
    - abc
    - abc
  - XYZ

  Lorem ipsum...
  1. XYZ 2
    1. A 2
      • XYZ 2
  1. List 1 Item 1

    • sub list item 1 sub list item 1 continuation line

      sub list item 1 code block 1

      sub list item 1 paragraph

      sub list item 1 code block 2
    • sub list item 2

    List 1 Item 1 paragraph

    List 1 Item 1 code block

    List 1 Item 1 paragraph or main code block

-- HTML ---------------------------------------------------

  1. B 1

    1. A 1

    2. A 2

  1. B 1
    1. A 1

    2. A 2

  1. B 1

    1. A 1
    2. A 2
  1. XYZ 1

    XYZ 1’s paragraph

    1. A 1
    2. A 2
    3. A 3

      A 3’s paragraph

      • XYZ
        1. abc
        2. abc
        3. abc
      • XYZ
        • abc

          fenced code block of sub-sub-list abc
          
        • abc

        • abc
      • XYZ

      Lorem ipsum…

  2. XYZ 2

    1. A 2
      • XYZ 2
  1. List 1 Item 1

    • sub list item 1
      sub list item 1 continuation line

      sub list item 1 code block 1
      

      sub list item 1 paragraph

      sub list item 1 code block 2
      
    • sub list item 2

    List 1 Item 1 paragraph

    List 1 Item 1 code block
    

    List 1 Item 1 paragraph or main code block


-- raw HTML -----------------------------------------------


<ol>
  <li>
    <p>B 1</p>
    <ol>
      <li>
      <p>A 1</p></li>
      <li>
      <p>A 2</p></li>
    </ol>
  </li>
</ol>
<!-- end list -->
<ol>
  <li>B 1
    <ol>
      <li>
      <p>A 1</p></li>
      <li>
      <p>A 2</p></li>
    </ol>
  </li>
</ol>
<!-- end list -->
<ol>
  <li>
    <p>B 1</p>
    <ol>
      <li>A 1</li>
      <li>A 2</li>
    </ol>
  </li>
</ol>
<!-- end list -->
<ol>
  <li>
    <p>XYZ 1</p>
    <p>XYZ 1’s paragraph</p>
    <ol>
      <li>A 1</li>
      <li>A 2</li>
      <li>
        <p>A 3</p>
        <p>A 3’s paragraph</p>
        <ul>
          <li>XYZ
            <ol>
              <li>abc</li>
              <li>abc</li>
              <li>abc</li>
            </ol>
          </li>
          <li>XYZ
            <ul>
              <li>
                <p>abc</p>
                <pre><code>fenced code block of sub-sub-list abc
</code></pre>
              </li>
              <li>
              <p>abc</p></li>
              <li>abc</li>
            </ul>
          </li>
          <li>XYZ</li>
        </ul>
        <p>Lorem ipsum…</p>
      </li>
    </ol>
  </li>
  <li>
    <p>XYZ 2</p>
    <ol>
      <li>A 2
        <ul>
          <li>XYZ 2</li>
        </ul>
      </li>
    </ol>
  </li>
</ol>
<!-- end list -->
<ol>
  <li>
    <p>List 1 Item 1</p>
    <ul>
      <li>
        <p>sub list item 1<br/>sub list item 1 continuation line</p>
        <pre><code>sub list item 1 code block 1
</code></pre>
        <p>sub list item 1 paragraph</p>
        <pre><code>sub list item 1 code block 2
</code></pre>
      </li>
      <li>
      <p>sub list item 2</p></li>
    </ul>
    <p>List 1 Item 1 paragraph</p>
    <pre><code>List 1 Item 1 code block
</code></pre>
    <p>List 1 Item 1 paragraph or main code block</p>
  </li>
</ol>

vsch commented 9 years ago

I have opened a new PR with new tests and updated regression tests for the parser fixes.