gomarkdown / markdown

markdown parser and HTML renderer for Go

code block after ordered list - parsing breaks #261

Closed DenisDuev closed 1 year ago

DenisDuev commented 2 years ago

The following code block, placed directly after an ordered list item (see raw)

    1. listitem
    ```yaml
    kinds:
    - Deployment
    - Pod
    ```


Rendered as:

<ol>
<li>listitem
&ldquo;`yaml
kinds:</li>
</ol>

<ul>
<li>Deployment</li>
<li>Pod
&ldquo;`</li>
</ul>

instead of as a code block. The problem appears to be the YAML array inside the code block.

See example: https://play.golang.com/p/quLdxpiDP-n

Expected

<ol>
<li>listitem</li>
</ol>

<pre><code class="language-yaml">kinds:
- Deployment
- Pod
</code></pre>

See babelmark

miekg commented 2 years ago

this seems to help, but because we only look at chunk I fear it's not a full fix:

diff --git parser/block.go parser/block.go
index eda9be7..a9672da 100644
--- parser/block.go
+++ parser/block.go
@@ -1468,6 +1468,16 @@ gatherlines:
                        }
                        *flags |= ast.ListItemContainsBlock

+               case p.extensions&FencedCode != 0 && indent < 4:
+                       // start of fenced code block (although we only check the first line
+                       // and thus ends the list
+                       end, _ := isFenceLine(chunk, nil, "")
+                       if end > 0 {
+                               *flags |= ast.ListItemEndOfList
+                               break gatherlines
+                       }
+                       *flags |= ast.ListItemContainsBlock
+
                // anything following an empty line is only part
                // of this item if it is indented 4 spaces
                // (regardless of the indentation of the beginning of the item)
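
To make the idea behind the patch concrete, here is a minimal stdlib-only sketch of a first-line fence check. The names `fenceStart`, `maxIndent`, and `backtickFence` are hypothetical and chosen for illustration; this is not the library's `isFenceLine`, whose actual behavior may differ. Like the patch, it only inspects the opening line, so it cannot tell whether the fence is ever closed.

```go
package main

import (
	"fmt"
	"strings"
)

// backtickFence is three backticks, built at runtime for readability.
var backtickFence = strings.Repeat("`", 3)

// fenceStart is a simplified, hypothetical stand-in for a fence check.
// It reports whether line has fewer than maxIndent leading spaces and
// then starts with at least three backticks or three tildes.
func fenceStart(line string, maxIndent int) bool {
	trimmed := strings.TrimLeft(line, " ")
	indent := len(line) - len(trimmed)
	if indent >= maxIndent || len(trimmed) < 3 {
		return false
	}
	marker := trimmed[0]
	if marker != '`' && marker != '~' {
		return false
	}
	n := 0
	for n < len(trimmed) && trimmed[n] == marker {
		n++
	}
	return n >= 3
}

func main() {
	lines := []string{backtickFence + "yaml", "   ~~~", "    ~~~", "- Deployment"}
	for _, line := range lines {
		fmt.Printf("%q -> %v\n", line, fenceStart(line, 4))
	}
}
```

With `maxIndent` set to 4, a fence indented by three spaces still counts as a fence, while one indented by four does not. That is the threshold the surrounding parser code uses for indented content.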

this currently fails the tests with:

--- FAIL: TestDefinitionListWithFencedCodeBlock (0.00s)
    helpers_test.go:59: 
        Input   ["one:\n: def1\n\ntwo:\n: def2\n\n    ~~~\n    code\n    ~~~\n"]
        Expected["<dl>\n<dt>one:</dt>\n<dd><p>def1</p></dd>\n<dt>two:</dt>\n<dd><p>def2</p>\n\n<pre><code>code\n</code></pre></dd>\n</dl>\n"]
        Got     ["<dl>\n<dt>one:</dt>\n<dd><p>def1</p>\n\n<p>two:</p></dd>\n<dd><p>def2</p>\n\n<pre><code>code\n</code></pre></dd>\n</dl>\n"]
        Input:
        one:
        : def1

        two:
        : def2

            ~~~
            code
            ~~~

    Expected:
    <dl>
    <dt>one:</dt>
    <dd><p>def1</p></dd>
    <dt>two:</dt>
    <dd><p>def2</p>

    <pre><code>code
    </code></pre></dd>
    </dl>

    Got:
    <dl>
    <dt>one:</dt>
    <dd><p>def1</p>

    <p>two:</p></dd>
    <dd><p>def2</p>

    <pre><code>code
    </code></pre></dd>
    </dl>

--- FAIL: TestCodeBlock (0.00s)
    helpers_test.go:59: 
        Input   ["1. This is an item\n   ```java\n   int a = 1;\n   ```\n1. This is another item\n"]
        Expected["<ol>\n<li>This is an item\n\n<pre><code class=\"language-java\">\nint a = 1;\n</code></pre>\n</li>\n<li>This is another item</li>\n</ol>\n"]
        Got     ["<ol>\n<li>This is an item</li>\n</ol>\n\n<pre><code class=\"language-java\">   int a = 1;\n</code></pre>\n\n<ol>\n<li>This is another item</li>\n</ol>\n"]
        Input:
        1. This is an item
           ```java
           int a = 1;
           ```
        1. This is another item

    Expected:
    <ol>
    <li>This is an item
    
    <pre><code class="language-java">
    int a = 1;
    </code></pre>
    </li>
    <li>This is another item</li>
    </ol>
    
    Got:
    <ol>
    <li>This is an item</li>
    </ol>
    
    <pre><code class="language-java">   int a = 1;
    </code></pre>
    
    <ol>
    <li>This is another item</li>
    </ol>

FAIL
FAIL    github.com/gomarkdown/markdown  0.014s
?       github.com/gomarkdown/markdown/ast      [no test files]
?       github.com/gomarkdown/markdown/cmd/printast     [no test files]
ok      github.com/gomarkdown/markdown/html     (cached)
ok      github.com/gomarkdown/markdown/md       (cached)
ok      github.com/gomarkdown/markdown/parser   (cached)
FAIL

miekg commented 2 years ago

i'll note that:

1. listitem

``` yaml
kinds:
- Deployment
- Pod
```

does the right thing
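
Since the blank-line variant parses correctly, one possible stopgap is to pre-process the input before handing it to the parser. The sketch below is a hypothetical helper (`separateFences`, `listItem`, and `fenceLine` are names made up for illustration, not part of gomarkdown) that inserts a blank line before an opening fence that directly follows a list item. It assumes every fence line toggles the open/closed state, which ignores fence length and marker matching.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// listItem matches a line that starts an ordered or unordered list item.
var listItem = regexp.MustCompile(`^\s{0,3}(?:[-*+]|\d+[.)])\s+\S`)

// fenceLine matches a line that opens or closes a fenced code block.
var fenceLine = regexp.MustCompile("^\\s{0,3}(?:`{3,}|~{3,})")

// separateFences inserts an empty line before an opening fence whose
// previous line is a list item. Lines inside a fence are left untouched,
// so list-like content in the code (such as a YAML array) is preserved.
func separateFences(src string) string {
	lines := strings.Split(src, "\n")
	out := make([]string, 0, len(lines)+1)
	inFence := false
	for i, line := range lines {
		if fenceLine.MatchString(line) {
			if !inFence && i > 0 && listItem.MatchString(lines[i-1]) {
				out = append(out, "")
			}
			inFence = !inFence // simplification: any fence line toggles state
		}
		out = append(out, line)
	}
	return strings.Join(out, "\n")
}

func main() {
	fence := strings.Repeat("`", 3)
	src := "1. listitem\n" + fence + "yaml\nkinds:\n- Deployment\n- Pod\n" + fence + "\n"
	fmt.Print(separateFences(src))
}
```

Note that this changes the document structure: the code block becomes a sibling that follows the list rather than a child of the list item. That matches the expected output in this issue, but may not be what every document wants.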

miekg commented 2 years ago

after moving the switch case around, only the last test still fails. However, looking at that test, the code block is indented only 3 spaces, not 4, which makes me think the input is not correct?

edit: commonmark thinks it's OK, but I think this lib standardized on an indent of 4? (I may be wrong)
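
For context on why CommonMark accepts the 3-space indent: under its list-item rule, continuation blocks must be indented by the width of the list marker plus the spaces that follow it, and for `1. ` that is 3 columns. The sketch below computes that offset. `contentOffset` and `marker` are hypothetical names, and the regular expression is a simplification that only handles 1 to 4 spaces after the marker.

```go
package main

import (
	"fmt"
	"regexp"
)

// marker captures the leading indent, the list marker, and the spaces after it.
var marker = regexp.MustCompile(`^( {0,3})([-*+]|\d{1,9}[.)])( {1,4})\S`)

// contentOffset returns the column at which a list item's content starts,
// which is the indent a CommonMark-style parser expects for blocks that
// continue the item. It returns -1 if line does not start a list item.
func contentOffset(line string) int {
	m := marker.FindStringSubmatch(line)
	if m == nil {
		return -1
	}
	return len(m[1]) + len(m[2]) + len(m[3])
}

func main() {
	for _, line := range []string{"1. This is an item", "- Pod", "10. tenth item", "kinds:"} {
		fmt.Printf("%q -> %d\n", line, contentOffset(line))
	}
}
```

A parser that always requires 4 spaces treats the failing test input as leaving the list item, while one that uses the computed offset keeps the fenced block nested inside it.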

updated patch fixes things:

diff --git parser/block.go parser/block.go
index eda9be7..cabce91 100644
--- parser/block.go
+++ parser/block.go
@@ -1493,6 +1493,17 @@ gatherlines:
                case containsBlankLine:
                        raw.WriteByte('\n')
                        *flags |= ast.ListItemContainsBlock
+
+               case p.extensions&FencedCode != 0 && indent < 3:
+                       // start of fenced code block (although we only check the first line
+                       // and thus ends the list
+                       end, _ := isFenceLine(chunk, nil, "")
+                       if end > 0 {
+                               *flags |= ast.ListItemEndOfList
+                               break gatherlines
+                       }
+                       *flags |= ast.ListItemContainsBlock
+
                }

                // if this line was preceded by one or more blanks,

but the 3 vs 4 doesn't feel right; maybe the other '< 4' checks in this section of the code are wrong too? @kjk ?

also see #243 which is very similar

ghost commented 1 year ago

@miekg Hi, is there any progress here? I have the same problem and found this issue.

Code blocks break if an ordered or unordered list item sits directly above them without a blank line in between:

Does not work:

    1. With ordered list item above without new line:
    ```html
    <!DOCTYPE html>
    <html>
    <body>

    <h1>My First Heading</h1>

    <p>My first paragraph.</p>

    </body>
    </html>
    ```

----

## Works:
---
1. With ordered list item above and empty line: 

```html
<!DOCTYPE html>
<html>
<body>

<h1>My First Heading</h1>

<p>My first paragraph.</p>

</body>
</html>
```

miekg commented 1 year ago

I'm not doing active development on this

DenisDuev commented 1 year ago

It is now fixed