gomarkdown / markdown

markdown parser and HTML renderer for Go

A list followed by a header is parsed (or rendered?) the wrong way #283

Open alx-ef opened 1 year ago

alx-ef commented 1 year ago

When parsing, it expects an empty line between a list and the header that follows it. This input is parsed correctly:

        # Header1
        * list item

        # Header2

This one is parsed incorrectly (the second header is interpreted as part of the list item):

        # Header1
        * list item
        # Header2

But when rendering, the renderer doesn't emit this blank line.

So, say you have an AST like this:

Heading
  Text 'Header1'
List 'tight flags=start'
  ListItem 'flags=start end'
    Paragraph
      Text 'list item'
Heading
  Text 'Header2'

If you render it and parse the result, the AST will be different:

Heading
  Text 'Header1'
List 'flags=start'
  ListItem 'flags=has_block start'
    Paragraph
      Text 'list item'
    Heading
      Text 'Header2'

Here is the test:

package md

import (
    "testing"

    "github.com/gomarkdown/markdown"
    "github.com/gomarkdown/markdown/md"
    "github.com/gomarkdown/markdown/parser"
    "github.com/stretchr/testify/require"
)

// Test_mdHeaderRendering checks that the document structure survives a
// parse -> render -> parse round trip.
func Test_mdHeaderRendering(t *testing.T) {
    r := require.New(t)
    inMd := `# Header1
* list item

# Header2
`
    // Parse the original Markdown, then render the AST back to Markdown.
    p := parser.NewWithExtensions(parser.CommonExtensions)
    doc1 := p.Parse([]byte(inMd))
    outMd := string(markdown.Render(doc1, md.NewRenderer()))

    t.Log("original MD:\n", inMd)
    t.Log("MD after serialization/deserialization:\n", outMd)

    // Re-parse the rendered Markdown; both documents should have the same
    // number of top-level nodes.
    p = parser.NewWithExtensions(parser.CommonExtensions)
    doc2 := p.Parse([]byte(outMd))
    r.Equal(len(doc1.GetChildren()), len(doc2.GetChildren()))
}

I don't know whether the bug is in the renderer or in the parser, but this behavior is inconsistent.
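
Until the round trip is consistent, one possible workaround is to post-process the rendered Markdown and put the blank line back before each header. Below is a minimal sketch of that idea, not a fix for the library itself. `isATXHeading` and `ensureBlankBeforeHeadings` are hypothetical helper names that are not part of gomarkdown. The sketch uses only the standard library, recognizes only `#`-style headers, and does not skip fenced code blocks, so a `# comment` line inside a code block would also get a blank line in front of it.

```go
package main

import (
	"fmt"
	"strings"
)

// isATXHeading reports whether line looks like an ATX heading ("# Title").
// Simplified check: at most three leading spaces, one to six '#' characters,
// then either a space or the end of the line.
func isATXHeading(line string) bool {
	trimmed := strings.TrimLeft(line, " ")
	if len(line)-len(trimmed) > 3 {
		return false
	}
	n := 0
	for n < len(trimmed) && trimmed[n] == '#' {
		n++
	}
	if n == 0 || n > 6 {
		return false
	}
	return n == len(trimmed) || trimmed[n] == ' '
}

// ensureBlankBeforeHeadings (hypothetical helper) inserts an empty line
// before every ATX heading that directly follows a non-empty line.
func ensureBlankBeforeHeadings(src string) string {
	lines := strings.Split(src, "\n")
	out := make([]string, 0, len(lines)+4)
	for i, line := range lines {
		if i > 0 && isATXHeading(line) && strings.TrimSpace(lines[i-1]) != "" {
			out = append(out, "")
		}
		out = append(out, line)
	}
	return strings.Join(out, "\n")
}

func main() {
	// Markdown shaped like the renderer output described above:
	// no blank line between the list and the second header.
	rendered := "# Header1\n* list item\n# Header2\n"
	fmt.Print(ensureBlankBeforeHeadings(rendered))
}
```

In the test above, this would be applied to `outMd` before the second `p.Parse` call. Whether the re-parsed AST then matches the original exactly (for example the `tight` flag on the list) is something I have not verified.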

vsysoev commented 1 year ago

It looks like the list is parsed correctly only when it starts a new paragraph. Otherwise, the list is not parsed correctly.