gomarkdown / markdown

markdown parser and HTML renderer for Go
Other
1.38k stars 172 forks source link

A benchmark for Parse fails with panic: runtime error: invalid memory address or nil pointer dereference #229

Closed malisetti closed 2 years ago

malisetti commented 2 years ago

This benchmark fails with following errors

Running tool: /home/malisetti/go/bin/go test -benchmem -run=^$ -bench ^BenchmarkRender$ github.com/quii/learn-go-with-tests/blogrenderer

goos: linux
goarch: amd64
pkg: github.com/quii/learn-go-with-tests/blogrenderer
cpu: 11th Gen Intel(R) Core(TM) i5-1145G7 @ 2.60GHz
BenchmarkRender-8       panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x5e130f]

goroutine 38 [running]:
github.com/gomarkdown/markdown/parser.(*Parser).finalize(...)
    /home/malisetti/go/pkg/mod/github.com/gomarkdown/markdown@v0.0.0-20211212230626-5af6ad2f47df/parser/parser.go:206
github.com/gomarkdown/markdown/parser.(*Parser).addChild(0xc00014e000, {0x6bd870?, 0xc00009e180})
    /home/malisetti/go/pkg/mod/github.com/gomarkdown/markdown@v0.0.0-20211212230626-5af6ad2f47df/parser/parser.go:211 +0x3af
github.com/gomarkdown/markdown/parser.(*Parser).addBlock(0xc00014e000, {0x6bd870, 0xc00009e180})
    /home/malisetti/go/pkg/mod/github.com/gomarkdown/markdown@v0.0.0-20211212230626-5af6ad2f47df/parser/block.go:392 +0xef
github.com/gomarkdown/markdown/parser.(*Parser).renderParagraph(0xc0000142c0?, {0xc0000142c0, 0x10?, 0x10})
    /home/malisetti/go/pkg/mod/github.com/gomarkdown/markdown@v0.0.0-20211212230626-5af6ad2f47df/parser/block.go:1548 +0x10a
github.com/gomarkdown/markdown/parser.(*Parser).paragraph(0xc00014e000, {0xc0000142c0, 0xe, 0x10})
    /home/malisetti/go/pkg/mod/github.com/gomarkdown/markdown@v0.0.0-20211212230626-5af6ad2f47df/parser/block.go:1711 +0x53f
github.com/gomarkdown/markdown/parser.(*Parser).block(0xc00014e000, {0xc0000142c0?, 0xc000143ad8?, 0x5aa42e?})
    /home/malisetti/go/pkg/mod/github.com/gomarkdown/markdown@v0.0.0-20211212230626-5af6ad2f47df/parser/block.go:373 +0xf3c
github.com/gomarkdown/markdown/parser.(*Parser).Parse(0xc00014e000, {0xc0000142c0?, 0x63f640?, 0xc00009e0c0?})
    /home/malisetti/go/pkg/mod/github.com/gomarkdown/markdown@v0.0.0-20211212230626-5af6ad2f47df/parser/parser.go:272 +0x3e
github.com/gomarkdown/markdown.Parse({0xc0000142c0?, 0x0?, 0x99?}, 0x100000063f640?)
    /home/malisetti/go/pkg/mod/github.com/gomarkdown/markdown@v0.0.0-20211212230626-5af6ad2f47df/markdown.go:53 +0x65
github.com/gomarkdown/markdown.ToHTML({0xc0000142c0?, 0x66b970?, 0xe?}, 0x66a630?, {0x0, 0x0})
    /home/malisetti/go/pkg/mod/github.com/gomarkdown/markdown@v0.0.0-20211212230626-5af6ad2f47df/markdown.go:77 +0x45
github.com/quii/learn-go-with-tests/blogrenderer.newPostVM(...)
    /home/malisetti/learn-go-with-tests/blogrenderer/renderer.go:53
github.com/quii/learn-go-with-tests/blogrenderer.(*PostRenderer).Render(0xc000060720, {0x6bbcc0, 0x859a80}, {{0x66a764, 0xb}, {0x66ea22, 0x15}, {0x66b970, 0xe}, {0xc00006a000, ...}})
    /home/malisetti/learn-go-with-tests/blogrenderer/renderer.go:38 +0x125
github.com/quii/learn-go-with-tests/blogrenderer_test.BenchmarkRender(0xc000132000)
    /home/malisetti/learn-go-with-tests/blogrenderer/renderer_test.go:68 +0x1b3
testing.(*B).runN(0xc000132000, 0x64)
    /home/malisetti/go/src/testing/benchmark.go:193 +0x102
testing.(*B).launch(0xc000132000)
    /home/malisetti/go/src/testing/benchmark.go:334 +0x1c5
created by testing.(*B).doBench
    /home/malisetti/go/src/testing/benchmark.go:284 +0x6c
exit status 2
FAIL    github.com/quii/learn-go-with-tests/blogrenderer    0.008s
FAIL
onionltd commented 2 years ago

Can confirm that the issue is reproducible by running the following tests. It works when the default parser is used instead.

To run the tests:

$ EMAIL_SMTP_PORT=a EMAIL_SMTP_HOST=a BASE_URL=http://example.com/ EMAIL_NOREPLY=aa DATABASE_URL=aa JWT_SECRET=fff \
go test -run FullMarkdown
miekg commented 2 years ago

which snippet from that tests causes the crash?

onionltd commented 2 years ago

which snippet from that tests causes the crash?

The input is not an issue. It's always the first call to ToHTML that fails.

miekg commented 2 years ago

ok, then to rephrase, what is the most minimal reproduction you can get that fails? Pref. not using any your wrappers

onionltd commented 2 years ago

https://go.dev/play/p/6M6Y-OtSnrh

package main

import (
    "fmt"
    "html"

    "github.com/gomarkdown/markdown"
    htmlrenderer "github.com/gomarkdown/markdown/html"

    mdparser "github.com/gomarkdown/markdown/parser"
)

var inputs = map[string]string{
    "# Hello World":                      `<h1>Hello World</h1>`,
    "Hello <b>Beautiful</b> World":       `<p>Hello &lt;b&gt;Beautiful&lt;/b&gt; World</p>`,
    "![](http://example.com/hello.jpg)":  `<p><img src="http://example.com/hello.jpg" alt="" /></p>`,
    "Go to http://example.com/hello.jpg": `<p>Go to <a href="http://example.com/hello.jpg">http://example.com/hello.jpg</a></p>`,
    `-123
        -456
        -789`: `<p>-123<br />
        -456<br />
        -789</p>`,
    `
        - **Option 1**
        - *Option 2*
        - ~~Option 3~~`: `<ul>
        <li><strong>Option 1</strong><br /></li>
        <li><em>Option 2</em><br /></li>
        <li><del>Option 3</del><br /></li>
        </ul>`,
    `Please add:
        – SEND_SMS
        – RECEIVE_SMS
        – READ_PHONE_STATE
        This will allow to send and receive SMS and get the IMEI No. in our app.

        Thanks!`: `<p>Please add:<br />
        – SEND_SMS<br />
        – RECEIVE_SMS<br />
        – READ_PHONE_STATE<br />
        This will allow to send and receive SMS and get the IMEI No. in our app.</p>

        <p>Thanks!</p>`,
}

func main() {
    mdExtns := 0 |
        mdparser.Tables |
        mdparser.Autolink |
        mdparser.FencedCode |
        mdparser.Titleblock |
        mdparser.Strikethrough |
        mdparser.DefinitionLists |
        mdparser.NoIntraEmphasis |
        mdparser.HardLineBreak

    parser := mdparser.NewWithExtensions(mdExtns)

    htmlFlags := 0 |
        htmlrenderer.UseXHTML |
        htmlrenderer.Smartypants |
        htmlrenderer.SmartypantsFractions |
        htmlrenderer.SmartypantsDashes |
        htmlrenderer.SmartypantsLatexDashes

    fullRenderer := htmlrenderer.NewRenderer(htmlrenderer.RendererOptions{Flags: htmlFlags})

    for input, _ := range inputs {
        fmt.Printf("%q\n", input)
        sanitizedInput := html.EscapeString(input)
        markdown.ToHTML([]byte(sanitizedInput), parser, fullRenderer)
    }
}
miekg commented 2 years ago

thanks. You're reusing the parser the creates the crash.

    for input, _ := range inputs {
        fmt.Printf("%q\n", input)
        sanitizedInput := gohtml.EscapeString(input)
        parser := parser.NewWithExtensions(mdExtns)
        ToHTML([]byte(sanitizedInput), parser, fullRenderer)
    }

makes the crash go away.

this should be documented, but unsure

onionltd commented 2 years ago

Why a renderer can be reused but not a parser? That looks like a bug.

kjk commented 2 years ago

We could reset the state but no one did the work and there's almost zero benefit over creating a new parser.

I have no plans on working on that.