goccy / go-json

Fast JSON encoder/decoder compatible with encoding/json for Go
MIT License

segmentation fault during decode #421

Open nibbleshift opened 1 year ago

nibbleshift commented 1 year ago

I am getting a segmentation fault when encoding a struct to JSON with json.Marshal (the stack trace below is in the encoder, not the decoder).

0  0x0000000000442a40 in runtime.throw                                                                                                      
    at /usr/local/go/src/runtime/panic.go:1040                                                                                               
 1  0x000000000045998a in runtime.sigpanic                                                                                                   
    at /usr/local/go/src/runtime/signal_unix.go:842                                                                                          
 2  0x000000000084a763 in github.com/goccy/go-json/internal/encoder.appendNormalizedHTMLString                                               
    at /home/steven/go/pkg/mod/github.com/goccy/go-json@v0.10.0/internal/encoder/string.go:49                                                
 3  0x000000000084880e in github.com/goccy/go-json/internal/encoder.AppendString                                                             
    at /home/steven/go/pkg/mod/github.com/goccy/go-json@v0.10.0/internal/encoder/string.go:28                                                
 4  0x0000000000921c50 in github.com/goccy/go-json/internal/encoder/vm.Run                                                                   
    at /home/steven/go/pkg/mod/github.com/goccy/go-json@v0.10.0/internal/encoder/vm/vm.go:112                                                
 5  0x000000000095006d in github.com/goccy/go-json.encodeRunCode                                                                             
    at /home/steven/go/pkg/mod/github.com/goccy/go-json@v0.10.0/encode.go:310                                                                
 6  0x000000000094f3cf in github.com/goccy/go-json.encode                                                                                    
    at /home/steven/go/pkg/mod/github.com/goccy/go-json@v0.10.0/encode.go:235                                                                
 7  0x000000000094e949 in github.com/goccy/go-json.marshal                                                                                   
    at /home/steven/go/pkg/mod/github.com/goccy/go-json@v0.10.0/encode.go:150                                                                                                                                                                                                             
 8  0x0000000000950a25 in github.com/goccy/go-json.MarshalWithOption
    at /home/steven/go/pkg/mod/github.com/goccy/go-json@v0.10.0/json.go:186
 9  0x000000000095089c in github.com/goccy/go-json.Marshal
    at /home/steven/go/pkg/mod/github.com/goccy/go-json@v0.10.0/json.go:171

I ran the code in a debugger and I get the following:

(dlv) frame 2                                                         
> [runtime-fatal-throw] runtime.throw() /usr/local/go/src/runtime/panic.go:1040 (hits goroutine(1):1 total:1) (PC: 0x442a40)
Warning: debugging optimized function
Frame 2: /home/steven/go/pkg/mod/github.com/goccy/go-json@v0.10.0/internal/encoder/string.go:49 (PC: 84a763)
    44:         var (                                                 
    45:                 i, j int                                      
    46:         )                                                     
    47:         if valLen >= 8 {                                      
    48:                 chunks := stringToUint64Slice(s)
=>  49:                 for _, n := range chunks {
    50:                         // combine masks before checking for the MSB of each byte. We include
    51:                         // `n` in the mask to check whether any of the *input* byte MSBs were
    52:                         // set (i.e. the byte was outside the ASCII range).
    53:                         mask := n | (n - (lsb * 0x20)) |
    54:                                 ((n ^ (lsb * '"')) - lsb) |
(dlv) p chunks                                                        
[]uint64 len: 433936308545930822, cap: 433936308545930822, [(unreadable input/output error),(unreadable input/output error),(unreadable input/output error),(unreadable input/output error),...+433936308545930818 more]
(dlv) p s                                                             
(unreadable could not read string at 0x2d31302d36303032 due to input/output error)
(dlv) p i                                                             
0                                                                     
(dlv) p j                                                             
0                                                                     
(dlv)

nibbleshift commented 1 year ago

Currently working on debugging and making a small example to trigger the issue.
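
One detail in the delve output above may help narrow this down: the address that could not be read for `s` (0x2d31302d36303032) and the huge `len` of `chunks` both look like printable ASCII rather than a real pointer and length. The sketch below decodes both values as little-endian bytes. It assumes that `chunks` holds `len(s)/8` elements, which is not shown in the listing, and `wordToASCII` is a hypothetical helper written for this note.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// wordToASCII renders a 64-bit word as the 8 bytes it occupies in memory on a
// little-endian machine (amd64 here), replacing non-printable bytes with '.'.
func wordToASCII(w uint64) string {
	var buf [8]byte
	binary.LittleEndian.PutUint64(buf[:], w)
	for i, c := range buf {
		if c < 0x20 || c > 0x7e {
			buf[i] = '.'
		}
	}
	return string(buf[:])
}

func main() {
	// Address that delve reported as unreadable for `s`.
	const strPtr uint64 = 0x2d31302d36303032

	// len(chunks) as printed by delve. Assumption: chunks has len(s)/8
	// elements, so multiplying by 8 recovers len(s) up to its lowest 3 bits.
	const chunksLen uint64 = 433936308545930822

	fmt.Printf("ptr word: %q\n", wordToASCII(strPtr))
	fmt.Printf("len word: %q\n", wordToASCII(chunksLen*8))
}
```

This prints `"2006-01-"` for the pointer word and `"022006-0"` for the length word. Read back to back, that is `2006-01-02` followed by `2006-0`, which resembles date layout text. If the assumption holds, the encoder may be treating string data as if it were a string header (pointer plus length), i.e. dereferencing one level too many. This is a reading of the debugger values, not a confirmed diagnosis.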

nibbleshift commented 1 year ago

go version go1.19.5 linux/amd64

go env:

GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/steven/.cache/go-build"
GOENV="/home/steven/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/steven/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/steven/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.19.5"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1605362033=/tmp/go-build -gno-record-gcc-switches"

nibbleshift commented 1 year ago

Object passed to json.Marshal:

github.com/benthosdev/benthos/v4/internal/docs.ComponentSpec {
        Name: "amqp_0_9",
        Type: "input",
        Status: "",
        Plugin: false,
        Summary: "\nConnects to an AMQP (0.91) queue. AMQP is a messaging protocol ...+52 more",
        Description: "\nTLS is automatic when connecting to an `amqps` URL, but custom\n...+673 more",
        Categories: []string len: 1, cap: 1, [
                "Services",
        ],
        Footnotes: "",
        Examples: []github.com/benthosdev/benthos/v4/internal/docs.AnnotatedExample len: 0, cap: 0, nil,
        Config: github.com/benthosdev/benthos/v4/internal/docs.FieldSpec {
                Name: "",
                Type: "object",
                Kind: "scalar",
                Description: "",
                IsAdvanced: false,
                IsDeprecated: false,
                IsOptional: false,
                IsSecret: false,
                Default: *interface {} nil,
                Interpolated: false,
                Bloblang: false,
                Examples: []interface {} len: 0, cap: 0, nil,
                AnnotatedOptions: [][2]string len: 0, cap: 0, nil,
                Options: []string len: 0, cap: 0, nil,
                Children: github.com/benthosdev/benthos/v4/internal/docs.FieldSpecs len: 10, cap: 10, [
                        (*"github.com/benthosdev/benthos/v4/internal/docs.FieldSpec")(0xc000aea000),
                        (*"github.com/benthosdev/benthos/v4/internal/docs.FieldSpec")(0xc000aea0f8),
                        (*"github.com/benthosdev/benthos/v4/internal/docs.FieldSpec")(0xc000aea1f0),
                        (*"github.com/benthosdev/benthos/v4/internal/docs.FieldSpec")(0xc000aea2e8),
                        (*"github.com/benthosdev/benthos/v4/internal/docs.FieldSpec")(0xc000aea3e0),
                        (*"github.com/benthosdev/benthos/v4/internal/docs.FieldSpec")(0xc000aea4d8),
                        (*"github.com/benthosdev/benthos/v4/internal/docs.FieldSpec")(0xc000aea5d0),
                        (*"github.com/benthosdev/benthos/v4/internal/docs.FieldSpec")(0xc000aea6c8),
                        (*"github.com/benthosdev/benthos/v4/internal/docs.FieldSpec")(0xc000aea7c0),
                        (*"github.com/benthosdev/benthos/v4/internal/docs.FieldSpec")(0xc000aea8b8),
                ],
                Version: "",
                Linter: "",
                Scrubber: "",
                omitWhenFn: nil,
                customLintFn: nil,},
        Version: "",}

Struct definition: https://github.com/benthosdev/benthos/blob/main/internal/docs/field.go#L73

nibbleshift commented 1 year ago

(dlv) l
Goroutine 1 frame 4 at /home/steven/src/benthos-upstream/go-json/internal/encoder/vm/vm.go:112 (PC: 0x921c70)
   107:                                 break
   108:                         }
   109:                         store(ctxptr, code.Idx, p)
   110:                         fallthrough
   111:                 case encoder.OpString:
=> 112:                         b = appendString(ctx, b, ptrToString(load(ctxptr, code.Idx)))
   113:                         b = appendComma(ctx, b)
   114:                         code = code.Next
   115:                 case encoder.OpBoolPtr:
   116:                         p := loadNPtr(ctxptr, code.Idx, code.PtrNum)
   117:                         if p == 0 {
(dlv) p string(b)
"{\"name\":\"amqp_0_9\",\"type\":\"input\",\"status\":\"\",\"plugin\":false,\"summary\":\"\\nConnects to an AMQP (0.91) queue. AMQP is a messaging protocol used by various\\nmessage brokers, including RabbitMQ.\",\"description\":\"\\nTLS is automatic when connecting to an `amqps` URL, but custom\\nsettings can be enabled in the `tls` section.\\n\\n### Metadata\\n\\nThis input adds the following metadata fields to each message:\\n\\n``` text\\n- amqp_content_type\\n- amqp_content_encoding\\n- amqp_delivery_mode\\n- amqp_priority\\n- amqp_correlation_id\\n- amqp_reply_to\\n- amqp_expiration\\n- amqp_message_id\\n- amqp_timestamp\\n- amqp_type\\n- amqp_user_id\\n- amqp_app_id\\n- amqp_consumer_tag\\n- amqp_delivery_tag\\n- amqp_redelivered\\n- amqp_exchange\\n- amqp_routing_key\\n- All existing message headers, including nested headers prefixed with the key of their respective parent.\\n```\\n\\nYou can access these metadata fields using\\n[function interpolation](/docs/configuration/interpolation#bloblang-queries).\",\"categories\":[\"Services\"],\"config\":{\"type\":\"object\",\"kind\":\"scalar\",\"children\":[{\"name\":\"urls\",\"type\":\"string\",\"kind\":\"array\",\"description\":\"A list of URLs to connect to. The first URL to successfully establish a connection will be used until the connection is closed. 
If an item of the list contains commas it will be expanded into multiple URLs.\",\"default\":[],\"examples\":[[\"amqp://guest:guest@127.0.0.1:5672/\"],[\"amqp://127.0.0.1:5672/,amqp://127.0.0.2:5672/\"],[\"amqp://127.0.0.1:5672/\",\"amqp://127.0.0.2:5672/\"]],\"version\":\"3.58.0\",\"scrubber\":\"\\nlet pass = this.parse_url().user.password.or(\\\"\\\")\\nroot = if $pass != \\\"\\\" \\u0026\\u0026 !$pass.trim().re_match(\\\"\\\"\\\"^\\\\${[0-9A-Za-z_.]+(:((\\\\${[^}]+})|[^}])+)?}$\\\"\\\"\\\") {\\n  \\\"!!!SECRET_SCRUBBED!!!\\\"\\n}\\n\"},{\"name\":\"queue\",\"type\":\"string\",\"kind\":\"scalar\",\"description\":\"An AMQP queue to consume from.\",\"default\":\"\"},{\"name\":\"queue_declare\",\"type\":\"object\",\"kind\":\"scalar\",\"description\":\"\\nAllows you to passively declare the target queue. If the queue already exists\\nthen the declaration passively verifies that they match the target fields.\",\"is_advanced\":true,\"children\":[{\"name\":\"enabled\",\"type\":\"bool\",\"kind\":\"scalar\",\"description\":\"Whether to enable queue declaration.\",\"is_advanced\":true,\"default\":false},{\"name\":\"durable\",\"type\":\"bool\",\"kind\":\"scalar\",\"description\":\"Whether the declared queue is durable.\",\"is_advanced\":true,\"default\":true},{\"name\":\"auto_delete\",\"type\":\"bool\",\"kind\":\"scalar\",\"description\":\"Whether the declared queue will auto-delete.\",\"is_advanced\":true,\"default\":false}]},{\"name\":\"bindings_declare\",\"type\":\"object\",\"kind\":\"array\",\"description\":\"Allows you to passively declare bindings for the target queue.\",\"is_advanced\":true,\"default\":[],\"examples\":[[{\"exchange\":\"foo\",\"key\":\"bar\"}]],\"annotated_options\":[["

It appears that the fields up to and including Examples were encoded, but encoding then failed on the AnnotatedOptions field.

nibbleshift commented 1 year ago

Value of AnnotatedOptions and call to json.Marshal:

(dlv) l
Goroutine 1 frame 10 at /home/steven/src/benthos-upstream/public/service/config.go:476 (PC: 0x11816c7)
   471: // for general use.
   472: //
   473: // Experimental: This method is not intended for general use and could have its
   474: // signature and/or behaviour changed outside of major version bumps.
   475: func (c *ConfigView) FormatJSON() ([]byte, error) {
=> 476:         return json.Marshal(c.component)
   477: }
   478:
   479: // RenderDocs creates a markdown file that documents the configuration of the
   480: // component config view. This markdown may include Docusaurus react elements as
   481: // it matches the documentation generated for the official Benthos website.
(dlv) p c.component.Config.AnnotatedOptions
[][2]string len: 0, cap: 0, nil
(dlv)

nibbleshift commented 1 year ago

I see that the call stack shows:

(dlv) frame 4
> [runtime-fatal-throw] runtime.throw() /usr/local/go/src/runtime/panic.go:1040 (hits goroutine(1):1 total:1) (PC: 0x442a40)
Warning: debugging optimized function
Frame 4: ./go-json/internal/encoder/vm/vm.go:112 (PC: 921c70)
   107:                                 break
   108:                         }
   109:                         store(ctxptr, code.Idx, p)
   110:                         fallthrough
   111:                 case encoder.OpString:
=> 112:                         b = appendString(ctx, b, ptrToString(load(ctxptr, code.Idx)))
   113:                         b = appendComma(ctx, b)
   114:                         code = code.Next
   115:                 case encoder.OpBoolPtr:
   116:                         p := loadNPtr(ctxptr, code.Idx, code.PtrNum)
   117:                         if p == 0 {

Should this be handled as OpArray?
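
For comparison, here is what the standard library produces for a field of type `[][2]string`. This is only a baseline sketch: `fieldSpec` is a hypothetical, cut-down stand-in for the Benthos `FieldSpec`, and the `annotated_options,omitempty` tag is an assumption inferred from the key name in the partial output above. It uses `encoding/json`, not go-json.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fieldSpec is a hypothetical, reduced stand-in for docs.FieldSpec.
// The JSON tags are assumptions based on the keys seen in the partial output.
type fieldSpec struct {
	Name             string      `json:"name"`
	AnnotatedOptions [][2]string `json:"annotated_options,omitempty"`
}

// marshal encodes v with the standard library and returns the JSON text.
func marshal(v fieldSpec) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// Nil slice: with omitempty the key is dropped entirely.
	fmt.Println(marshal(fieldSpec{Name: "empty"}))

	// One [2]string element: encoded as a nested array of two strings.
	fmt.Println(marshal(fieldSpec{
		Name:             "set",
		AnnotatedOptions: [][2]string{{"foo", "bar"}},
	}))
}
```

With `encoding/json` this prints `{"name":"empty"}` and `{"name":"set","annotated_options":[["foo","bar"]]}`. Note that the buffer in the crash ends with `"annotated_options":[[` inside the `bindings_declare` child, while the value inspected in the debugger is the top-level `c.component.Config.AnnotatedOptions`, which is nil. So the crash seems to happen on the first string element of a nested `FieldSpec`, where either the field is genuinely non-empty or the encoder is reading the wrong memory.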