goccy / go-yaml

YAML support for the Go language
MIT License
1.12k stars 129 forks source link

Performance: store path in stack to avoid copying context #438

Open yoelsusanto opened 5 months ago

yoelsusanto commented 5 months ago

Fixes #325

This PR is created to dramatically improve performance for go-yaml library especially when parsing larger YAML files.

This is the methodology that I used to arrive at the current solution:

  1. I use go test tool to create a CPU profile when doing parser.ParseBytes.
  2. When I analyzed the CPUProfile, I found that withChild and withIndex are the two functions that dominated the CPU utilization.
  3. Reading the source code, I found that both functions call parser.copy. This operation is performed repeatedly when parsing the YAML tokens.
  4. Taking a deeper look, the essence of this copy operation is to pass the correct node path when parsing child node.
  5. We can achieve the same result by managing a stack. We can push when parsing child node and pop after we finish.
  6. This way, we improve performance by avoiding excessive memory allocation caused by copying the context.tokens.

I realized the implementation can be refactored further to adjust the way of using context, but I aim for minimum change in this PR. Please let me know if you have any suggestions 🙏 . Thank you!

Checklist:

codecov-commenter commented 5 months ago

Codecov Report

Merging #438 (2018c1a) into master (4653a1b) will decrease coverage by 0.09%. The diff coverage is 96.96%.

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #438 +/- ## ========================================== - Coverage 76.02% 75.94% -0.09% ========================================== Files 13 13 Lines 4692 4701 +9 ========================================== + Hits 3567 3570 +3 - Misses 866 870 +4 - Partials 259 261 +2 ```
yoelsusanto commented 5 months ago

Benchmark

package main

import (
    "bytes"
    "fmt"
    "testing"

    "github.com/goccy/go-yaml"
    goyaml2 "gopkg.in/yaml.v2"
    goyaml3 "gopkg.in/yaml.v3"
)

func prepareSampleData(line int) []byte {
    var data bytes.Buffer

    for i := 0; i < line/5; i++ {
        data.WriteString(`- propertyA: valueA
  propertyB: 2
  propertyC: test
  propertyD: example
  propertyE: example2
`)
    }

    return data.Bytes()
}

type sampleData struct {
    PropertyA string `yaml:"propertyA"`
    PropertyB string `yaml:"propertyB"`
    PropertyC string `yaml:"propertyC"`
    PropertyD string `yaml:"propertyD"`
    PropertyE string `yaml:"propertyE"`
}

func BenchmarkGoccyYAML(b *testing.B) {
    lines := []int{100, 1000, 10000}

    for _, line := range lines {
        sample := prepareSampleData(line)

        b.Run(fmt.Sprintf("%d lines", line), func(b *testing.B) {
            b.Run("github.com/goccy/go-yaml", func(b *testing.B) {
                var datas []sampleData

                for i := 0; i < b.N; i++ {
                    err := yaml.Unmarshal(sample, &datas)
                    if err != nil {
                        b.Fatal(err)
                    }
                }
            })

            b.Run("gopkg.in/yaml.v2", func(b *testing.B) {
                var datas []sampleData

                for i := 0; i < b.N; i++ {
                    err := goyaml2.Unmarshal(sample, &datas)
                    if err != nil {
                        b.Fatal(err)
                    }
                }
            })

            b.Run("gopkg.in/yaml.v3", func(b *testing.B) {
                var datas []sampleData

                for i := 0; i < b.N; i++ {
                    err := goyaml3.Unmarshal(sample, &datas)
                    if err != nil {
                        b.Fatal(err)
                    }
                }
            })
        })
    }
}

To compare before and after the fix, I have to use replace github.com/goccy/go-yaml => github.com/yoelsusanto/go-yaml v0.0.0-20240324162521-2018c1ab915b directive.

Result:

goos: darwin
goarch: arm64
pkg: github.com/yoelsusanto/benchmark-go-yaml
                                                  │ original.txt  │             fixed.txt              │
                                                  │    sec/op     │   sec/op     vs base               │
GoccyYAML/100_lines/github.com/goccy/go-yaml-12       411.7µ ± 1%   215.4µ ± 7%  -47.68% (p=0.002 n=6)
GoccyYAML/100_lines/gopkg.in/yaml.v2-12               69.28µ ± 0%   69.21µ ± 1%        ~ (p=0.240 n=6)
GoccyYAML/100_lines/gopkg.in/yaml.v3-12               91.12µ ± 0%   92.08µ ± 1%   +1.05% (p=0.004 n=6)
GoccyYAML/1000_lines/github.com/goccy/go-yaml-12     29.976m ± 3%   2.328m ± 1%  -92.23% (p=0.002 n=6)
GoccyYAML/1000_lines/gopkg.in/yaml.v2-12              697.8µ ± 0%   691.1µ ± 0%   -0.95% (p=0.002 n=6)
GoccyYAML/1000_lines/gopkg.in/yaml.v3-12              910.7µ ± 1%   919.5µ ± 1%        ~ (p=0.065 n=6)
GoccyYAML/10000_lines/github.com/goccy/go-yaml-12   3541.11m ± 1%   24.41m ± 1%  -99.31% (p=0.002 n=6)
GoccyYAML/10000_lines/gopkg.in/yaml.v2-12             7.591m ± 1%   7.596m ± 1%        ~ (p=0.937 n=6)
GoccyYAML/10000_lines/gopkg.in/yaml.v3-12             9.754m ± 1%   9.882m ± 1%   +1.31% (p=0.002 n=6)
geomean                                               2.864m        1.157m       -59.60%

                                                  │  original.txt   │               fixed.txt               │
                                                  │      B/op       │     B/op      vs base                 │
GoccyYAML/100_lines/github.com/goccy/go-yaml-12       1585.8Ki ± 0%   187.2Ki ± 0%  -88.19% (p=0.002 n=6)
GoccyYAML/100_lines/gopkg.in/yaml.v2-12                49.48Ki ± 0%   49.48Ki ± 0%        ~ (p=1.000 n=6) ¹
GoccyYAML/100_lines/gopkg.in/yaml.v3-12                64.36Ki ± 0%   64.36Ki ± 0%        ~ (p=1.000 n=6) ¹
GoccyYAML/1000_lines/github.com/goccy/go-yaml-12     137.455Mi ± 0%   1.828Mi ± 0%  -98.67% (p=0.002 n=6)
GoccyYAML/1000_lines/gopkg.in/yaml.v2-12               449.7Ki ± 0%   449.7Ki ± 0%        ~ (p=0.364 n=6)
GoccyYAML/1000_lines/gopkg.in/yaml.v3-12               585.1Ki ± 0%   585.1Ki ± 0%        ~ (p=0.545 n=6)
GoccyYAML/10000_lines/github.com/goccy/go-yaml-12   13025.59Mi ± 0%   19.00Mi ± 0%  -99.85% (p=0.002 n=6)
GoccyYAML/10000_lines/gopkg.in/yaml.v2-12              4.369Mi ± 0%   4.369Mi ± 0%        ~ (p=0.069 n=6)
GoccyYAML/10000_lines/gopkg.in/yaml.v3-12              5.677Mi ± 0%   5.677Mi ± 0%        ~ (p=0.498 n=6)
geomean                                                3.345Mi        809.1Ki       -76.38%
¹ all samples are equal

                                                  │ original.txt │              fixed.txt               │
                                                  │  allocs/op   │  allocs/op   vs base                 │
GoccyYAML/100_lines/github.com/goccy/go-yaml-12      6.381k ± 0%   5.338k ± 0%  -16.35% (p=0.002 n=6)
GoccyYAML/100_lines/gopkg.in/yaml.v2-12              1.037k ± 0%   1.037k ± 0%        ~ (p=1.000 n=6) ¹
GoccyYAML/100_lines/gopkg.in/yaml.v3-12              1.258k ± 0%   1.258k ± 0%        ~ (p=1.000 n=6) ¹
GoccyYAML/1000_lines/github.com/goccy/go-yaml-12     63.73k ± 0%   52.88k ± 0%  -17.02% (p=0.002 n=6)
GoccyYAML/1000_lines/gopkg.in/yaml.v2-12             10.04k ± 0%   10.04k ± 0%        ~ (p=1.000 n=6) ¹
GoccyYAML/1000_lines/gopkg.in/yaml.v3-12             12.24k ± 0%   12.24k ± 0%        ~ (p=1.000 n=6) ¹
GoccyYAML/10000_lines/github.com/goccy/go-yaml-12    649.2k ± 0%   529.9k ± 0%  -18.38% (p=0.002 n=6)
GoccyYAML/10000_lines/gopkg.in/yaml.v2-12            100.0k ± 0%   100.0k ± 0%        ~ (p=1.000 n=6) ¹
GoccyYAML/10000_lines/gopkg.in/yaml.v3-12            122.0k ± 0%   122.0k ± 0%        ~ (p=1.000 n=6) ¹
geomean                                              20.02k        18.80k        -6.12%
¹ all samples are equal

From the results, we can see:

Let me know if you need anything else 🙏

mcdee commented 5 months ago

This looks like a good change to me. Comparing with 1.11.3 using a complex ~6MB sample:

goos: linux
goarch: amd64
pkg: github.com/wealdtech/yamltest
cpu: 12th Gen Intel(R) Core(TM) i7-1265U
                 │     1.11.3     │              proposed              │
                 │     sec/op     │   sec/op     vs base               │
Decode/Decode-12   174099.6m ± 1%   798.8m ± 2%  -99.54% (p=0.002 n=6)

                 │     1.11.3      │              proposed               │
                 │      B/op       │     B/op      vs base               │
Decode/Decode-12   257761.9Mi ± 0%   494.6Mi ± 0%  -99.81% (p=0.002 n=6)

                 │   1.11.3    │             proposed              │
                 │  allocs/op  │  allocs/op   vs base              │
Decode/Decode-12   4.989M ± 0%   4.590M ± 0%  -8.00% (p=0.002 n=6)

And it looks okay compared to the last good version, 1.9.2:

goos: linux
goarch: amd64
pkg: github.com/wealdtech/yamltest
cpu: 12th Gen Intel(R) Core(TM) i7-1265U
                 │    1.9.2    │             proposed              │
                 │   sec/op    │   sec/op     vs base              │
Decode/Decode-12   750.5m ± 1%   798.8m ± 2%  +6.43% (p=0.002 n=6)

                 │    1.9.2     │              proposed              │
                 │     B/op     │     B/op      vs base              │
Decode/Decode-12   486.8Mi ± 0%   494.6Mi ± 0%  +1.60% (p=0.002 n=6)

                 │    1.9.2    │             proposed              │
                 │  allocs/op  │  allocs/op   vs base              │
Decode/Decode-12   4.181M ± 0%   4.590M ± 0%  +9.77% (p=0.002 n=6)
fzipi commented 5 months ago

Looks good to us also. See benchmarks in the related issue.

yoelsusanto commented 5 months ago

@goccy would you mind to have a look on the PR? 🙏 Thank you!