Open fzipi opened 1 year ago
It appears that this is caused by significant memory over-allocation. Parsing a 7MB YAML file with 1.9.3 I see the following information from a memory profile:
(pprof) top20
Showing nodes accounting for 255598.68MB, 99.82% of 256064.71MB total
Dropped 134 nodes (cum <= 1280.32MB)
flat flat% sum% cum cum%
255563.95MB 99.80% 99.80% 255563.95MB 99.80% github.com/goccy/go-yaml/parser.(*context).copy (inline)
16.65MB 0.0065% 99.81% 254722.96MB 99.48% github.com/goccy/go-yaml/parser.(*parser).parseSequence
11.59MB 0.0045% 99.82% 46222.01MB 18.05% github.com/goccy/go-yaml/parser.ParseBytes
6.50MB 0.0025% 99.82% 254737.01MB 99.48% github.com/goccy/go-yaml/parser.(*context).withIndex
0 0% 99.82% 23462.88MB 9.16% github.com/goccy/go-yaml.(*Decoder).DecodeContext
0 0% 99.82% 23391.25MB 9.13% github.com/goccy/go-yaml.(*Decoder).decodeInit
0 0% 99.82% 46221.83MB 18.05% github.com/goccy/go-yaml.(*Decoder).parse
0 0% 99.82% 23456.83MB 9.16% github.com/goccy/go-yaml.UnmarshalContext
0 0% 99.82% 255459.66MB 99.76% github.com/goccy/go-yaml/parser.(*parser).createMapValueNode
0 0% 99.82% 255607.10MB 99.82% github.com/goccy/go-yaml/parser.(*parser).createNodeFromToken
0 0% 99.82% 45935.89MB 17.94% github.com/goccy/go-yaml/parser.(*parser).parse
0 0% 99.82% 255459.66MB 99.76% github.com/goccy/go-yaml/parser.(*parser).parseMapValue
0 0% 99.82% 255607.10MB 99.82% github.com/goccy/go-yaml/parser.(*parser).parseMappingValue
0 0% 99.82% 255607.10MB 99.82% github.com/goccy/go-yaml/parser.(*parser).parseToken
0 0% 99.82% 45905.61MB 17.93% github.com/goccy/go-yaml/parser.Parse
i.e. over 250GB of allocations. Bisecting between 1.9.2 and 1.9.3 provides the commit that introduced this issue: https://github.com/goccy/go-yaml/commit/afed407
At a first glance, that was my suspicious commit also.
Is there any news on this? Shall we test with latest?
Looks like with 1.11.0 this is not a problem anymore. I don't know about memory usage.
❯ time ./go-ftw check -d ../coreruleset/tests
ftw/check: checked 285 files, everything looks good!
./go-ftw check -d ../coreruleset/tests 0.40s user 0.08s system 66% cpu 0.723 total
Well, after some additional testing by the team, this hasn't gone away yet. Don't know why it was working before, but this is a really hitting us.
❯ go mod tidy
go: downloading github.com/goccy/go-yaml v1.9.1
❯ go build
❯ time ./go-ftw check -d ../coreruleset/tests
ftw/check: checked 290 files, everything looks good!
./go-ftw check -d ../coreruleset/tests 0.27s user 0.04s system 52% cpu 0.587 total
❯ vim go.mod
❯ go mod tidy
❯ go build
❯ time ./go-ftw check -d ../coreruleset/tests
ftw/check: checked 290 files, everything looks good!
./go-ftw check -d ../coreruleset/tests 14.39s user 0.62s system 347% cpu 4.317 total
I've tried most of the versions up to latest 1.11.2, and this is taking ages to process.
@goccy Can you take a look at this one? Seems like this performance penalty is just preventing us to upgrade to latest and greatest.
ping
@fzipi @mcdee @theseion
I created a PR to fix some of the memory allocation issue. This in turn also allows the parser to run faster. Can you test my change and see if it helps with your situation?
I use the replace directive when doing my test:
replace github.com/goccy/go-yaml => github.com/yoelsusanto/go-yaml v0.0.0-20240324162521-2018c1ab915b
I ran a few benchmarks using testing.Benchmark
.
v1.9.2:
1000000000 0.1685 ns/op 0 B/op 0 allocs/op
PASS
ok github.com/coreruleset/go-ftw/cmd 2.196s
v1.9.3
1 4119779167 ns/op 17488211488 B/op 4407338 allocs/op
PASS
ok github.com/coreruleset/go-ftw/cmd 4.289s
v1.11.3:
1 4094335833 ns/op 17486555776 B/op 4408680 allocs/op
PASS
ok github.com/coreruleset/go-ftw/cmd 4.281s
v1.11.3-patched:
1000000000 0.2176 ns/op 0 B/op 0 allocs/op
PASS
ok github.com/coreruleset/go-ftw/cmd 3.325s
As you can see, the patch clearly has the desired effect, even if time per operation isn't at the level of v1.9.2. For us that is good enough. Thanks @yoelsusanto!
We found out that after upgrading to 1.9.6 there was a big hit in reading ~~ 270 files. After doing some version comparisions, the problem seems to be introduced in v1.9.3.
See https://github.com/coreruleset/go-ftw/issues/115. Right now we downgraded to pre 1.9.3, but probably we want to check out what the problem is and write some benchmarks.