cue-lang / cue

The home of the CUE language! Validate and define text-based and dynamic configuration
https://cuelang.org
Apache License 2.0
5.01k stars 283 forks source link

evaluator: use of a reference incurs a substantial performance hit #2803

Open marxarelli opened 7 months ago

marxarelli commented 7 months ago

What version of CUE are you using (cue version)?

$ grep cuelang go.mod
require cuelang.org/go v0.7.1-0.20240209203336-3d1c0cb3f5e0

Does this issue reproduce with the latest stable release?

Yes.

What did you do?

I introduced a simple field reference into my CUE values and experienced a substantial performance hit (~20 fold). I've narrowed down the case to the following common schema and differing values (I left enough of my original schema to illustrate the difference). These two benchmarks illustrate the strange performance penalty incurred by the use of a field reference.

-- go.mod --
module cue.example/reference-performance

go 1.21

require cuelang.org/go v0.7.1-0.20240209203336-3d1c0cb3f5e0

require (
    github.com/cockroachdb/apd/v3 v3.2.1 // indirect
    github.com/google/uuid v1.2.0 // indirect
    github.com/mpvl/unique v0.0.0-20150818121801-cbe035fff7de // indirect
    golang.org/x/net v0.19.0 // indirect
    golang.org/x/text v0.14.0 // indirect
    gopkg.in/yaml.v3 v3.0.1 // indirect
)
-- go.sum --
cuelang.org/go v0.7.1-0.20240209203336-3d1c0cb3f5e0 h1:WT1b9NFC7omcyCVuAsLz6JiOkpM+nkcApS8YCkWcgMM=
cuelang.org/go v0.7.1-0.20240209203336-3d1c0cb3f5e0/go.mod h1:yBUgoS5sUpd10S4FtLFiCagmQALmMhm5GLpYngbK4zk=
github.com/cockroachdb/apd/v3 v3.2.1 h1:U+8j7t0axsIgvQUqthuNm82HIrYXodOV2iWLWtEaIwg=
github.com/cockroachdb/apd/v3 v3.2.1/go.mod h1:klXJcjp+FffLTHlhIG69tezTDvdP065naDsHzKhYSqc=
github.com/emicklei/proto v1.10.0 h1:pDGyFRVV5RvV+nkBK9iy3q67FBy9Xa7vwrOTE+g5aGw=
github.com/emicklei/proto v1.10.0/go.mod h1:rn1FgRS/FANiZdD2djyH7TMA9jdRDcYQ9IEN9yvjX0A=
github.com/go-quicktest/qt v1.101.0 h1:O1K29Txy5P2OK0dGo59b7b0LR6wKfIhttaAhHUyn7eI=
github.com/go-quicktest/qt v1.101.0/go.mod h1:14Bz/f7NwaXPtdYEgzsx46kqSxVwTbzVZsDC26tQJow=
github.com/google/go-cmp v0.6.0 h1:ofyhxvXcZhMsU5ulbFiLKl/XBFqE1GSq7atu8tAmTRI=
github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/google/uuid v1.2.0 h1:qJYtXnJRWmpe7m/3XlyhrsLrEURqHRM2kxzoxXqyUDs=
github.com/google/uuid v1.2.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE=
github.com/kr/pretty v0.3.1/go.mod h1:hoEshYVHaxMs3cyo3Yncou5ZscifuDolrwPKZanG3xk=
github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
github.com/lib/pq v1.10.7 h1:p7ZhMD+KsSRozJr34udlUrhboJwWAgCg34+/ZZNvZZw=
github.com/lib/pq v1.10.7/go.mod h1:AlVN5x4E4T544tWzH6hKfbfQvm3HdbOxrmggDNAPY9o=
github.com/mitchellh/go-wordwrap v1.0.1 h1:TLuKupo69TCn6TQSyGxwI1EblZZEsQ0vMlAFQflz0v0=
github.com/mitchellh/go-wordwrap v1.0.1/go.mod h1:R62XHJLzvMFRBbcrT7m7WgmE1eOyTSsCt+hzestvNj0=
github.com/mpvl/unique v0.0.0-20150818121801-cbe035fff7de h1:D5x39vF5KCwKQaw+OC9ZPiLVHXz3UFw2+psEX+gYcto=
github.com/mpvl/unique v0.0.0-20150818121801-cbe035fff7de/go.mod h1:kJun4WP5gFuHZgRjZUWWuH1DTxCtxbHDOIJsudS8jzY=
github.com/protocolbuffers/txtpbfmt v0.0.0-20230328191034-3462fbc510c0 h1:sadMIsgmHpEOGbUs6VtHBXRR1OHevnj7hLx9ZcdNGW4=
github.com/protocolbuffers/txtpbfmt v0.0.0-20230328191034-3462fbc510c0/go.mod h1:jgxiZysxFPM+iWKwQwPR+y+Jvo54ARd4EisXxKYpB5c=
github.com/rogpeppe/go-internal v1.12.0 h1:exVL4IDcn6na9z1rAb56Vxr+CgyK3nn3O+epU5NdKM8=
github.com/rogpeppe/go-internal v1.12.0/go.mod h1:E+RYuTGaKKdloAfM02xzb0FW3Paa99yedzYV+kq4uf4=
golang.org/x/mod v0.14.0 h1:dGoOF9QVLYng8IHTm7BAyWqCqSheQ5pYWGhzW00YJr0=
golang.org/x/mod v0.14.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c=
golang.org/x/net v0.19.0 h1:zTwKpTd2XuCqf8huc7Fo2iSy+4RHPd10s4KzeTnVr1c=
golang.org/x/net v0.19.0/go.mod h1:CfAk/cbD4CthTvqiEl8NpboMuiuOYsAr/7NOjZJtv1U=
golang.org/x/text v0.14.0 h1:ScX5w1eTa3QqT8oi6+ziP7dTV1S2+ALU0bI+0zXKWiQ=
golang.org/x/text v0.14.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU=
golang.org/x/tools v0.16.1 h1:TLyB3WofjdOEepBHAU20JdNC1Zbg87elYofWYAY5oZA=
golang.org/x/tools v0.16.1/go.mod h1:kYVVN6I1mBNoB1OX+noeBjbRk4IUEPa7JJ+TJMEooJ0=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127 h1:qIbj1fsPNlZgppZ+VLlY7N33q108Sa+fhmuc+sWQYwY=
gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
-- reference_performance_hit_test.go --
package main

import (
    "testing"

    "cuelang.org/go/cue/cuecontext"
)

const schema = `
import (
    "list"
)

state: {
    #Chain: [#Source, ...#Op]

    #ChainRefPattern: "^[a-zA-Z_-][a-zA-Z0-9_-]*$"

    #ChainRef: =~#ChainRefPattern

    #Chains: {
        [=~#ChainRefPattern]: #Chain
    }

    #SharedKeyHint: {
        sharedKeyHint!: string
    }

    #Op: {
        #Run | #With
    }

    #Option: {
        #WorkingDirectory
    }

    #Run: {
        run!: string
        arguments?: [string, ...string]
        options?: #RunOption | [#RunOption, ...#RunOption]
        if options != _|_ {
            optionsValue: list.FlattenN([options], 1)
        }
    }

    #RunOption: {
        #CacheMount | #SourceMount |
        #TmpFSMount | #ReadOnly |
        #Option
    }

    #CacheMount: {
        cache!: string
        access: *"shared" | "private" | "locked"
    }

    #SourceMount: {
        mount!: string
        from!:   *null | #Chain
        source: string | *"/"
    }

    #TmpFSMount: {
        tmpfs!: string
        size:   uint64
    }

    #ReadOnly: {
        readOnly!: *true | false
    }

    #Scratch: {
        scratch!: true
    }

    #Source: {
        #Scratch
    }

    #State: {
        #Source | #Op
    }

    #With: {
        with!: #Option | [#Option, ...#Option]
        withValue: list.FlattenN([with], 1)
    }

    #WorkingDirectory: {
        directory!: string
    }
}
    `

func BenchmarkWithoutReference(b *testing.B) {
    src := schema + `
chains: state.#Chains
chains: {
    foo: [
        { scratch: true },
        { with: directory: "/src" },
    ]
    bar: [
        { scratch: true },
        { with: directory: "/src" },
    ]
}`

    for i := 0; i < b.N; i++ {
        val := cuecontext.New().CompileString(src)
        err := val.Err()
        if err != nil {
            b.Error(err)
        }
    }
}

func BenchmarkWithReference(b *testing.B) {
    src := schema + `
chains: state.#Chains
chains: {
    foo: [
        { scratch: true },
        { with: directory: "/src" },
    ]
    bar: foo
}`

    for i := 0; i < b.N; i++ {
        val := cuecontext.New().CompileString(src)
        err := val.Err()
        if err != nil {
            b.Error(err)
        }
    }
}
$ go test ./reference_performance_hit_test.go -test.bench=. -test.benchmem
goos: darwin
goarch: amd64
cpu: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz
BenchmarkWithoutReference-4          816           1364573 ns/op          537393 B/op       8564 allocs/op
BenchmarkWithReference-4              49          23182309 ns/op         7547367 B/op     127824 allocs/op
PASS
ok      command-line-arguments  3.705s

What did you expect to see?

Similar performance between the two benchmark examples, the use of a reference and repeating the same literal values.

What did you see instead?

The use of a reference causes the compilation to be ~ 17 times slower. (The difference was even more drastic with my full schema.)

marxarelli commented 7 months ago

Here are CPU and memory profiles of each benchmark as well. Lots of time spent in runtime.madvise and heap allocation in BenchmarkWithReference which I guess explains the big difference in both allocs/ops and cpu time.

$ go test ./reference_performance_hit_test.go -test.bench=WithReference -test.benchmem -cpuprofile cpu-withref.out
goos: darwin
goarch: amd64
cpu: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz
BenchmarkWithReference-4              50          32446582 ns/op         7547144 B/op     127820 allocs/op
PASS
ok      command-line-arguments  3.185s

$ go test ./reference_performance_hit_test.go -test.bench=WithoutReference -test.benchmem -cpuprofile cpu-withoutref.out
goos: darwin
goarch: amd64
cpu: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz
BenchmarkWithoutReference-4          806           1392781 ns/op          537588 B/op       8564 allocs/op
PASS
ok      command-line-arguments  1.686s

$ go tool pprof -top -cum cpu-withref.out | head -n 20
Type: cpu
Time: Feb 9, 2024 at 4:30pm (PST)
Duration: 2.81s, Total samples = 2.66s (94.52%)
Showing nodes accounting for 2.44s, 91.73% of 2.66s total
Dropped 95 nodes (cum <= 0.01s)
      flat  flat%   sum%        cum   cum%
         0     0%     0%      1.69s 63.53%  runtime.systemstack
     1.11s 41.73% 41.73%      1.11s 41.73%  runtime.madvise
         0     0% 41.73%      1.06s 39.85%  runtime.(*mheap).allocSpan
         0     0% 41.73%      1.06s 39.85%  runtime.sysUsed (inline)
         0     0% 41.73%      1.06s 39.85%  runtime.sysUsedOS (inline)
         0     0% 41.73%      1.02s 38.35%  runtime.(*mheap).alloc.func1
         0     0% 41.73%      0.55s 20.68%  cuelang.org/go/internal/core/adt.(*OpContext).unify
         0     0% 41.73%      0.55s 20.68%  cuelang.org/go/internal/core/adt.(*Vertex).Finalize
         0     0% 41.73%      0.55s 20.68%  cuelang.org/go/internal/core/adt.(*nodeContext).completeArcs
     0.01s  0.38% 42.11%      0.55s 20.68%  cuelang.org/go/internal/core/adt.(*nodeContext).expandDisjuncts
         0     0% 42.11%      0.55s 20.68%  cuelang.org/go/internal/core/adt.(*nodeContext).postDisjunct
     0.01s  0.38% 42.48%      0.53s 19.92%  cuelang.org/go/cue.newVertexRoot
         0     0% 42.48%      0.52s 19.55%  cuelang.org/go/cue.newValueRoot
         0     0% 42.48%      0.51s 19.17%  cuelang.org/go/cue.(*Context).make

$ go tool pprof -top -cum cpu-withoutref.out | head -n 20
Type: cpu
Time: Feb 9, 2024 at 4:30pm (PST)
Duration: 1.41s, Total samples = 1.32s (93.61%)
Showing nodes accounting for 1.32s, 100% of 1.32s total
      flat  flat%   sum%        cum   cum%
         0     0%     0%      0.72s 54.55%  command-line-arguments.BenchmarkWithoutReference
         0     0%     0%      0.71s 53.79%  cuelang.org/go/cue.(*Context).CompileString
         0     0%     0%      0.69s 52.27%  testing.(*B).runN
         0     0%     0%      0.68s 51.52%  testing.(*B).launch
     0.01s  0.76%  0.76%      0.61s 46.21%  cuelang.org/go/internal/core/adt.(*OpContext).unify
         0     0%  0.76%      0.61s 46.21%  cuelang.org/go/internal/core/adt.(*Vertex).Finalize
     0.01s  0.76%  1.52%      0.61s 46.21%  cuelang.org/go/internal/core/adt.(*nodeContext).completeArcs
     0.01s  0.76%  2.27%      0.61s 46.21%  cuelang.org/go/internal/core/adt.(*nodeContext).expandDisjuncts
     0.01s  0.76%  3.03%      0.61s 46.21%  cuelang.org/go/internal/core/adt.(*nodeContext).postDisjunct
         0     0%  3.03%      0.60s 45.45%  cuelang.org/go/cue.newValueRoot
         0     0%  3.03%      0.60s 45.45%  cuelang.org/go/cue.newVertexRoot
         0     0%  3.03%      0.59s 44.70%  cuelang.org/go/cue.(*Context).compile
         0     0%  3.03%      0.59s 44.70%  cuelang.org/go/cue.(*Context).make
         0     0%  3.03%      0.54s 40.91%  runtime.systemstack
     0.02s  1.52%  4.55%      0.40s 30.30%  runtime.mallocgc

$ go test ./reference_performance_hit_test.go -test.bench=WithReference -test.benchmem -memprofile mem-withref.out
goos: darwin
goarch: amd64
cpu: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz
BenchmarkWithReference-4              45          26692517 ns/op         7546921 B/op     127821 allocs/op
PASS
ok      command-line-arguments  1.647s

$ go test ./reference_performance_hit_test.go -test.bench=WithoutReference -test.benchmem -memprofile mem-withoutref.out
goos: darwin
goarch: amd64
cpu: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz
BenchmarkWithoutReference-4          820           1326363 ns/op          537366 B/op       8563 allocs/op
PASS
ok      command-line-arguments  1.514s

$ go tool pprof -top -cum mem-withref.out | head -n 40
Type: alloc_space
Time: Feb 9, 2024 at 4:40pm (PST)
Showing nodes accounting for 324.88MB, 94.47% of 343.88MB total
Dropped 80 nodes (cum <= 1.72MB)
      flat  flat%   sum%        cum   cum%
         0     0%     0%   336.88MB 97.96%  cuelang.org/go/internal/core/adt.(*OpContext).unify
         0     0%     0%   336.88MB 97.96%  cuelang.org/go/internal/core/adt.(*nodeContext).completeArcs
       1MB  0.29%  0.29%   336.88MB 97.96%  cuelang.org/go/internal/core/adt.(*nodeContext).expandDisjuncts
         0     0%  0.29%   336.88MB 97.96%  cuelang.org/go/internal/core/adt.(*nodeContext).postDisjunct
         0     0%  0.29%   179.06MB 52.07%  cuelang.org/go/internal/core/adt.(*Vertex).Finalize
         0     0%  0.29%   123.05MB 35.78%  cuelang.org/go/internal/core/adt.(*Vertex).IsDefined (inline)
         0     0%  0.29%   117.51MB 34.17%  cuelang.org/go/internal/core/adt.(*nodeContext).disjunctError
         0     0%  0.29%   117.51MB 34.17%  cuelang.org/go/internal/core/adt.(*nodeContext).makeError
         0     0%  0.29%   114.01MB 33.15%  cuelang.org/go/cue/errors.(*list).RemoveMultiples
         0     0%  0.29%   114.01MB 33.15%  cuelang.org/go/cue/errors.Sanitize
         0     0%  0.29%   114.01MB 33.15%  cuelang.org/go/cue/errors.list.sanitize (inline)
    0.50MB  0.15%  0.44%    83.01MB 24.14%  cuelang.org/go/cue/errors.list.Sort (inline)
         0     0%  0.44%    82.51MB 23.99%  cuelang.org/go/cue/errors.list.Less
         0     0%  0.44%    82.51MB 23.99%  sort.Sort
         0     0%  0.44%    82.51MB 23.99%  sort.pdqsort
    3.50MB  1.02%  1.45%    79.01MB 22.97%  cuelang.org/go/cue/errors.String (inline)
         0     0%  1.45%    79.01MB 22.97%  cuelang.org/go/internal/core/adt.(*ValueError).Error
         0     0%  1.45%    79.01MB 22.97%  sort.insertionSort
         0     0%  1.45%    75.51MB 21.96%  cuelang.org/go/cue/errors.writeErr
   62.01MB 18.03% 19.49%    74.51MB 21.67%  cuelang.org/go/internal/core/adt.(*ValueError).Path
    0.50MB  0.15% 19.63%    71.01MB 20.65%  cuelang.org/go/internal/core/adt.(*nodeContext).addExprConjunct
         0     0% 19.63%    66.01MB 19.19%  cuelang.org/go/internal/core/adt.(*nodeContext).evalExpr
         0     0% 19.63%    50.35MB 14.64%  cuelang.org/go/internal/core/adt.(*nodeContext).checkClosed
         0     0% 19.63%    50.35MB 14.64%  cuelang.org/go/internal/core/adt.verifyArc2
   36.53MB 10.62% 30.25%    48.03MB 13.97%  cuelang.org/go/internal/core/adt.(*OpContext).NewPosf
         0     0% 30.25%    39.51MB 11.49%  cuelang.org/go/cue.newVertexRoot
       1MB  0.29% 30.54%    39.50MB 11.49%  cuelang.org/go/internal/core/adt.(*nodeContext).addVertexConjuncts
    3.50MB  1.02% 31.56%    36.50MB 10.61%  cuelang.org/go/internal/core/adt.(*nodeContext).addStruct
         0     0% 31.56%       34MB  9.89%  cuelang.org/go/internal/core/adt.(*nodeContext).insertConjuncts
    0.50MB  0.15% 31.71%       33MB  9.60%  cuelang.org/go/cue.newValueRoot
         0     0% 31.71%       31MB  9.02%  cuelang.org/go/cue/errors.approximateEqual
         0     0% 31.71%    28.53MB  8.30%  cuelang.org/go/internal/core/adt.(*OpContext).Newf
   28.50MB  8.29% 40.00%    28.50MB  8.29%  cuelang.org/go/internal/core/adt.clone
       2MB  0.58% 40.58%    28.03MB  8.15%  cuelang.org/go/internal/core/adt.(*OpContext).NewErrf (inline)
    1.50MB  0.44% 41.01%    27.50MB  8.00%  cuelang.org/go/internal/core/adt.(*nodeContext).reportConflict

$ go tool pprof -top -cum mem-withoutref.out | head -n 40
Type: alloc_space
Time: Feb 9, 2024 at 4:40pm (PST)
Showing nodes accounting for 444.17MB, 93.67% of 474.17MB total
Dropped 90 nodes (cum <= 2.37MB)
      flat  flat%   sum%        cum   cum%
         0     0%     0%   360.15MB 75.95%  cuelang.org/go/internal/core/adt.(*OpContext).unify
         0     0%     0%   356.65MB 75.21%  cuelang.org/go/internal/core/adt.(*nodeContext).completeArcs
    3.50MB  0.74%  0.74%   356.65MB 75.21%  cuelang.org/go/internal/core/adt.(*nodeContext).expandDisjuncts
         0     0%  0.74%   356.65MB 75.21%  cuelang.org/go/internal/core/adt.(*nodeContext).postDisjunct
         0     0%  0.74%   307.07MB 64.76%  cuelang.org/go/internal/core/adt.(*Vertex).Finalize
    1.50MB  0.32%  1.05%   232.06MB 48.94%  cuelang.org/go/internal/core/adt.(*nodeContext).addExprConjunct
         0     0%  1.05%   220.56MB 46.51%  cuelang.org/go/internal/core/adt.(*nodeContext).evalExpr
         0     0%  1.05%   214.04MB 45.14%  cuelang.org/go/cue.(*Context).CompileString
         0     0%  1.05%   198.53MB 41.87%  cuelang.org/go/cue.newVertexRoot
         0     0%  1.05%   195.54MB 41.24%  command-line-arguments.BenchmarkWithoutReference
       2MB  0.42%  1.48%   191.53MB 40.39%  cuelang.org/go/cue.newValueRoot
         0     0%  1.48%   177.54MB 37.44%  testing.(*B).runN
         0     0%  1.48%   170.54MB 35.96%  testing.(*B).launch
         0     0%  1.48%   153.52MB 32.38%  cuelang.org/go/cue.(*Context).make
         0     0%  1.48%   142.02MB 29.95%  cuelang.org/go/cue.(*Context).compile
         0     0%  1.48%   130.53MB 27.53%  cuelang.org/go/internal/core/adt.(*nodeContext).insertConjuncts
         0     0%  1.48%   104.02MB 21.94%  cuelang.org/go/internal/core/adt.(*OpContext).evalState
         0     0%  1.48%    97.52MB 20.57%  cuelang.org/go/internal/core/adt.(*CallExpr).evaluate
         0     0%  1.48%    94.02MB 19.83%  cuelang.org/go/internal/core/adt.(*OpContext).evaluateRec
    6.50MB  1.37%  2.85%    84.51MB 17.82%  cuelang.org/go/internal/core/adt.(*nodeContext).addStruct
         0     0%  2.85%    83.52MB 17.61%  cuelang.org/go/internal/core/runtime.(*Runtime).Compile
       5MB  1.05%  3.90%    72.01MB 15.19%  cuelang.org/go/internal/core/adt.(*nodeContext).addVertexConjuncts
         0     0%  3.90%    70.02MB 14.77%  cuelang.org/go/cue/build.(*Instance).AddFile
      11MB  2.32%  6.22%    67.01MB 14.13%  cuelang.org/go/cue/parser.(*parser).parseField
    2.50MB  0.53%  6.75%    67.01MB 14.13%  cuelang.org/go/cue/parser.(*parser).parseFieldList
         0     0%  6.75%    65.01MB 13.71%  cuelang.org/go/cue/parser.ParseFile
         0     0%  6.75%    65.01MB 13.71%  cuelang.org/go/cue/parser.(*parser).parseBinaryExpr
         0     0%  6.75%    65.01MB 13.71%  cuelang.org/go/cue/parser.(*parser).parseExpr
         0     0%  6.75%    64.01MB 13.50%  cuelang.org/go/cue/parser.(*parser).parseRHS
       1MB  0.21%  6.96%    63.51MB 13.39%  cuelang.org/go/cue/parser.(*parser).parseOperand
         0     0%  6.96%    63.51MB 13.39%  cuelang.org/go/cue/parser.(*parser).parsePrimaryExpr
       1MB  0.21%  7.17%    63.51MB 13.39%  cuelang.org/go/cue/parser.(*parser).parseUnaryExpr
       1MB  0.21%  7.38%    63.01MB 13.29%  cuelang.org/go/cue/parser.(*parser).parseStruct
         0     0%  7.38%    63.01MB 13.29%  cuelang.org/go/cue/parser.(*parser).parseStructBody
         0     0%  7.38%    60.51MB 12.76%  cuelang.org/go/cue/build.(*Instance).parse
marxarelli commented 6 months ago

I was able to reduce the test case quite a bit more.

-- go.mod --
module cue.example/reference-performance

go 1.21

require cuelang.org/go v0.7.1-0.20240209203336-3d1c0cb3f5e0

require (
    github.com/cockroachdb/apd/v3 v3.2.1 // indirect
    github.com/google/uuid v1.2.0 // indirect
    github.com/mpvl/unique v0.0.0-20150818121801-cbe035fff7de // indirect
    golang.org/x/net v0.19.0 // indirect
    golang.org/x/text v0.14.0 // indirect
    gopkg.in/yaml.v3 v3.0.1 // indirect
)
-- go.sum --
cuelang.org/go v0.7.1-0.20240209203336-3d1c0cb3f5e0 h1:WT1b9NFC7omcyCVuAsLz6JiOkpM+nkcApS8YCkWcgMM=
cuelang.org/go v0.7.1-0.20240209203336-3d1c0cb3f5e0/go.mod h1:yBUgoS5sUpd10S4FtLFiCagmQALmMhm5GLpYngbK4zk=
github.com/cockroachdb/apd/v3 v3.2.1 h1:U+8j7t0axsIgvQUqthuNm82HIrYXodOV2iWLWtEaIwg=
github.com/cockroachdb/apd/v3 v3.2.1/go.mod h1:klXJcjp+FffLTHlhIG69tezTDvdP065naDsHzKhYSqc=
github.com/emicklei/proto v1.10.0 h1:pDGyFRVV5RvV+nkBK9iy3q67FBy9Xa7vwrOTE+g5aGw=
github.com/emicklei/proto v1.10.0/go.mod h1:rn1FgRS/FANiZdD2djyH7TMA9jdRDcYQ9IEN9yvjX0A=
github.com/go-quicktest/qt v1.101.0 h1:O1K29Txy5P2OK0dGo59b7b0LR6wKfIhttaAhHUyn7eI=
github.com/go-quicktest/qt v1.101.0/go.mod h1:14Bz/f7NwaXPtdYEgzsx46kqSxVwTbzVZsDC26tQJow=
github.com/google/go-cmp v0.6.0 h1:ofyhxvXcZhMsU5ulbFiLKl/XBFqE1GSq7atu8tAmTRI=
github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/google/uuid v1.2.0 h1:qJYtXnJRWmpe7m/3XlyhrsLrEURqHRM2kxzoxXqyUDs=
github.com/google/uuid v1.2.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE=
github.com/kr/pretty v0.3.1/go.mod h1:hoEshYVHaxMs3cyo3Yncou5ZscifuDolrwPKZanG3xk=
github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
github.com/lib/pq v1.10.7 h1:p7ZhMD+KsSRozJr34udlUrhboJwWAgCg34+/ZZNvZZw=
github.com/lib/pq v1.10.7/go.mod h1:AlVN5x4E4T544tWzH6hKfbfQvm3HdbOxrmggDNAPY9o=
github.com/mitchellh/go-wordwrap v1.0.1 h1:TLuKupo69TCn6TQSyGxwI1EblZZEsQ0vMlAFQflz0v0=
github.com/mitchellh/go-wordwrap v1.0.1/go.mod h1:R62XHJLzvMFRBbcrT7m7WgmE1eOyTSsCt+hzestvNj0=
github.com/mpvl/unique v0.0.0-20150818121801-cbe035fff7de h1:D5x39vF5KCwKQaw+OC9ZPiLVHXz3UFw2+psEX+gYcto=
github.com/mpvl/unique v0.0.0-20150818121801-cbe035fff7de/go.mod h1:kJun4WP5gFuHZgRjZUWWuH1DTxCtxbHDOIJsudS8jzY=
github.com/protocolbuffers/txtpbfmt v0.0.0-20230328191034-3462fbc510c0 h1:sadMIsgmHpEOGbUs6VtHBXRR1OHevnj7hLx9ZcdNGW4=
github.com/protocolbuffers/txtpbfmt v0.0.0-20230328191034-3462fbc510c0/go.mod h1:jgxiZysxFPM+iWKwQwPR+y+Jvo54ARd4EisXxKYpB5c=
github.com/rogpeppe/go-internal v1.12.0 h1:exVL4IDcn6na9z1rAb56Vxr+CgyK3nn3O+epU5NdKM8=
github.com/rogpeppe/go-internal v1.12.0/go.mod h1:E+RYuTGaKKdloAfM02xzb0FW3Paa99yedzYV+kq4uf4=
golang.org/x/mod v0.14.0 h1:dGoOF9QVLYng8IHTm7BAyWqCqSheQ5pYWGhzW00YJr0=
golang.org/x/mod v0.14.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c=
golang.org/x/net v0.19.0 h1:zTwKpTd2XuCqf8huc7Fo2iSy+4RHPd10s4KzeTnVr1c=
golang.org/x/net v0.19.0/go.mod h1:CfAk/cbD4CthTvqiEl8NpboMuiuOYsAr/7NOjZJtv1U=
golang.org/x/text v0.14.0 h1:ScX5w1eTa3QqT8oi6+ziP7dTV1S2+ALU0bI+0zXKWiQ=
golang.org/x/text v0.14.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU=
golang.org/x/tools v0.16.1 h1:TLyB3WofjdOEepBHAU20JdNC1Zbg87elYofWYAY5oZA=
golang.org/x/tools v0.16.1/go.mod h1:kYVVN6I1mBNoB1OX+noeBjbRk4IUEPa7JJ+TJMEooJ0=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127 h1:qIbj1fsPNlZgppZ+VLlY7N33q108Sa+fhmuc+sWQYwY=
gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
-- reference_performance_hit_test.go --
package main

import (
    "testing"

    "cuelang.org/go/cue/cuecontext"
)

var schema = `
#Chain: [...(#Op1 | #Op2)]

#Op1: {
    op1: string
}

#Op2: {
    op2: string
}
`

func benchmark(b *testing.B, src string) {
    b.Helper()

    for i := 0; i < b.N; i++ {
        val := cuecontext.New().CompileString(src)
        err := val.Err()
        if err != nil {
            b.Error(err)
        }
    }
}

func BenchmarkConjunctWithoutReference(b *testing.B) {
    benchmark(b, schema+`
        foo: #Chain & [
            { op1: "one" },
            { op2: "two" },
        ]

        bar: #Chain & [
            { op1: "one" },
            { op2: "two" },
        ]`,
    )
}

func BenchmarkConjunctWithReference(b *testing.B) {
    benchmark(b, schema+`
        foo: #Chain & [
            { op1: "one" },
            { op2: "two" },
        ]

        bar: #Chain & foo`,
    )
}

func BenchmarkSimpleReference(b *testing.B) {
    benchmark(b, schema+`
        foo: #Chain & [
            { op1: "one" },
            { op2: "two" },
        ]

        bar: foo`,
    )
}
$ go test -bench=. -benchmem ./reference_performance_hit_test.go
goos: darwin
goarch: amd64
cpu: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz
BenchmarkConjunctWithoutReference-4        12111            105184 ns/op           56720 B/op        771 allocs/op
BenchmarkConjunctWithReference-4            9166            139163 ns/op           74776 B/op        960 allocs/op
BenchmarkSimpleReference-4                 13618             92896 ns/op           51241 B/op        691 allocs/op
PASS
ok      command-line-arguments  6.465s

The problem is much less pronounced with the minimal test case but still what seems most curious to me is the difference between bar: #Chain & foo in BenchmarkConjunctWithReference and bar: foo in BenchmarkSimpleReference. Considering foo in both examples is already a conjunct of #Chain, it seems strange (at least from an abstract/logical standpoint) that there would be any difference at all.

Also notable, this issue only seems to arise when #Chain is a list of disjuncts. When it is a list of a single non-sum definition, I don't see a difference.

@myitcv I know you mentioned on Slack that pprof is not very helpful here, but I thought I would include this pprof diff graph just in case as I do see a lot more CPU time spent in adt.(*nodeContext).expandDisjuncts for the BenchmarkConjunctWithReference benchmark, though admittedly, I am a bit lost here.

pprof-cpu-diff-BenchmarkSimpleReference-vs-BenchmarkConjunctWithReference

The graph was produced with:

$ go test -cpuprofile simple-ref-cpu.prof -bench=BenchmarkSimpleReference ./reference_performance_hit_test.go
goos: darwin
goarch: amd64
cpu: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz
BenchmarkSimpleReference-4         14245             86266 ns/op
PASS
ok      command-line-arguments  2.468s
$ go test -cpuprofile conjunct-ref-cpu.prof -bench=BenchmarkConjunctWithReference ./reference_performance_hit_test.go
goos: darwin
goarch: amd64
cpu: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz
BenchmarkConjunctWithReference-4            8438            133349 ns/op
PASS
ok      command-line-arguments  2.278s
$ go tool pprof -diff_base simple-ref-cpu.prof -web conjunct-ref-cpu.prof
renanay commented 5 months ago

I just encountered the same issue. Is this going to be fixed in next versions?

myitcv commented 5 months ago

Thanks @marxarelli for raising. Per the recent performance calls, this falls under https://github.com/cue-lang/cue/issues/2852. I've marked it as a linked issue from there to revisit.

You can also see from the Evaluator roadmap, that Marcel is working on this particular problem as part of the new evaluator.

You can subscribe to https://github.com/cue-lang/cue/issues/2850 for updates on the progress of the evaluator. We are starting to post these once a fortnight.