open-telemetry / opentelemetry-go

OpenTelemetry Go API and SDK
https://opentelemetry.io/docs/languages/go
Apache License 2.0

errors.Is and errors.As overhead on hot path #5551

Closed: pellared closed this issue 3 months ago

pellared commented 3 months ago

Have you checked the performance overhead? While I have nothing against using errors.Is and errors.As in general, I am uncertain whether we should use them here (it looks like a hot path, and we control the result of c.scopeInfo). As far as I remember, errors.Is uses some kind of reflection (internal/reflectlite).

CC @dashpole

_Originally posted by @pellared in https://github.com/open-telemetry/opentelemetry-go/pull/5535#discussion_r1654691798_
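
To illustrate the two patterns being weighed here, below is a minimal sketch; the sentinel errNoScope and the lookup function are hypothetical stand-ins, not the actual code from #5535:

package main

import (
    "errors"
    "fmt"
)

// errNoScope is a hypothetical sentinel error standing in for whatever
// c.scopeInfo returns; the caller fully controls this value.
var errNoScope = errors.New("no scope info")

func lookupScope(ok bool) error {
    if !ok {
        return errNoScope
    }
    return nil
}

func main() {
    err := lookupScope(false)

    // errors.Is walks the wrapped-error chain and does a small amount of
    // reflection internally; it is the safe, general-purpose check.
    if errors.Is(err, errNoScope) {
        fmt.Println("errors.Is: no scope")
    }

    // A plain comparison is cheaper and is valid when we control the error
    // value and know it is never wrapped, as on this hot path.
    if err == errNoScope {
        fmt.Println("==: no scope")
    }
}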

dmathieu commented 3 months ago

This is the benchstat output for the whole of the prometheus exporter:

pkg: go.opentelemetry.io/otel/exporters/prometheus
                │ bench-main  │            bench-branch            │
                │   sec/op    │   sec/op     vs base               │
Collect1-10       8.960µ ± 2%   8.988µ ± 2%       ~ (p=0.592 n=10)
Collect10-10      24.62µ ± 1%   24.71µ ± 1%       ~ (p=0.853 n=10)
Collect100-10     151.4µ ± 1%   150.5µ ± 1%       ~ (p=0.089 n=10)
Collect1000-10    1.656m ± 1%   1.656m ± 0%       ~ (p=0.853 n=10)
Collect10000-10   15.53m ± 4%   15.46m ± 0%       ~ (p=0.075 n=10)
geomean           243.7µ        243.5µ       -0.08%

                │  bench-main  │            bench-branch             │
                │     B/op     │     B/op      vs base               │
Collect1-10       34.62Ki ± 0%   34.62Ki ± 0%       ~ (p=1.000 n=10)
Collect10-10      46.87Ki ± 0%   46.87Ki ± 0%       ~ (p=1.000 n=10)
Collect100-10     172.9Ki ± 0%   172.9Ki ± 0%       ~ (p=0.724 n=10)
Collect1000-10    1.453Mi ± 0%   1.453Mi ± 0%       ~ (p=0.644 n=10)
Collect10000-10   13.93Mi ± 0%   13.93Mi ± 0%       ~ (p=0.579 n=10)
geomean           358.9Ki        358.9Ki       -0.00%

                │ bench-main  │             bench-branch             │
                │  allocs/op  │  allocs/op   vs base                 │
Collect1-10        69.00 ± 0%    69.00 ± 0%       ~ (p=1.000 n=10) ¹
Collect10-10       386.0 ± 0%    386.0 ± 0%       ~ (p=1.000 n=10) ¹
Collect100-10     3.559k ± 0%   3.559k ± 0%       ~ (p=1.000 n=10)
Collect1000-10    35.15k ± 0%   35.15k ± 0%       ~ (p=1.000 n=10)
Collect10000-10   351.0k ± 0%   351.0k ± 0%       ~ (p=0.617 n=10)
geomean           4.108k        4.108k       -0.00%
¹ all samples are equal

Here, main is the version using errors.Is, and branch is the version with that commit reverted.

For a relative comparison, I've also benchmarked just the two syntaxes on their own:

package main

import (
    "errors"
    "testing"
)

func BenchmarkErrorIs(b *testing.B) {
    // Set up the errors before resetting the timer so that setup cost is
    // excluded from the measurement.
    err := errors.New("test")
    err2 := errors.New("test2")

    b.ReportAllocs()
    b.ResetTimer()

    for i := 0; i < b.N; i++ {
        // Swap the two lines below to measure errors.Is versus a plain
        // comparison.
        //errors.Is(err, err2)
        _ = (err == err2)
    }
}
           │  bench-main  │          bench-branch           │
           │    sec/op    │    sec/op     vs base           │
ErrorIs-10   6.896n ± ∞ ¹   2.137n ± ∞ ¹  ~ (p=1.000 n=1) ²
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

           │ bench-main  │          bench-branch          │
           │    B/op     │    B/op      vs base           │
ErrorIs-10   0.000 ± ∞ ¹   0.000 ± ∞ ¹  ~ (p=1.000 n=1) ²
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal

           │ bench-main  │          bench-branch          │
           │  allocs/op  │  allocs/op   vs base           │
ErrorIs-10   0.000 ± ∞ ¹   0.000 ± ∞ ¹  ~ (p=1.000 n=1) ²
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal

Again, main is using errors.Is.

So there does seem to be a difference: roughly 7ns per call for errors.Is versus roughly 2ns for a plain comparison. Taken in isolation the relative difference looks huge, but I'm not sure an extra ~5ns per call really is a problem, and the whole-exporter benchmarks above show no measurable change.
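
For completeness, here is a sketch of how the same micro-comparison could be written as two separate benchmarks with a package-level sink, so the compiler cannot drop the comparison; this is illustrative only and is not what produced the numbers above:

package main

import (
    "errors"
    "testing"
)

var (
    errA = errors.New("test")
    errB = errors.New("test2")

    // sink keeps the result alive so the compiler cannot eliminate the
    // comparison inside the loop.
    sink bool
)

func BenchmarkErrorsIs(b *testing.B) {
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        sink = errors.Is(errA, errB)
    }
}

func BenchmarkDirectCompare(b *testing.B) {
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        sink = errA == errB
    }
}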

pellared commented 3 months ago

Then I assume internal/reflectlite is well optimized (I have not checked the code). Closing.
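
For reference, a rough sketch of what errors.Is does, based on my reading of the Go standard library (which uses internal/reflectlite; this sketch uses reflect because internal packages cannot be imported, and details may differ across Go versions): reflection is only used once, to check whether the target is comparable, after which the chain is walked with plain comparisons and optional Is methods.

package main

import (
    "errors"
    "fmt"
    "reflect"
)

// isSketch approximates errors.Is; it is not the actual standard-library code.
func isSketch(err, target error) bool {
    if target == nil {
        return err == nil
    }
    // One reflection call up front; the stdlib uses internal/reflectlite here.
    comparable := reflect.TypeOf(target).Comparable()
    for {
        if comparable && err == target {
            return true
        }
        // Allow error types to customize matching via an Is method.
        if x, ok := err.(interface{ Is(error) bool }); ok && x.Is(target) {
            return true
        }
        // Walk down the wrapped-error chain until it ends.
        if err = errors.Unwrap(err); err == nil {
            return false
        }
    }
}

func main() {
    sentinel := errors.New("sentinel")
    wrapped := fmt.Errorf("wrap: %w", sentinel)
    fmt.Println(isSketch(wrapped, sentinel)) // true
}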