This PR makes the DistinctString and ScopedDistinctString collectors go faster by not tracking diff in cases where we don't use the diff
We made similar performance improvements in DistinctValue collector in https://github.com/grafana/tempo/pull/4104, and now making the similar performance improvements in the DistinctString and ScopedDistinctString collectors in this PR.
how fast?
DistinctString collector is around 54.93% faster with 43% improvements in mem. usage.
ScopedDistinctString collector is from 27% to 49% faster with 45% improvements in mem. usage.
What this PR does:
This PR makes the DistinctString and ScopedDistinctString collectors go faster by not tracking diff in cases where we don't use the diff
We made similar performance improvements in DistinctValue collector in https://github.com/grafana/tempo/pull/4104, and now making the similar performance improvements in the DistinctString and ScopedDistinctString collectors in this PR.
how fast?
Benchmarks:
DistinctString collector
```text goos: darwin goarch: amd64 pkg: github.com/grafana/tempo/pkg/collector cpu: VirtualApple @ 2.50GHz │ BenchmarkDistinctStringCollect_main.txt │ BenchmarkDistinctStringCollect_fast.txt │ │ sec/op │ sec/op vs base │ DistinctStringCollect/uniques_limit:0 29.57µ ± 1% 13.23µ ± 0% -55.25% (p=0.000 n=10) DistinctStringCollect/duplicates_limit:0 29.86µ ± 0% 13.45µ ± 0% -54.94% (p=0.000 n=10) DistinctStringCollect/uniques_limit:100000 29.48µ ± 0% 13.29µ ± 0% -54.93% (p=0.000 n=10) DistinctStringCollect/duplicates_limit:100000 29.85µ ± 0% 13.45µ ± 1% -54.94% (p=0.000 n=10) DistinctStringCollect/uniques_limit:1000000 29.48µ ± 0% 13.40µ ± 0% -54.53% (p=0.000 n=10) DistinctStringCollect/duplicates_limit:1000000 29.85µ ± 0% 13.55µ ± 0% -54.60% (p=0.000 n=10) DistinctStringCollect/uniques_limit:10000000 29.50µ ± 0% 13.39µ ± 0% -54.61% (p=0.000 n=10) DistinctStringCollect/duplicates_limit:10000000 29.85µ ± 1% 13.45µ ± 0% -54.92% (p=0.000 n=10) geomean 29.68µ 13.40µ -54.84% │ BenchmarkDistinctStringCollect_main.txt │ BenchmarkDistinctStringCollect_fast.txt │ │ B/op │ B/op vs base │ DistinctStringCollect/uniques_limit:0 12.563Ki ± 0% 7.133Ki ± 0% -43.23% (p=0.000 n=10) DistinctStringCollect/duplicates_limit:0 12.562Ki ± 0% 7.133Ki ± 0% -43.22% (p=0.000 n=10) DistinctStringCollect/uniques_limit:100000 12.563Ki ± 0% 7.133Ki ± 0% -43.23% (p=0.000 n=10) DistinctStringCollect/duplicates_limit:100000 12.563Ki ± 0% 7.133Ki ± 0% -43.23% (p=0.000 n=10) DistinctStringCollect/uniques_limit:1000000 12.563Ki ± 0% 7.133Ki ± 0% -43.23% (p=0.000 n=10) DistinctStringCollect/duplicates_limit:1000000 12.563Ki ± 0% 7.133Ki ± 0% -43.23% (p=0.000 n=10) DistinctStringCollect/uniques_limit:10000000 12.562Ki ± 0% 7.133Ki ± 0% -43.22% (p=0.000 n=10) DistinctStringCollect/duplicates_limit:10000000 12.562Ki ± 0% 7.133Ki ± 0% -43.22% (p=0.000 n=10) geomean 12.56Ki 7.133Ki -43.22% │ BenchmarkDistinctStringCollect_main.txt │ BenchmarkDistinctStringCollect_fast.txt │ │ allocs/op │ allocs/op vs base │ DistinctStringCollect/uniques_limit:0 121.0 ± 0% 112.0 ± 0% -7.44% (p=0.000 n=10) DistinctStringCollect/duplicates_limit:0 121.0 ± 0% 112.0 ± 0% -7.44% (p=0.000 n=10) DistinctStringCollect/uniques_limit:100000 121.0 ± 0% 112.0 ± 0% -7.44% (p=0.000 n=10) DistinctStringCollect/duplicates_limit:100000 121.0 ± 0% 112.0 ± 0% -7.44% (p=0.000 n=10) DistinctStringCollect/uniques_limit:1000000 121.0 ± 0% 112.0 ± 0% -7.44% (p=0.000 n=10) DistinctStringCollect/duplicates_limit:1000000 121.0 ± 0% 112.0 ± 0% -7.44% (p=0.000 n=10) DistinctStringCollect/uniques_limit:10000000 121.0 ± 0% 112.0 ± 0% -7.44% (p=0.000 n=10) DistinctStringCollect/duplicates_limit:10000000 121.0 ± 0% 112.0 ± 0% -7.44% (p=0.000 n=10) geomean 121.0 112.0 -7.44% ```ScopedDistinctString collector
```text goos: darwin goarch: amd64 pkg: github.com/grafana/tempo/pkg/collector cpu: VirtualApple @ 2.50GHz │ BenchmarkScopedDistinctStringCollect_main.txt │ BenchmarkScopedDistinctStringCollect_fast.txt │ │ sec/op │ sec/op vs base │ ScopedDistinctStringCollect/uniques_limit:0-11 707.2m ± 2% 357.7m ± 2% -49.42% (p=0.000 n=10) ScopedDistinctStringCollect/duplicates_limit:0-11 110.45µ ± 0% 80.61µ ± 0% -27.02% (p=0.000 n=10) ScopedDistinctStringCollect/uniques_limit:100000-11 2.574m ± 0% 1.724m ± 0% -33.01% (p=0.000 n=10) ScopedDistinctStringCollect/duplicates_limit:100000-11 110.26µ ± 1% 80.52µ ± 0% -26.97% (p=0.000 n=10) ScopedDistinctStringCollect/uniques_limit:1000000-11 26.13m ± 1% 17.13m ± 0% -34.45% (p=0.000 n=10) ScopedDistinctStringCollect/duplicates_limit:1000000-11 109.42µ ± 0% 80.06µ ± 1% -26.84% (p=0.000 n=10) ScopedDistinctStringCollect/uniques_limit:10000000-11 642.3m ± 2% 323.8m ± 1% -49.59% (p=0.000 n=10) ScopedDistinctStringCollect/duplicates_limit:10000000-11 109.49µ ± 0% 79.93µ ± 0% -27.00% (p=0.000 n=10) geomean 2.859m 1.858m -35.00% │ BenchmarkScopedDistinctStringCollect_main.txt │ BenchmarkScopedDistinctStringCollect_fast.txt │ │ B/op │ B/op vs base │ ScopedDistinctStringCollect/uniques_limit:0-11 178.78Mi ± 0% 97.02Mi ± 0% -45.73% (p=0.000 n=10) ScopedDistinctStringCollect/duplicates_limit:0-11 47.43Ki ± 0% 25.71Ki ± 0% -45.80% (p=0.000 n=10) ScopedDistinctStringCollect/uniques_limit:100000-11 1495.6Ki ± 0% 826.9Ki ± 0% -44.71% (p=0.000 n=10) ScopedDistinctStringCollect/duplicates_limit:100000-11 47.43Ki ± 0% 25.71Ki ± 0% -45.80% (p=0.000 n=10) ScopedDistinctStringCollect/uniques_limit:1000000-11 12.247Mi ± 0% 6.891Mi ± 0% -43.73% (p=0.000 n=10) ScopedDistinctStringCollect/duplicates_limit:1000000-11 47.43Ki ± 0% 25.70Ki ± 0% -45.80% (p=0.000 n=10) ScopedDistinctStringCollect/uniques_limit:10000000-11 177.69Mi ± 0% 95.91Mi ± 0% -46.02% (p=0.000 n=10) ScopedDistinctStringCollect/duplicates_limit:10000000-11 47.43Ki ± 0% 25.71Ki ± 0% -45.80% (p=0.000 n=10) geomean 1.128Mi 630.1Ki -45.43% │ BenchmarkScopedDistinctStringCollect_main.txt │ BenchmarkScopedDistinctStringCollect_fast.txt │ │ allocs/op │ allocs/op vs base │ ScopedDistinctStringCollect/uniques_limit:0-11 1.075M ± 0% 1.038M ± 0% -3.50% (p=0.000 n=10) ScopedDistinctStringCollect/duplicates_limit:0-11 490.0 ± 0% 452.0 ± 0% -7.76% (p=0.000 n=10) ScopedDistinctStringCollect/uniques_limit:100000-11 10.63k ± 0% 10.38k ± 0% -2.29% (p=0.000 n=10) ScopedDistinctStringCollect/duplicates_limit:100000-11 490.0 ± 0% 452.0 ± 0% -7.76% (p=0.000 n=10) ScopedDistinctStringCollect/uniques_limit:1000000-11 108.2k ± 0% 104.6k ± 0% -3.28% (p=0.000 n=10) ScopedDistinctStringCollect/duplicates_limit:1000000-11 490.0 ± 0% 452.0 ± 0% -7.76% (p=0.000 n=10) ScopedDistinctStringCollect/uniques_limit:10000000-11 1003.2k ± 0% 965.4k ± 0% -3.77% (p=0.000 n=10) ScopedDistinctStringCollect/duplicates_limit:10000000-11 490.0 ± 0% 452.0 ± 0% -7.76% (p=0.000 n=10) geomean 9.589k 9.061k -5.51% ```Which issue(s) this PR fixes: Fixes #
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]