opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

[BUG] Indexing performance degradation in main (3.x) #7918

Open bbarani opened 1 year ago

bbarani commented 1 year ago

Describe the bug

I noticed a relatively significant drop (~10%) in indexing performance on the main branch for the HTTP Logs workload, possibly due to changes introduced between May 16 and May 18, 2023. The mean indexing throughput was around 100k before the change but has come down to ~90k.
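
For reference, the ~10% figure follows from the two throughput numbers quoted in this report. A minimal Python sketch of that arithmetic, assuming the unit is documents per second (the report does not state it) and using the rounded 100k and 90k values rather than measured data; `relative_change` is a hypothetical helper name:

```python
def relative_change(baseline: float, current: float) -> float:
    """Return the fractional change from baseline to current (negative means a drop)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (current - baseline) / baseline


# Rounded figures quoted in the report; the unit (docs/s) is an assumption.
baseline_throughput = 100_000
current_throughput = 90_000

drop = relative_change(baseline_throughput, current_throughput)
print(f"Relative change in mean indexing throughput: {drop:.1%}")
```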

Screen Shot 2023-06-04 at 2 16 43 PM Screen Shot 2023-06-04 at 2 16 48 PM

andrross commented 1 year ago

Here's a 3 month view with a cropped y axis:

image

It's interesting that we saw a performance boost in early April that was never explained, and now it appears we may be returning to the previous baseline.

reta commented 1 year ago

Are these graphs (and in general, perf runs) available somewhere (in public)?

bbarani commented 1 year ago

@reta The above graphs were generated using internal runs. We are doing the final security review before surfacing the public performance dashboard. We should have some updates by this week. The public dashboard won't have historical data, though.

andrross commented 1 year ago

A little more context here from previous discussions:

On April 12 we saw a big jump in performance on 3.0:

image

The most closely correlated commit (based on the timeline) was this ImmutableOpenMap change from @nknize. However, that was backported to the 2.x branch and we never saw the same performance gain.

One thing to note about the current change is that the 3.0.0 distribution builds are currently broken, and the last successful build is from May 17, at this commit. All the performance runs after May 15 in the above graphs are built from that same commit.

bbarani commented 1 year ago

> Are these graphs (and in general, perf runs) available somewhere (in public)?

@reta The performance benchmarking page is live now at - http://opensearch.org/benchmarks

reta commented 1 year ago

> @reta The performance benchmarking page is live now at - http://opensearch.org/benchmarks

@bbarani this is awesome, thank you, worth an announcement on Slack! :rocket:

Bukhtawar commented 1 year ago

This is a really good addition, thanks @bbarani and team. I think it might be worthwhile to integrate https://github.com/async-profiler/async-profiler as well. This will help us better understand code paths that consume additional CPU cycles or create more garbage.

bbarani commented 1 year ago

@Bukhtawar Sure, we will look into it. @rishabh6788

In the meantime, we are looking to add out-of-the-box telemetry devices as part of the OpenSearch Benchmark tool as well.

opensearch-benchmark list telemetry

   ____                  _____                      __       ____                  __                         __
  / __ \____  ___  ____ / ___/___  ____ ___________/ /_     / __ )___  ____  _____/ /_  ____ ___  ____ ______/ /__
 / / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \   / __  / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
/ /_/ / /_/ /  __/ / / /__/ /  __/ /_/ / /  / /__/ / / /  / /_/ /  __/ / / / /__/ / / / / / / / / /_/ / /  / ,<
\____/ .___/\___/_/ /_/____/\___/\__,_/_/   \___/_/ /_/  /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/  /_/|_|
    /_/

Available telemetry devices:

Command                     Name                        Description
--------------------------  --------------------------  --------------------------------------------------------------------------------------------------------------------------------
jit                         JIT Compiler Profiler       Enables JIT compiler logs.
gc                          GC log                      Enables GC logs.
jfr                         Flight Recorder             Enables Java Flight Recorder (requires an Oracle JDK or OpenJDK 11+)
heapdump                    Heap Dump                   Captures a heap dump.
node-stats                  Node Stats                  Regularly samples node stats
recovery-stats              Recovery Stats              Regularly samples shard recovery stats
ccr-stats                   CCR Stats                   Regularly samples Cross Cluster Replication (CCR) leader and follower(s) checkpoint at index leveland calculates replication lag
segment-stats               Segment Stats               Determines segment stats at the end of the benchmark.
transform-stats             Transform Stats             Regularly samples transform stats
searchable-snapshots-stats  Searchable Snapshots Stats  Regularly samples searchable snapshots stats

Keep in mind that each telemetry device may incur a runtime overhead which can skew results.
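
A telemetry device from the listing above would be enabled when launching a benchmark run. The Python sketch below only builds and prints such a command line without executing it. It assumes the `execute-test` subcommand and the `--workload` and `--telemetry` options of the installed OpenSearch Benchmark release (confirm with `opensearch-benchmark --help`), and that `http_logs` is the workload name for HTTP Logs; `build_benchmark_command` is a hypothetical helper, not part of the tool.

```python
import shlex

# Device names copied from the "Command" column of the listing above.
KNOWN_DEVICES = {
    "jit", "gc", "jfr", "heapdump", "node-stats", "recovery-stats",
    "ccr-stats", "segment-stats", "transform-stats",
    "searchable-snapshots-stats",
}


def build_benchmark_command(workload, telemetry_devices):
    """Build an opensearch-benchmark command line (hypothetical helper)."""
    unknown = sorted(set(telemetry_devices) - KNOWN_DEVICES)
    if unknown:
        raise ValueError(f"unknown telemetry device(s): {', '.join(unknown)}")
    # Assumed CLI shape: execute-test, --workload, --telemetry. Verify with --help.
    cmd = ["opensearch-benchmark", "execute-test", f"--workload={workload}"]
    if telemetry_devices:
        cmd.append("--telemetry=" + ",".join(telemetry_devices))
    return cmd


command = build_benchmark_command("http_logs", ["gc", "jfr"])
print(shlex.join(command))
```

Running the same workload once with and once without the devices enabled is one way to estimate the overhead mentioned above.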

backslasht commented 1 year ago

@bbarani - From https://opensearch.org/benchmarks, I see the mean indexing throughput for arm64 has been above 100k for the last month. Does this issue still exist?

image