opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

[Performance] Evaluate Document Parsing CPU Usage in Document Replication mode #9396

Open mgodwan opened 1 year ago

mgodwan commented 1 year ago

Is your feature request related to a problem? Please describe.

Today, when a document is replicated across multiple shards during indexing, each shard parses the source document itself to create its fields. This parsing can be avoided by letting the primary pass the parsed document instead of the source document. That would save the CPU spent in the IndexShard#prepareIndex step on each replica during indexing, and may improve throughput in the current document replication mode.

Describe the solution you'd like

Explore whether we can pass the parsed document to replica shards in document replication mode, to reduce the work that replicas need to perform.
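To make the proposal concrete, here is a toy sketch of the two replication strategies. It is not OpenSearch code: `ParseOnceSketch`, `Replica`, `parse`, and the `k=v;k=v` source format are all hypothetical stand-ins, and the sketch ignores the cost of serializing a parsed document over the network, which a real implementation would have to weigh against the parsing it saves.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only: every name here is hypothetical, not an OpenSearch class.
public class ParseOnceSketch {
    // Counts how often the (stand-in) parsing step runs.
    public static final AtomicInteger PARSE_CALLS = new AtomicInteger();

    // Stand-in for document parsing: turns "k=v;k=v" into a field map.
    public static Map<String, String> parse(String source) {
        PARSE_CALLS.incrementAndGet();
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : source.split(";")) {
            String[] kv = pair.split("=", 2);
            fields.put(kv[0], kv[1]);
        }
        return fields;
    }

    // Hypothetical replica that can index from raw source or from parsed fields.
    public static class Replica {
        public final List<Map<String, String>> indexed = new ArrayList<>();

        public void indexFromSource(String source) {
            indexed.add(parse(source)); // replica repeats the parsing work
        }

        public void indexParsed(Map<String, String> fields) {
            indexed.add(fields); // replica reuses the primary's result
        }
    }

    // Current behaviour as described in the issue: primary and every replica parse.
    public static int replicateSource(String source, List<Replica> replicas) {
        PARSE_CALLS.set(0);
        parse(source); // primary
        for (Replica r : replicas) {
            r.indexFromSource(source);
        }
        return PARSE_CALLS.get();
    }

    // Proposed behaviour: primary parses once and ships the parsed form.
    public static int replicateParsed(String source, List<Replica> replicas) {
        PARSE_CALLS.set(0);
        Map<String, String> fields = parse(source); // primary
        for (Replica r : replicas) {
            r.indexParsed(fields);
        }
        return PARSE_CALLS.get();
    }

    public static void main(String[] args) {
        String doc = "vendor=1;fare=12.5";
        List<Replica> replicas = List.of(new Replica(), new Replica());
        System.out.println("parse calls, source replication: " + replicateSource(doc, replicas));
        System.out.println("parse calls, parsed replication: " + replicateParsed(doc, replicas));
    }
}
```

In this model the number of parse calls drops from one per shard copy to one per document, which is the saving the issue is asking to evaluate.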

msfroh commented 1 year ago

@mgodwan -- The title of the issue says "evaluate", but then it proposes making the improvement.

Have you done the evaluation? If not, I think it would be relatively easy -- on a single node cluster, you can start generating indexing traffic, enable Java Flight Recorder, record for a few minutes, turn off JFR (or launch it with a specified recording time in the first place), then stop generating index traffic. From the recording, you can get a CPU flame graph to see how much time is spent in IndexShard#prepareIndex versus everything else.

My intuition is that a lot more work is probably done during the addDocs/updateDocs steps (e.g. running Analyzers), but if prepareIndex is expensive, then we could avoid doing it redundantly.
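As a rough sketch of the measurement described above: a recording can be started on a running JVM with something like `jcmd <pid> JFR.start duration=5m filename=indexing.jfr` (the file name is just an example), and the resulting file can then be summarized with the `jdk.jfr.consumer` API instead of reading a flame graph by eye. The class name `PrepareIndexShare` and the substring match on `IndexShard.prepareIndex` are assumptions made for illustration. The number it prints is the share of `jdk.ExecutionSample` stacks that contain that frame anywhere, so it approximates inclusive CPU time.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordedFrame;
import jdk.jfr.consumer.RecordedStackTrace;
import jdk.jfr.consumer.RecordingFile;

// Illustrative helper, not part of OpenSearch.
public class PrepareIndexShare {

    // Fraction of sampled stacks with at least one frame containing the needle.
    public static double share(List<List<String>> stacks, String needle) {
        if (stacks.isEmpty()) {
            return 0.0;
        }
        long hits = stacks.stream()
                .filter(frames -> frames.stream().anyMatch(f -> f.contains(needle)))
                .count();
        return (double) hits / stacks.size();
    }

    // Reads jdk.ExecutionSample events as lists of "fully.qualified.Class.method" strings.
    public static List<List<String>> readSamples(Path jfrFile) throws IOException {
        List<List<String>> stacks = new ArrayList<>();
        for (RecordedEvent event : RecordingFile.readAllEvents(jfrFile)) {
            if (!"jdk.ExecutionSample".equals(event.getEventType().getName())) {
                continue; // only CPU samples are of interest here
            }
            RecordedStackTrace trace = event.getStackTrace();
            if (trace == null) {
                continue;
            }
            List<String> frames = new ArrayList<>();
            for (RecordedFrame frame : trace.getFrames()) {
                frames.add(frame.getMethod().getType().getName() + "." + frame.getMethod().getName());
            }
            stacks.add(frames);
        }
        return stacks;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the .jfr file produced by the recording
        List<List<String>> stacks = readSamples(Path.of(args[0]));
        double pct = 100 * share(stacks, "IndexShard.prepareIndex");
        System.out.printf("samples=%d, stacks containing prepareIndex=%.1f%%%n", stacks.size(), pct);
    }
}
```

Running the same helper with a different needle (for example a frame from the addDocs/updateDocs path) would give a comparable figure for the other side of the comparison.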

mgodwan commented 1 year ago

@msfroh I've already done some analysis. I will share the details.

In the past, we have seen some workloads (e.g. nyc taxis, http logs) taking up to 15-20% of CPU for parsing/prepareIndex.

msfroh commented 1 year ago

> In the past, we have seen some workloads (e.g. nyc taxis, http logs) taking up to 15-20% of CPU for parsing/prepareIndex

Wow! That sounds like this could be a nice, comparatively-easy win. Please do share those details -- we should probably prioritize this work.