Describe the bug
I profiled an OpenSearch server running the term query operation from the big5 OSB workload. This query is very fast, so the intent was to find overhead unrelated to the actual work of searching indexes that could be optimized. The surprising finding is that TaskResourceTrackingService takes a little over 7% of the total CPU cycles. A big chunk of that work is simply marshaling the TaskResourceInfo object to and from a JSON string (in getTaskResourceUsageFromThreadContext() and writeTaskResourceUsage()).
Related component
Search:Performance
To Reproduce
This overhead will happen on any search, though for more expensive searches it may be less noticeable as the CPU will be dominated by other work.
Expected behavior
Assuming that the TaskResourceInfo is just being serialized for machine-to-machine communication, then it should use an efficient binary serialization to avoid the XContent/Jackson/JSON overhead in the hot path on searches.
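To make the suggestion concrete, here is a minimal sketch of a fixed-layout binary encoding built only on the JDK's DataOutputStream/DataInputStream. The ResourceUsageCodec class, the Usage type, and its field names are hypothetical stand-ins for illustration, not the real TaskResourceInfo or any OpenSearch API, and this is not necessarily how a fix would be implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical codec for illustration only; not part of OpenSearch.
final class ResourceUsageCodec {

    // Hypothetical stand-in for the per-task usage data (assumed fields).
    static final class Usage {
        final long taskId;
        final long parentTaskId;
        final String nodeId;
        final long cpuTimeNanos;
        final long memoryBytes;

        Usage(long taskId, long parentTaskId, String nodeId, long cpuTimeNanos, long memoryBytes) {
            this.taskId = taskId;
            this.parentTaskId = parentTaskId;
            this.nodeId = nodeId;
            this.cpuTimeNanos = cpuTimeNanos;
            this.memoryBytes = memoryBytes;
        }
    }

    // Writes the fields in a fixed order: no field names, no text parsing.
    static byte[] encode(Usage u) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream(64);
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeLong(u.taskId);
            out.writeLong(u.parentTaskId);
            out.writeUTF(u.nodeId); // 2-byte length prefix + modified UTF-8
            out.writeLong(u.cpuTimeNanos);
            out.writeLong(u.memoryBytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Reads the fields back in exactly the order encode() wrote them.
    static Usage decode(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            long taskId = in.readLong();
            long parentTaskId = in.readLong();
            String nodeId = in.readUTF();
            long cpuTimeNanos = in.readLong();
            long memoryBytes = in.readLong();
            return new Usage(taskId, parentTaskId, nodeId, cpuTimeNanos, memoryBytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Usage original = new Usage(42L, 7L, "node-1", 1_500_000L, 2_048L);
        byte[] bytes = encode(original);
        Usage copy = decode(bytes);
        System.out.println("encoded bytes: " + bytes.length + ", cpuTimeNanos: " + copy.cpuTimeNanos);
    }
}
```

The trade-off is that a positional format like this has no field names, so any change to the layout needs explicit versioning. If the value has to stay a string where it is currently stored, the bytes would also need a text-safe wrapping such as Base64, which gives back some of the savings.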
Additional Details
Here is a zoomed-in snippet of a profile showing getTaskResourceUsageFromThreadContext():
This code path was introduced in OpenSearch as a way to capture "query-level" resource usage. I'll look into a more efficient way to send the usage data.