Allow hybrid query to skip parallel score collection by core TopDocsCollector

As part of the performance optimization for hybrid query, we need to find a way to minimize the time taken by the parallel score collection that runs in core with TopDocsCollector. According to information collected during profiling, these calls take 40-80% of CPU time.

As a baseline we're taking the results from the previous PR related to hybrid query optimization. They are based on version 2.13 and the noaa OSB workload; all times are in ms:

One sub-query that selects 11M documents

Bool: p50 97.9306 | p90 116.299
Hybrid: p50 228.696 | p90 249.665

One sub-query that selects 1.6K documents

Bool: p50 87.3152 | p90 89.3061
Hybrid: p50 89.9654 | p90 92.349

Three sub-queries that select 15M documents

Bool: p50 97.9891 | p90 114.396
Hybrid: p50 353.631 | p90 377.527
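
To make the gap easier to read, the numbers above can be turned into hybrid-to-bool latency ratios. The snippet below is only a small arithmetic helper, not part of any benchmark tooling; the class and method names are made up for illustration. At p50 it gives roughly 2.3x for the 11M-document case, about 1.03x for the 1.6K-document case, and roughly 3.6x for the three sub-query case, which suggests the overhead grows with the number of collected documents and sub-queries.

```java
public class HybridSlowdown {
    // Ratio of hybrid latency to bool latency; values above 1.0 mean hybrid is slower.
    static double slowdown(double boolMs, double hybridMs) {
        return hybridMs / boolMs;
    }

    public static void main(String[] args) {
        String[] scenarios = {
            "One sub-query, 11M documents",
            "One sub-query, 1.6K documents",
            "Three sub-queries, 15M documents"
        };
        // Columns: {bool p50, hybrid p50, bool p90, hybrid p90}, all in ms,
        // copied from the baseline results listed above.
        double[][] ms = {
            {97.9306, 228.696, 116.299, 249.665},
            {87.3152, 89.9654, 89.3061, 92.349},
            {97.9891, 353.631, 114.396, 377.527}
        };
        for (int i = 0; i < scenarios.length; i++) {
            System.out.printf("%s: p50 x%.2f | p90 x%.2f%n",
                scenarios[i],
                slowdown(ms[i][0], ms[i][1]),
                slowdown(ms[i][2], ms[i][3]));
        }
    }
}
```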

Most likely this will be a compound change in both core OpenSearch and the neural-search plugin. The preferred approach is to provide a capability to skip or ignore TopDocsCollector in the core QueryPhaseSearcher (core side) and, using that new option, call only HybridQueryDocsCollector (plugin side).
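
A rough illustration of the idea, as a toy model only: none of the types below are real OpenSearch or Lucene classes. `DocCollector`, `CountingCollector`, `buildCollectors` and the `skipDefault` flag are hypothetical names invented for this sketch; the actual core API is what the feature request below is meant to define. The point is just that when the caller opts out, the default collector never sees the hits, so each document is scored and collected once instead of twice.

```java
import java.util.ArrayList;
import java.util.List;

public class CollectorSkipSketch {
    // Hypothetical, simplified collector contract (not the Lucene Collector API).
    interface DocCollector {
        void collect(int docId, float score);
    }

    // Stand-in for either collector; it only counts how many hits it was asked to process.
    static class CountingCollector implements DocCollector {
        int collected = 0;

        @Override
        public void collect(int docId, float score) {
            collected++;
        }
    }

    // Hypothetical core-side hook: include the default top-docs collector
    // only when the caller has not asked to skip it.
    static List<DocCollector> buildCollectors(DocCollector defaultTopDocs,
                                              DocCollector pluginCollector,
                                              boolean skipDefault) {
        List<DocCollector> collectors = new ArrayList<>();
        if (!skipDefault) {
            collectors.add(defaultTopDocs);
        }
        collectors.add(pluginCollector);
        return collectors;
    }

    // Feeds every matching document to every registered collector.
    static void search(int[] docIds, float[] scores, List<DocCollector> collectors) {
        for (int i = 0; i < docIds.length; i++) {
            for (DocCollector collector : collectors) {
                collector.collect(docIds[i], scores[i]);
            }
        }
    }

    public static void main(String[] args) {
        int[] docIds = {1, 2, 3};
        float[] scores = {0.9f, 0.5f, 0.1f};
        CountingCollector topDocs = new CountingCollector();
        CountingCollector hybrid = new CountingCollector();

        search(docIds, scores, buildCollectors(topDocs, hybrid, true));
        System.out.println("default collector hits: " + topDocs.collected);
        System.out.println("hybrid collector hits: " + hybrid.collected);
    }
}
```

With `skipDefault` set to `false` the same search drives both collectors, which corresponds to the duplicated collection work seen in profiling today.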

https://github.com/opensearch-project/OpenSearch/issues/13170 - feature request in core to allow the caller to pass an empty query collector context
