Open karteekmurthys opened 3 months ago
cc: @majetideepak @amitkdutta
@pramodsatya gathered some stats on total time spent at different locations in Presto CPP where RECORD_HISTOGRAM is called. The Q67 query was run for over an hour and PrestoExchangeSource.cpp:117
took total of 3256278906 us in an hour of query run.
reg/add function line num
Add addRootPool Memory.cpp:248 76
clearThread Driver.h:153 166326
operator() ExchangeClient.cpp:163 87166
ExchangeClient.cpp:167 61760
ExchangeClient.cpp:170 31759
FileHandle.cpp:64 45
PrestoExchangeSource.cpp:117 3256278906
runInternal Driver.cpp:522 348599
updatePrestoExchangeSourceMemoryStats PeriodicTaskManager.cpp:268 18196
~MemoryPoolImpl MemoryPool.cpp:462 45
~ScopedArbitration SharedArbitrator.cpp:880 1110
SharedArbitrator.cpp:910 12
What if the implementation instead created a subprocess like the exposer mentioned https://github.com/jupp0r/prometheus-cpp. Resources can be controlled when starting the exposer. In uber we use the exposer as a separate process since we don't have to worry about reporting/maintainng the new http server
I don't think this will help directly here^ since I could repro this internally.
PrometheusStatsReporter updates worker stats synchronously. This is fine as most of the stats are updated through PeriodicTaskManager in a separate thread periodically without blocking queries.
But this specific metric is updated in the query path and is updated way too frequently.
The Q67 of SF10K TPCDS workload takes several hours to finish since we merged PrometheusStatsReporter. We are working on how to improve this in the stats reporter but we should also avoid reporting metrics in the query path if possible.
Request to review if
presto_cpp.http.client.presto_exchange_source.on_body_bytes
can be removed or made optional for tracking.We tested Q67 at SF10K by removing this metric and the query completes successfully.
We also have a workaround of updating histogram asynchronously in PrometheusStatsReporter. This solution was tested successfully as well against Q67