prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0

Enhance Open Telemetry Implementation in Presto #23975

Open sureshbabu-areekara opened 2 weeks ago

sureshbabu-areekara commented 2 weeks ago

OpenTelemetry is an observability framework that gives insight into the performance and behaviour of a system. It standardises the generation, collection, and management of telemetry data such as traces.

OSS Presto already has a basic OpenTelemetry implementation (https://github.com/prestodb/presto/pull/18534). It is an experimental feature, covers only a limited set of telemetry data (query state changes), and has no concept of child spans.

This enhancement will make Presto more flexible by supporting both parent and child spans. It will also allow spans to be propagated to the worker nodes.
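To illustrate what parent/child spans and propagation to workers mean in practice, here is a minimal, language-neutral sketch in Python. It is not Presto's SPI and does not use the OpenTelemetry SDK. `SpanContext`, `start_span`, `to_traceparent`, and `from_traceparent` are hypothetical names chosen for illustration. The only thing taken from a real specification is the W3C Trace Context `traceparent` header layout (`version-traceid-spanid-flags`), which is assumed here as the carrier between coordinator and worker.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpanContext:
    """Hypothetical, simplified span identity (not an OpenTelemetry class)."""
    trace_id: str                         # 32 hex chars, shared by all spans in one trace
    span_id: str                          # 16 hex chars, unique per span
    parent_span_id: Optional[str] = None  # None for the root (parent) span


def start_span(parent: Optional[SpanContext] = None) -> SpanContext:
    """Start a root span, or a child span that inherits the parent's trace id."""
    trace_id = parent.trace_id if parent else secrets.token_hex(16)
    parent_id = parent.span_id if parent else None
    return SpanContext(trace_id, secrets.token_hex(8), parent_id)


def to_traceparent(ctx: SpanContext) -> str:
    """Serialize a span as a W3C traceparent header value (version 00, sampled)."""
    return f"00-{ctx.trace_id}-{ctx.span_id}-01"


def from_traceparent(header: str) -> SpanContext:
    """Parse a traceparent header into the remote parent's span context."""
    parts = header.split("-")
    if len(parts) != 4:
        raise ValueError(f"malformed traceparent: {header!r}")
    version, trace_id, span_id, flags = parts
    if version != "00" or len(trace_id) != 32 or len(span_id) != 16 or len(flags) != 2:
        raise ValueError(f"malformed traceparent: {header!r}")
    return SpanContext(trace_id, span_id)


# Coordinator side: the query-level span is the parent.
query_span = start_span()
header = to_traceparent(query_span)       # would travel with the task request

# Worker side: rebuild the remote parent, then open a child span under it.
remote_parent = from_traceparent(header)
task_span = start_span(remote_parent)
```

In this sketch the worker's `task_span` shares the coordinator's trace id and records the query span as its parent, which is what lets a tracing backend stitch coordinator and worker activity into one trace. How the header would actually be carried in Presto's coordinator-to-worker communication is a design question for the RFC.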

tdcmeehan commented 2 weeks ago

Please note that there is an SPI and engine integration, and a plugin which integrates with OpenTelemetry. Can you describe the SPI and engine integration more thoroughly? For example, is any change required to communicate span information from the worker back to the coordinator? Can you create an RFC for this design that outlines the changes to the SPI, the engine (for example, how spans are communicated), and the implementation of the OpenTelemetry spec as a plugin?

sureshbabu-areekara commented 2 weeks ago

Hi @tdcmeehan, we have raised an RFC: https://github.com/prestodb/rfcs/pull/33. Below is the proposed architecture.

[Two images: proposed architecture diagrams]

Please let me know if any more information is required.

tdcmeehan commented 1 week ago

@sureshbabu-areekara thank you, but this diagram does not answer my questions. I have left more detailed feedback on the RFC. Please take a look.