MarquezProject / marquez

Collect, aggregate, and visualize a data ecosystem's metadata
https://marquezproject.ai
Apache License 2.0

Expose HTTP endpoint SQL queries, queries count and execution time via Prometheus #2828

Open tatiana opened 1 month ago

tatiana commented 1 month ago

Context

Since the 0.7.0 release (#1906), Marquez supports pushing metrics to Prometheus.

This task proposes extending the current capability to give visibility into Marquez's SQL queries. Some of the questions we'd like answered:

By identifying potential bottlenecks in Marquez queries and the database, this extension could facilitate the provisioning of adequate resources. This, in turn, could lead to improved performance and efficiency of the database and Marquez itself.

Implementation

If possible, we could expose the frequency (count) and duration (gauge) of every query Marquez runs. This might be achievable at the JDBI layer: https://metrics.dropwizard.io/4.2.0/manual/jdbi.html
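To make the "count plus duration per query" idea concrete, here is a minimal, library-free sketch. It is not Marquez code and does not use the Dropwizard or JDBI APIs from the link above; `QueryMetrics`, `timed`, and the query name `JobDao.findJob` are hypothetical names chosen for illustration. In a real implementation the same bookkeeping would presumably hang off whatever hook the JDBI layer provides.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical stand-in for per-query metrics; not part of Marquez. */
final class QueryMetrics {
  // Number of executions per query name.
  private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();
  // Accumulated execution time per query name, in nanoseconds.
  private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();

  /** Runs the query, then records one execution and its elapsed time under queryName. */
  <T> T timed(String queryName, Supplier<T> query) {
    long start = System.nanoTime();
    try {
      return query.get();
    } finally {
      // Record in finally so that failing queries are counted as well.
      long elapsed = System.nanoTime() - start;
      counts.computeIfAbsent(queryName, k -> new LongAdder()).increment();
      totalNanos.computeIfAbsent(queryName, k -> new LongAdder()).add(elapsed);
    }
  }

  /** Returns how many times the named query ran (0 if never). */
  long count(String queryName) {
    LongAdder c = counts.get(queryName);
    return c == null ? 0L : c.sum();
  }

  /** Returns the accumulated duration of the named query in nanoseconds (0 if never run). */
  long totalNanos(String queryName) {
    LongAdder t = totalNanos.get(queryName);
    return t == null ? 0L : t.sum();
  }
}
```

A caller would wrap each DAO call, e.g. `metrics.timed("JobDao.findJob", () -> dao.findJob(name))`, where `dao.findJob` is again a placeholder. The sketch accumulates total time rather than keeping a single gauge value; which of the two is more useful for spotting bottlenecks is an open design choice.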

If this is not possible, we could add the instrumentation to specific write and read endpoints, covering at least the SQL queries triggered by the following endpoints:

The most critical are (*)

boring-cyborg[bot] commented 1 month ago

Thanks for opening your first issue in the Marquez project! Please be sure to follow the issue template!

tatiana commented 4 weeks ago

As of Marquez 0.47.0, the /metrics endpoint already exposes the following information:

Example of the information made available in this endpoint:

[Screenshot of the /metrics endpoint output, taken 2024-06-06 14:28:20]

We need to confirm whether further details are required.

tatiana commented 4 days ago

At the moment, the data of interest is available per Java method, but we would like this feature to also expose it per HTTP endpoint (e.g., POST api/v1/lineage).

tatiana commented 1 day ago

@mobuchowski's suggestion: add the HTTP verb and the endpoint path to the metric name, e.g. marquez_api_post_v1Lineage_db_JobDao

We need to investigate whether and how this could be accomplished, and whether there are better ways.
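As a starting point for that investigation, here is one possible reading of the naming scheme, sketched in plain Java. The rules are assumptions inferred from the single example marquez_api_post_v1Lineage_db_JobDao: lower-case the verb, drop the leading `api` path segment, camel-case the remaining segments, and append the DAO name. `EndpointMetricNames` and `metricName` are hypothetical names, not existing Marquez code.

```java
import java.util.Locale;

/** Hypothetical helper; naming rules are inferred from one example, not from Marquez. */
final class EndpointMetricNames {
  private EndpointMetricNames() {}

  /** Builds a name of the form marquez_api_<verb>_<camelCasedPath>_db_<dao>. */
  static String metricName(String httpVerb, String path, String daoName) {
    StringBuilder camelPath = new StringBuilder();
    boolean apiSkipped = false;
    for (String segment : path.split("/")) {
      // Keep only letters and digits, e.g. "{namespace}" becomes "namespace".
      String clean = segment.replaceAll("[^A-Za-z0-9]", "");
      if (clean.isEmpty()) {
        continue;
      }
      // Skip one leading "api" segment; the "marquez_api_" prefix already covers it.
      if (!apiSkipped && camelPath.length() == 0 && clean.equals("api")) {
        apiSkipped = true;
        continue;
      }
      if (camelPath.length() == 0) {
        camelPath.append(clean);
      } else {
        // Capitalize the first character of every later segment.
        camelPath.append(Character.toUpperCase(clean.charAt(0))).append(clean.substring(1));
      }
    }
    return "marquez_api_"
        + httpVerb.toLowerCase(Locale.ROOT)
        + "_" + camelPath
        + "_db_" + daoName;
  }
}
```

Under these assumed rules, `EndpointMetricNames.metricName("POST", "api/v1/lineage", "JobDao")` yields the name from the suggestion. Encoding the verb and path in the metric name keeps the existing per-DAO names recognizable, but it multiplies the number of distinct metrics; carrying the verb and path as labels instead might be one of the "better ways" worth comparing.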