yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com

[DocDB] Add the ability for the master & tserver to export additional metrics from a file #23227

Open iSignal opened 2 months ago

iSignal commented 2 months ago

Jira Link: DB-12165

Description

On k8s pods, YBA collects a few additional metrics of interest, for example the sum of PG RSS, the count of active/inactive PG connections, and the days until DB cert expiration. Scraping these on k8s requires yet another agent, so it would be great if the tserver could be used to collect these metrics instead.

Proposal: The tserver supports a flag, --additional_metrics_file=<path>. When it is specified, the tserver reads metrics in standard Prometheus text format (key value) from this file into memory every 10s and returns them from its regular metrics endpoint alongside its existing metrics. The full Prometheus format does not need to be supported, just metric names with labels and values. The additional metrics file is expected to contain fewer than 100 metrics in general.
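To make the proposed subset concrete, here is a minimal Python sketch of the parsing and reload step. It is only an illustration, not the tserver implementation (which would live in the C++ codebase). The function names, the sample metric names, and the choice to keep the previous snapshot when the file is unreadable are all assumptions made for this example.

```python
import re

# One sample per line: metric name, optional {labels}, then a numeric value.
# Timestamps and the HELP/TYPE metadata of the full format are not handled.
# Assumption: label values do not contain a literal '}' character.
_LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{[^}]*\})?\s+(\S+)$')


def parse_additional_metrics(text):
    """Return a list of (name, labels, value) tuples, skipping bad lines."""
    metrics = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith('#'):
            continue  # blank lines and comment lines are ignored
        match = _LINE_RE.match(line)
        if match is None:
            continue  # malformed line: skip it rather than fail the reload
        name, labels, value = match.groups()
        try:
            number = float(value)  # also accepts NaN, +Inf and -Inf
        except ValueError:
            continue
        metrics.append((name, labels or '', number))
    return metrics


def reload_metrics(path, previous):
    """Re-read the file; keep the previous snapshot if it cannot be read.

    Hypothetical helper: a caller would invoke this on a 10s timer.
    """
    try:
        with open(path, 'r') as f:
            return parse_additional_metrics(f.read())
    except OSError:
        return previous
```

A periodic task would call `reload_metrics` every 10s and append the resulting in-memory list to the output of the regular metrics endpoint. Whether stale values should be kept or dropped when the file disappears is a design choice this sketch does not settle.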

@yorq @amannijhawan @anmalysh-yb @lingamsandeep

Issue Type

kind/bug


amannijhawan commented 2 months ago

For reference, here is similar functionality that node exporter implements: https://github.com/prometheus/node_exporter/blob/b9d0932179a0c5b3a8863f3d6cdafe8584cedc8e/collector/textfile.go. In the node exporter, the textfile collector seems to take a directory as input, read all files in that directory, and add their contents to the metrics.
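For comparison with the single-file flag in the proposal, a directory-based reader in the style described above could look roughly like the Python sketch below. The function name is hypothetical, and filtering on a `.prom` suffix is an assumption of this sketch modeled on the node exporter convention, not something the proposal specifies.

```python
import glob
import os


def read_metrics_directory(directory):
    """Concatenate the contents of every *.prom file in a directory.

    Files are read in sorted order so the combined output is deterministic.
    Files that cannot be read are skipped instead of failing the whole scan.
    """
    chunks = []
    for path in sorted(glob.glob(os.path.join(directory, '*.prom'))):
        try:
            with open(path, 'r') as f:
                chunks.append(f.read())
        except OSError:
            continue
    return '\n'.join(chunks)
```

A directory lets several independent writers each own one file, at the cost of a slightly more involved scan than reading a single path.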

Another thing we need to ensure is that the metrics file is written atomically, so that a partially written file is never read for metrics.
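One common way to get this guarantee on the writer side is to write to a temporary file in the same directory and then rename it over the target, since a rename within one filesystem replaces the file atomically on POSIX systems. The Python sketch below shows the pattern. The function name and temporary-file naming are hypothetical, and the actual writer could be a script in any language.

```python
import os
import tempfile


def write_metrics_atomically(path, content):
    """Write content to path so readers see the old or new file, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be in the same directory (same filesystem) as the
    # target, otherwise the final rename would not be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix='.metrics-', suffix='.tmp')
    try:
        with os.fdopen(fd, 'w') as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # atomic swap into place
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this pattern a reader that opens the file at any moment gets either the complete previous contents or the complete new contents.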