hyperledger / aries-cloudagent-python

Hyperledger Aries Cloud Agent Python (ACA-Py) is a foundation for building decentralized identity applications and services running in non-mobile environments.
https://wiki.hyperledger.org/display/aries
Apache License 2.0
407 stars, 511 forks

Support for Prometheus Metrics #1385

Open solidnerd opened 3 years ago

solidnerd commented 3 years ago

Hey, I see two ways of supporting Prometheus metrics:

  1. Instrument ACA-Py directly with a Prometheus client library or with OpenTelemetry.
  2. Build an external Prometheus exporter that scrapes the ACA-Py API and exposes the relevant information.
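To make option 1 concrete, here is a minimal sketch of in-process instrumentation. It uses only the standard library to show what a counter looks like in the Prometheus text exposition format; a real integration would more likely use the official `prometheus_client` library or OpenTelemetry. The metric name `acapy_messages_handled_total` and the `handle_message` function are hypothetical and are not part of ACA-Py.

```python
import threading
from collections import defaultdict


class Counter:
    """A tiny thread-safe labelled counter (illustrative, not prometheus_client)."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._values = defaultdict(float)
        self._lock = threading.Lock()

    def inc(self, amount=1.0, **labels):
        # Each distinct label combination gets its own time series.
        key = tuple(sorted(labels.items()))
        with self._lock:
            self._values[key] += amount

    def render(self):
        """Return the counter in the Prometheus text exposition format."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        with self._lock:
            for key, value in sorted(self._values.items()):
                label_str = ",".join(f'{k}="{v}"' for k, v in key)
                suffix = f"{{{label_str}}}" if label_str else ""
                lines.append(f"{self.name}{suffix} {value}")
        return "\n".join(lines) + "\n"


# Hypothetical metric name; ACA-Py does not define it.
MESSAGES_HANDLED = Counter(
    "acapy_messages_handled_total", "Messages handled, by message type."
)


def handle_message(message_type):
    """Stand-in for an agent message handler that records a metric."""
    MESSAGES_HANDLED.inc(message_type=message_type)
```

Serving the output of `MESSAGES_HANDLED.render()` at a `/metrics` HTTP path would be enough for Prometheus to scrape it.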

Since my Python skills are limited, I probably can't implement this myself, but I would really like to help get it done.

What do you think?

swcurran commented 3 years ago

@ianco, given you did some work on tracing and collection, are you able to provide some feedback on this one? Or can you suggest someone else who could weigh in?

@solidnerd, this is perhaps a topic to raise on Hyperledger chat and/or to discuss at a User Group meeting.

ianco commented 3 years ago

The current event tracing and collection is primarily to track the message processing in a thread (for example during a credential exchange protocol) - part of this is tracking the time required to process each step, so there is some overlap with what Prometheus does, but it is not a complete match.

From my (limited) knowledge of Prometheus, you configure it to watch endpoints (such as the ACA-Py admin API, or the agent's published endpoint), and it monitors those endpoints and collects statistics. I'm not familiar enough to comment on the two options above. (For the existing tracing we emit JSON-formatted events to a collector such as EFK; I don't know whether Prometheus can collect these directly or whether we would need to make code changes in ACA-Py.)

What metrics are you interested in collecting?

tomaaron commented 3 years ago

> The current event tracing and collection is primarily to track the message processing in a thread (for example during a credential exchange protocol) - part of this is tracking the time required to process each step, so there is some overlap with what Prometheus does, but it is not a complete match.

This is exactly a match! If you already have the event tracking in place, then Prometheus' job is to expose these metrics so that other services can consume them (e.g., Grafana, to visualize them).
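As a rough illustration of how per-step timings from JSON trace events could be turned into something Prometheus understands, the sketch below aggregates events into a count and a total duration per step and renders them as a summary metric. The event field names `step` and `elapsed_ms` and the metric name are hypothetical; the actual fields emitted by ACA-Py tracing may differ.

```python
import json
from collections import defaultdict


def summarize_events(json_lines):
    """Aggregate per-step timings into [count, total seconds] pairs."""
    totals = defaultdict(lambda: [0, 0.0])
    for line in json_lines:
        event = json.loads(line)
        # "step" and "elapsed_ms" are hypothetical field names.
        entry = totals[event["step"]]
        entry[0] += 1
        entry[1] += event["elapsed_ms"] / 1000.0
    return totals


def render_summary(totals, name="acapy_step_duration_seconds"):
    """Render the aggregates as a Prometheus summary (sum and count only)."""
    lines = [f"# TYPE {name} summary"]
    for step, (count, total) in sorted(totals.items()):
        lines.append(f'{name}_sum{{step="{step}"}} {total}')
        lines.append(f'{name}_count{{step="{step}"}} {count}')
    return "\n".join(lines) + "\n"
```

With the sum and the count exposed, an average step duration over a time window can be derived on the Prometheus side by dividing the rate of the sum by the rate of the count.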

> From my (limited) knowledge of Prometheus, you configure it to watch endpoints (such as the ACA-Py admin API, or the agent's published endpoint), and it monitors those endpoints and collects statistics.

Exactly, this is called blackbox monitoring.

> I'm not familiar enough to comment on the two options above. (For the existing tracing we emit JSON-formatted events to a collector such as EFK; I don't know whether Prometheus can collect these directly or whether we would need to make code changes in ACA-Py.)

My recommendation is to build an external Prometheus exporter, so that in a Kubernetes-based setup the exporter can run as a sidecar container. This separation of concerns decouples generating metrics from exporting them.
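A minimal sketch of such an exporter, using only the standard library, might look like the following. On each scrape it probes the agent's admin API and reports whether the agent is ready and how long the probe took. It assumes the admin API answers `GET /status/ready` with a JSON body like `{"ready": true}`; the admin URL, the exporter port, and the metric names are hypothetical and would need to match the actual deployment.

```python
import json
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

ADMIN_URL = "http://localhost:8031"  # hypothetical admin API address


def probe_ready(admin_url=ADMIN_URL, opener=urllib.request.urlopen):
    """Return (up, seconds) for one readiness probe of the admin API."""
    start = time.monotonic()
    try:
        # Assumption: GET /status/ready returns JSON such as {"ready": true}.
        with opener(f"{admin_url}/status/ready", timeout=5) as resp:
            up = bool(json.load(resp).get("ready"))
    except (OSError, ValueError, AttributeError):
        # Connection failures, bad JSON, or an unexpected body count as "down".
        up = False
    return up, time.monotonic() - start


def render(up, seconds):
    """Render the probe result in the Prometheus text exposition format."""
    return (
        "# TYPE acapy_up gauge\n"
        f"acapy_up {int(up)}\n"
        "# TYPE acapy_probe_duration_seconds gauge\n"
        f"acapy_probe_duration_seconds {seconds:.6f}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render(*probe_ready()).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def main(port=9105):
    """Serve /metrics until interrupted; the port is an arbitrary choice."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `main()` starts the exporter. Run as a sidecar, it would share the pod's network with the agent, and Prometheus would scrape the exporter's `/metrics` path instead of the agent itself.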

> What metrics are you interested in collecting?

Everything that changes over time :) This should help us gain a better understanding of the state of ACA-Py.

JavierDonadoC commented 2 years ago

Hey,

Sorry to bring this topic up again. I'm trying to do something like what @solidnerd wanted. Since this issue was opened a year ago, does anybody know of a solution? Regarding @ianco's message, are these metrics/info exported at an endpoint?