WebOfTrust / keria

KERI Agent in the cloud
https://keria.readthedocs.io/en/latest/
Apache License 2.0

Add instrumentation to KERIA #259

Open pfeairheller opened 1 week ago

pfeairheller commented 1 week ago

Investigate Prometheus or a similar instrumentation library for adding metrics to each Doer, so that the processing occurring in any installation of KERIA can be captured.

SmithSamuelM commented 1 week ago

I looked briefly at Prometheus (really just read the readme), but the description reminds me a lot of the tooling that we had to build for autonomous vehicle systems.

A lot of features similar to those of Prometheus were implemented in the Logger, Log, and Loggee behaviors in Ioflo, which we would have gotten for free if I had had the time to port them to Hio. Anyway, I'm not sure how Prometheus plugs in, but it seems reasonable, at least from a high-level description.

Anyway, it makes me want to put metrified logs on the roadmap for Hio. That is not going to happen anytime soon, so it is not something to wait for, but it would certainly be advantageous down the road to have more tightly integrated metrified logging for Hio Doers and DoDoers, logging that eats its own dog food for flow-based architecture systems.

For example, the logging code in ioflo is < 1000 lines because it leverages all the ioflo supporting libraries. A Logger is a special-purpose DoDoer-like object for logging. A Log is a file that can have multiple log streams (loggees). A Loggee is a routed log stream that logs a flow buffer location. So config consists of specifying the path to the Loggee and the rules for triggering when to save a value to the log file.
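The Logger / Log / Loggee relationship described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the ioflo API: the class shapes, the `rule` values (`"always"`, `"update"`), the dictionary standing in for the flow buffer, and the `vehicle.*` paths are all assumptions made for the example.

```python
import io


class Loggee:
    """Hypothetical routed log stream: watches one path in a shared store."""

    def __init__(self, path, rule="always"):
        self.path = path
        self.rule = rule          # "always", or "update" (log only on change)
        self.last = object()      # sentinel so the first sample always fires

    def sample(self, store):
        """Return (fired, value) according to the trigger rule."""
        value = store[self.path]
        if self.rule == "update" and value == self.last:
            return False, value
        self.last = value
        return True, value


class Log:
    """Hypothetical log: one output stream fed by several loggees."""

    def __init__(self, stream, loggees):
        self.stream = stream
        self.loggees = loggees

    def write(self, tick, store):
        for loggee in self.loggees:
            fired, value = loggee.sample(store)
            if fired:
                self.stream.write(f"{tick}\t{loggee.path}\t{value}\n")


class Logger:
    """Hypothetical DoDoer-like object that runs every log once per cycle."""

    def __init__(self, logs):
        self.logs = logs

    def run(self, tick, store):
        for log in self.logs:
            log.write(tick, store)


# Config is just: which path each loggee reads, and its trigger rule.
store = {"vehicle.speed": 1.5, "vehicle.depth": 10.0}
stream = io.StringIO()  # stands in for the log file
logger = Logger([Log(stream, [Loggee("vehicle.speed", "always"),
                              Loggee("vehicle.depth", "update")])])

for tick in range(3):
    store["vehicle.speed"] += 0.5   # speed changes every cycle, depth never
    logger.run(tick, store)

lines = stream.getvalue().splitlines()
```

Here the speed is written every cycle while the unchanged depth is written only once, which is the kind of trigger rule the config would specify.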

The graphing etc. that Prometheus does I did with scripts I wrote for Matplotlib (back in the day). See, for example, this video:

https://github.com/WebOfTrust/keria/assets/602685/c3068f6f-04a4-4ebe-8455-a49b03617f44

rubelhassan commented 1 day ago

I've explored instrumenting with the Prometheus Python client. It offers four types of metric: Counter, Gauge, Summary, and Histogram.
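For reference, a minimal sketch of the four metric types using the `prometheus_client` package. The metric names, the `doer` label, and the sample values are made up for illustration and are not anything KERIA defines; a private `CollectorRegistry` is used so the example does not touch the global default registry.

```python
from prometheus_client import (
    CollectorRegistry, Counter, Gauge, Histogram, Summary, generate_latest,
)

registry = CollectorRegistry()

# Counter: a value that only goes up (e.g. number of invocations).
invocations = Counter("doer_invocations", "Times a doer was run",
                      ["doer"], registry=registry)

# Gauge: a value that can go up and down (e.g. doers currently active).
active = Gauge("doers_active", "Doers currently scheduled", registry=registry)

# Summary: tracks count and sum of observations (e.g. run durations).
run_summary = Summary("doer_run_seconds", "Time spent per run",
                      registry=registry)

# Histogram: counts observations into configurable buckets.
run_hist = Histogram("doer_run_duration_seconds", "Run duration by bucket",
                     buckets=(0.005, 0.05, 0.5, float("inf")),
                     registry=registry)

# Record some hypothetical observations.
invocations.labels(doer="example").inc()
active.set(3)
run_summary.observe(0.012)
run_hist.observe(0.012)

# Text exposition format, as a Prometheus server would scrape it.
exposition = generate_latest(registry).decode("utf-8")
```

In a running service the registry would normally be exposed over HTTP (for example with `prometheus_client.start_http_server`) rather than rendered to a string.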

Could you please elaborate on which behaviors or metrics we're interested in for any specific doer? For example, how many times or how frequently a doer of an agent is invoked, or the resource consumption of a doer of an agent for a specific task. This would help assess the feasibility of using Prometheus.
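To make the two candidate metrics concrete (invocation count and time spent per doer), here is a rough, library-free sketch. It assumes a doer can be treated as a plain Python generator that yields once per cycle; it does not use the hio `Doer` API, and it ignores values sent into the generator. `DoerStats`, `instrument`, and `escrow_doer` are hypothetical names.

```python
import time
from collections import defaultdict


class DoerStats:
    """Hypothetical collector: per-doer invocation counts and wall time."""

    def __init__(self):
        self.invocations = defaultdict(int)
        self.seconds = defaultdict(float)

    def instrument(self, name, gen):
        """Wrap a generator so each resumption is counted and timed."""
        def wrapped():
            while True:
                start = time.perf_counter()
                try:
                    value = next(gen)
                    done = False
                except StopIteration:
                    value, done = None, True
                self.seconds[name] += time.perf_counter() - start
                if done:
                    return
                self.invocations[name] += 1
                yield value
        return wrapped()


def escrow_doer(cycles):
    """Stand-in doer: does a little work, then yields control each cycle."""
    for _ in range(cycles):
        sum(range(1000))
        yield


stats = DoerStats()
for _ in stats.instrument("escrow", escrow_doer(5)):
    pass  # a scheduler would interleave other doers here
```

The counts and durations collected this way map naturally onto a Counter and a Histogram (or Summary) respectively, if Prometheus is chosen.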