We need to publish information that allows the performance of the service to be measured, in a Prometheus-compatible form, on a suitable port or ports. The information should consist of entries that belong in a time series; Prometheus will take care of counting, averaging, etc. This should include things like:
Notification each time a run of a plan is started, paused, resumed, aborted, stopped, completed
Notification each time a failure occurs at the Ophyd level
Notification each time a Run Engine error occurs - e.g. if it can't run a plan
Notification each time a Document is emitted plus info about its type etc.
Notification each time an Error occurs in response to a REST API request
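As a rough illustration of what "Prometheus-compatible" could mean for the notifications listed above, here is a minimal standard-library sketch that keeps one counter per kind of event and renders them in the Prometheus text exposition format on a `/metrics` endpoint. Everything named here is an assumption made for illustration: the `EventCounters` class, the `make_server` helper, and metric names such as `plan_run_events_total` are hypothetical and not part of the service. In practice an existing Prometheus client library would probably be used instead of hand-rolling this.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EventCounters:
    """Thread-safe event counters rendered in the Prometheus text format.

    Hypothetical helper for illustration only. Label values are assumed
    not to contain quotes, backslashes or newlines (no escaping is done).
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._help = {}    # metric name -> help text
        self._values = {}  # metric name -> {sorted label tuple: count}

    def declare(self, name, help_text):
        """Register a counter; only declared counters are rendered."""
        with self._lock:
            self._help[name] = help_text
            self._values.setdefault(name, {})

    def inc(self, name, **labels):
        """Record one occurrence of an event, e.g. a plan being paused."""
        key = tuple(sorted((k, str(v)) for k, v in labels.items()))
        with self._lock:
            series = self._values.setdefault(name, {})
            series[key] = series.get(key, 0) + 1

    def render(self):
        """Return all declared counters as Prometheus exposition text."""
        lines = []
        with self._lock:
            for name in sorted(self._help):
                lines.append(f"# HELP {name} {self._help[name]}")
                lines.append(f"# TYPE {name} counter")
                for key, value in sorted(self._values[name].items()):
                    label_text = ",".join(f'{k}="{v}"' for k, v in key)
                    suffix = "{" + label_text + "}" if label_text else ""
                    lines.append(f"{name}{suffix} {value}")
        return "\n".join(lines) + "\n"


def make_server(counters, port):
    """Build an HTTP server exposing the counters at /metrics."""

    class MetricsHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/metrics":
                self.send_error(404)
                return
            body = counters.render().encode("utf-8")
            self.send_response(200)
            self.send_header(
                "Content-Type", "text/plain; version=0.0.4; charset=utf-8"
            )
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return ThreadingHTTPServer(("", port), MetricsHandler)


# Hypothetical metric names, one per kind of notification in the list.
counters = EventCounters()
counters.declare("plan_run_events_total", "Plan run state changes")
counters.declare("ophyd_failures_total", "Failures at the Ophyd level")
counters.declare("run_engine_errors_total", "Run Engine errors")
counters.declare("documents_emitted_total", "Documents emitted, by type")
counters.declare("rest_api_errors_total", "Errors returned by the REST API")
```

The service would then call, for example, `counters.inc("plan_run_events_total", event="started")` or `counters.inc("documents_emitted_total", type="event")` at the point where each event happens, and `make_server(counters, port).serve_forever()` in a background thread. Prometheus scrapes the endpoint periodically and derives rates and averages from the raw counts, so the service itself only ever increments.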