In #5810, OpenTelemetry support was introduced. It would be valuable to extend this functionality by providing default metrics for the workflow engine, allowing for better observability and performance tracking. The goal of this issue is to gather feedback on which additional metrics the community would like to see included.
Proposed Metrics
Counters
Workflows Started: Total number of workflows that have been initiated.
Workflows Resumed: Total number of workflows that have resumed after being suspended.
Workflows Faulted: Total number of workflows that have encountered errors.
Workflows Suspended: Total number of workflows that have been paused.
Activities Executed: Total number of activities that have been performed.
Activities Faulted: Total number of activities that have resulted in errors.
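To make the counter semantics concrete, here is a minimal, language-neutral sketch in Python. It is not the engine's API or the OpenTelemetry SDK. The event names and the dotted metric names are hypothetical and exist only to show that each proposed counter is a monotonic total incremented once per matching lifecycle event.

```python
from collections import Counter

# Hypothetical event names and metric names, chosen for illustration only.
EVENT_TO_COUNTER = {
    "workflow_started": "workflows.started",
    "workflow_resumed": "workflows.resumed",
    "workflow_faulted": "workflows.faulted",
    "workflow_suspended": "workflows.suspended",
    "activity_executed": "activities.executed",
    "activity_faulted": "activities.faulted",
}


def count_events(events):
    """Return a running total for each proposed counter."""
    # Start every counter at zero so unused metrics are still reported.
    totals = Counter({name: 0 for name in EVENT_TO_COUNTER.values()})
    for event in events:
        metric = EVENT_TO_COUNTER.get(event)
        if metric is not None:  # events without a proposed counter are ignored
            totals[metric] += 1  # counters only ever increase
    return totals


totals = count_events([
    "workflow_started",
    "activity_executed",
    "activity_executed",
    "activity_faulted",
    "workflow_suspended",
    "workflow_resumed",
    "workflow_started",
    "unknown_event",
])
```

Whether a faulted activity should also increment Activities Executed is left open in this sketch; here the two are counted independently.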
Gauges
Active Workflows: Number of workflows currently in progress (i.e., not completed or terminated).
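Unlike a counter, this gauge can go down as well as up. The following Python sketch shows one possible reading of the definition above, under the assumption that a workflow counts as active from the moment it starts until it completes or is terminated. The class and method names are hypothetical. Under this literal reading, suspended workflows would still be counted as active, which may or may not be the desired behavior.

```python
class ActiveWorkflowGauge:
    """Tracks workflows that have started but not completed or terminated."""

    def __init__(self):
        # A set keeps the gauge correct even if an event is delivered twice.
        self._active = set()

    def on_started(self, workflow_id):
        self._active.add(workflow_id)

    def on_completed_or_terminated(self, workflow_id):
        # discard() is a no-op for unknown ids, so the value never goes negative.
        self._active.discard(workflow_id)

    @property
    def value(self):
        return len(self._active)


gauge = ActiveWorkflowGauge()
gauge.on_started("wf-1")
gauge.on_started("wf-2")
gauge.on_completed_or_terminated("wf-1")
gauge.on_completed_or_terminated("wf-1")  # duplicate event, ignored
```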
Histograms
Activity Execution Time: Distribution of execution times for individual activities.
Workflow Execution Time: Distribution of total execution times for entire workflows.
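For reference, a histogram stores a count per duration bucket plus a running sum and count, rather than every individual measurement. The Python sketch below illustrates that idea only. The bucket boundaries (in seconds) are arbitrary assumptions, and the `Histogram` class is hypothetical, not part of the engine or of any OpenTelemetry SDK.

```python
from bisect import bisect_left


class Histogram:
    """Explicit-bucket histogram; each upper bound is inclusive."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        # One bucket per bound, plus a final overflow bucket.
        self.counts = [0] * (len(self.bounds) + 1)
        self.total = 0.0
        self.count = 0

    def record(self, seconds):
        # bisect_left returns the first bucket whose upper bound is >= seconds.
        self.counts[bisect_left(self.bounds, seconds)] += 1
        self.total += seconds
        self.count += 1


# Assumed boundaries in seconds, chosen only for illustration.
activity_execution_time = Histogram([0.1, 1.0, 10.0])
for duration in (0.05, 0.1, 0.5, 12.0):
    activity_execution_time.record(duration)
```

Useful default boundaries for activities and for whole workflows are likely to differ, since workflows can run far longer than a single activity.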
Community Input
Please comment with any other metrics that would help you track the performance and health of the workflow engine.