it-at-m / digiwf-core

central workflow automation and integration platform based on the free process framework Camunda.
MIT License

Camunda Process Monitoring #1229

Closed simonhir closed 6 months ago

simonhir commented 8 months ago

Is your feature request related to a problem? Please describe.

As the provider of the digiwf platform, I want to know whether all processes work fine after a release. For that, it would be great to have an overview of started, running, completed, and failed process instances. As a process developer, I want to know whether a release (digiwf or process definition) broke one of my processes, and to be notified if incidents are increasing.

Describe the solution you'd like

A graphical overview of started/running/completed/failed process instances (for the complete cluster and per process). A good solution would be Grafana, using the Prometheus endpoint provided by Camunda. There should also be the option of notifications if incidents are increasing fast or 100% of process instances are failing (also possible with Grafana).
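To make the notification condition concrete, here is a minimal sketch of the decision logic in Python. It is only an illustration: the `Sample` type, the field names, and the `max_incident_growth` threshold are assumptions made for this example and do not come from Camunda, Prometheus, or Grafana. In practice the same rule would more likely be expressed as a Grafana alert rule on the scraped metrics.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical snapshot of process counters for one scrape interval."""
    started: int         # process instances started in the interval
    failed: int          # process instances that failed in the interval
    open_incidents: int  # incidents open at the end of the interval


def failure_ratio(sample: Sample) -> float:
    """Share of started instances that failed; 0.0 when nothing started."""
    if sample.started == 0:
        return 0.0
    return sample.failed / sample.started


def should_alert(previous: Sample, current: Sample,
                 max_incident_growth: int = 10) -> bool:
    """Alert when every started instance failed or incidents grew quickly.

    max_incident_growth is an assumed threshold, not a value from this issue.
    """
    all_failing = current.started > 0 and current.failed >= current.started
    incident_growth = current.open_incidents - previous.open_incidents
    return all_failing or incident_growth > max_incident_growth
```

For example, `should_alert(Sample(10, 0, 2), Sample(5, 5, 2))` returns `True` because all five started instances failed, even though the incident count did not change.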

Describe alternatives you've considered

There are no real alternatives. Other monitoring solutions such as appd would only detect failures when calling integrations, not logical or internal script errors. These monitoring tools would be more of an extension for deeper insight.

Acceptance Criteria

Additional context

darenegade commented 7 months ago

Hey team! Please add your planning poker estimate with Zenhub @darenegade @lehju @lmoesle @markostreich @simonhir @StephanStrehlerCGI

darenegade commented 7 months ago

Please add your planning poker estimate with Zenhub @dominikhorn93

dominikhorn93 commented 7 months ago

Everything still works on the demo environment: https://grafana-route-digiwf-demo-capmanaged.apps.capk.muenchen.de/d/rjaygWhnk/camunda-dashboard?orgId=1

The cause here is probably that the data was pulled from the old REST API, which now only runs in the Optimize environments.

darenegade commented 7 months ago

Decision record: The dashboard is only needed on full environments that include Optimize anyway. So nothing needs to be fixed here; only the documentation needs to be updated with the links.

Adjusting the dashboard and adding the alert remain part of this ticket.

zambrovski commented 7 months ago

But we are measuring far too little. I will add more monitors, and then we will take it from there.

darenegade commented 6 months ago

@simonhir is still checking whether everything runs on the environments. After that, this issue will be closed.

simonhir commented 6 months ago

It works. Further analysis and adjustments will happen as part of #1251.