StephenOTT / camunda-prometheus-process-engine-plugin

Monitor your KPIs! Camunda BPM Process Engine Plugin providing Prometheus monitoring, metric classes for use in BPMN, Grafana annotations, and HTTPServer data export. Used to generate Prometheus metrics anywhere in the engine, including BPMN, CMMN, and DMN engines and instances.
MIT License

Exception while closing command context #25

Closed Citrullin closed 4 years ago

Citrullin commented 5 years ago

Env: Camunda 7.9 in Tomcat 9.0.5

Steps to reproduce: I built the JAR and moved it into the Tomcat library folder, then added the following configuration to bpm-platform.xml:

...
        <plugin>
            <class>io.digitalstate.camunda.prometheus.PrometheusProcessEnginePlugin</class>
            <properties>
                <property name="port">9999</property>
                <property name="camundaReportingIntervalInSeconds">5</property>
                <property name="collectorYmlFilePath">/usr/local/bin/camunda/server/apache-tomcat-9.0.5/conf/prometheus.yml</property>
                <property name="bpmnDurationParseListener">true</property>
            </properties>
        </plugin>
...

Added the basic Prometheus metrics to prometheus.yml:

---
system:
- collector: io.digitalstate.camunda.prometheus.collectors.camunda.BpmnExecution
  enable: true
  startDate: 2015-10-03T17:59:38+00:00
  endDate: now
  startDelay: 0
  frequency: 5000
- collector: io.digitalstate.camunda.prometheus.collectors.camunda.DmnExecution
  enable: true
  startDate: 2015-10-03T17:59:38+00:00
  endDate: now
  startDelay: 0
  frequency: 5000
- collector: io.digitalstate.camunda.prometheus.collectors.camunda.JobExecutor
  enable: true
  startDate: 2015-10-03T17:59:38+00:00
  endDate: now
  startDelay: 0
  frequency: 5000

Restarted the server. It fails with the following log message:

[io.digitalstate.camunda.prometheus.collectors.camunda.BpmnExecution timer] org.camunda.commons.logging.BaseLogger.logError ENGINE-16004 Exception while closing command context: null
 java.lang.NullPointerException
    at org.camunda.bpm.engine.impl.persistence.entity.MeterLogManager.executeSelectSum(MeterLogManager.java:51)
    at org.camunda.bpm.engine.impl.metrics.MetricsQueryImpl$2.execute(MetricsQueryImpl.java:106)
    at org.camunda.bpm.engine.impl.metrics.MetricsQueryImpl.execute(MetricsQueryImpl.java:117)
    at org.camunda.bpm.engine.impl.interceptor.CommandExecutorImpl.execute(CommandExecutorImpl.java:24)
    at org.camunda.bpm.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:104)
    at org.camunda.bpm.engine.impl.interceptor.ProcessApplicationContextInterceptor.execute(ProcessApplicationContextInterceptor.java:66)
    at org.camunda.bpm.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:30)
    at org.camunda.bpm.engine.impl.metrics.MetricsQueryImpl.sum(MetricsQueryImpl.java:111)
    at io.digitalstate.camunda.prometheus.collectors.camunda.BpmnExecution.collectActivityInstancesStarted(BpmnExecution.java:48)
    at io.digitalstate.camunda.prometheus.collectors.camunda.BpmnExecution.collectAll(BpmnExecution.java:88)
    at io.digitalstate.camunda.prometheus.collectors.camunda.BpmnExecution$1.run(BpmnExecution.java:24)
    at java.util.TimerThread.mainLoop(Timer.java:555)
    at java.util.TimerThread.run(Timer.java:505)

StephenOTT commented 5 years ago

@Citrullin did you figure out what caused this? The most common cause of an NPE I have found is the YAML file being inaccessible, either because of a bad path or because of filesystem permissions.

Just to confirm, if you use the docker example in the repo code, does it work on your machine?
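To rule out the path and permission cause mentioned above, the configured file can be checked directly. This is a minimal sketch using only the JDK: the class `CollectorFileCheck` and its `describe` method are hypothetical names and are not part of the plugin. It should be run as the same OS user that starts Tomcat, with the value of `collectorYmlFilePath` as its argument.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the plugin: reports whether a file
// exists, is a regular file, and is readable by the current OS user.
public class CollectorFileCheck {

    static String describe(Path path) {
        if (!Files.exists(path)) {
            return "missing";
        }
        if (!Files.isRegularFile(path)) {
            return "not a regular file";
        }
        if (!Files.isReadable(path)) {
            return "not readable";
        }
        return "ok";
    }

    public static void main(String[] args) {
        // Pass the collectorYmlFilePath value as the first argument;
        // the fallback "prometheus.yml" is only a placeholder.
        Path path = Paths.get(args.length > 0 ? args[0] : "prometheus.yml");
        System.out.println(path.toAbsolutePath() + ": " + describe(path));
    }
}
```

A result of "ok" only shows that the file is readable by that user. It does not confirm that the plugin parsed it, so a remaining NPE would point to a different cause.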

Citrullin commented 5 years ago

@StephenOTT I already checked this: I changed the file's owner to user camunda and group camunda, and I start the Camunda engine as user camunda, so that shouldn't be the issue. I tried again after setting the YAML file's permissions to 777 and still get the same NPE. Since the NPE is coming from the JobExecutor, I don't think it's an issue with the access rights:

    at io.digitalstate.camunda.prometheus.collectors.camunda.JobExecutor.collectJobSuccessful(JobExecutor.java:48)
    at io.digitalstate.camunda.prometheus.collectors.camunda.JobExecutor.collectAll(JobExecutor.java:209)
    at io.digitalstate.camunda.prometheus.collectors.camunda.JobExecutor$1.run(JobExecutor.java:24)

It probably has something to do with the different environment I am running it in.

StephenOTT commented 5 years ago

Given the older version of Camunda you are running, something likely changed.
Are you stuck with Camunda 9.0.x?

Citrullin commented 5 years ago

@StephenOTT 9.0.5 is the Tomcat version, not the Camunda version. We are running Camunda 7.9. Since 7.10 is the newest Camunda version, this shouldn't be an issue.

StephenOTT commented 5 years ago

Oops! Sorry, yes, I missed that. I will test against 7.10 and see where we get!

StephenOTT commented 5 years ago

@Citrullin can you provide some more details about this issue? I am not able to reproduce it.

7.10 appears to be functional as per the unit tests. Are you able to make this project's unit tests fail?

Citrullin commented 5 years ago

@StephenOTT I have a different workplace now. Not working with Camunda anymore. But I will email my ex-colleagues. :)

StephenOTT commented 4 years ago

Closing. Please reopen if you develop repeatable steps.