camunda / camunda

Distributed Workflow Engine for Microservices Orchestration
https://camunda.com/platform/
3.09k stars 562 forks source link

BpmnProcessingException: Expected to activate the none start event of the process but not found #7669

Open saig0 opened 2 years ago

saig0 commented 2 years ago

Describe the bug I deployed a process with a timer start event that triggers every 10 seconds. The process worked fine for some time. But at some point, I saw the following failures in the log:

io.camunda.zeebe.engine.processing.bpmn.BpmnProcessingException: 
Expected to activate the none start event of the process but not found. 
[context: {intent=ELEMENT_ACTIVATED, elementId=timer-event-process, bpmnElementType=PROCESS, 
elementInstanceKey=2251799987594468, flowScopeKey=-1, processInstanceKey=2251799987594468, 
parentProcessInstanceKey=-1, parentElementInstanceKey=-1, bpmnProcessId=timer-event-process, 
processVersion=2, processDefinitionKey=2251799813814592}]

Looking at Operate, it seems that the duration between two timers increases over time:

image

image

image

image

image

Until it takes minutes to trigger the next timer.

image

To Reproduce It is not clear what causes the problem.

The process has one timer start event with a timer cycle R/PT10S. See the BPMN (from our smoke-test project :wink:)

image

Expected behavior It creates a new instance of the process every 10 seconds.

Log/Stacktrace

Full Stacktrace

``` io.camunda.zeebe.engine.processing.bpmn.BpmnProcessingException: Expected to activate the none start event of the process but not found. [context: {intent=ELEMENT_ACTIVATED, elementId=timer-event-process, bpmnElementType=PROCESS, elementInstanceKey=2251799982896084, flowScopeKey=-1, processInstanceKey=2251799982896084, parentProcessInstanceKey=-1, parentElementInstanceKey=-1, bpmnProcessId=timer-event-process, processVersion=2, processDefinitionKey=2251799813814592}] at io.camunda.zeebe.engine.processing.bpmn.container.ProcessProcessor.activateNoneStartEvent(ProcessProcessor.java:116) ~[zeebe-workflow-engine-1.1.0.jar:1.1.0] at io.camunda.zeebe.engine.processing.bpmn.container.ProcessProcessor.lambda$activateStartEvent$7(ProcessProcessor.java:106) ~[zeebe-workflow-engine-1.1.0.jar:1.1.0] at java.util.Optional.ifPresentOrElse(Unknown Source) ~[?:?] at io.camunda.zeebe.engine.processing.bpmn.container.ProcessProcessor.activateStartEvent(ProcessProcessor.java:103) ~[zeebe-workflow-engine-1.1.0.jar:1.1.0] at io.camunda.zeebe.engine.processing.bpmn.container.ProcessProcessor.lambda$onActivate$2(ProcessProcessor.java:61) ~[zeebe-workflow-engine-1.1.0.jar:1.1.0] at io.camunda.zeebe.util.Either$Right.ifRightOrLeft(Either.java:238) ~[zeebe-util-1.1.0.jar:1.1.0] at io.camunda.zeebe.engine.processing.bpmn.container.ProcessProcessor.onActivate(ProcessProcessor.java:60) ~[zeebe-workflow-engine-1.1.0.jar:1.1.0] at io.camunda.zeebe.engine.processing.bpmn.container.ProcessProcessor.onActivate(ProcessProcessor.java:27) ~[zeebe-workflow-engine-1.1.0.jar:1.1.0] at io.camunda.zeebe.engine.processing.bpmn.BpmnStreamProcessor.lambda$processEvent$3(BpmnStreamProcessor.java:202) ~[zeebe-workflow-engine-1.1.0.jar:1.1.0] at io.camunda.zeebe.util.Either$Right.ifRightOrLeft(Either.java:238) ~[zeebe-util-1.1.0.jar:1.1.0] at io.camunda.zeebe.engine.processing.bpmn.BpmnStreamProcessor.processEvent(BpmnStreamProcessor.java:201) ~[zeebe-workflow-engine-1.1.0.jar:1.1.0] at io.camunda.zeebe.engine.processing.bpmn.BpmnStreamProcessor.lambda$processRecord$1(BpmnStreamProcessor.java:133) ~[zeebe-workflow-engine-1.1.0.jar:1.1.0] at io.camunda.zeebe.util.Either$Right.ifRightOrLeft(Either.java:238) ~[zeebe-util-1.1.0.jar:1.1.0] at io.camunda.zeebe.engine.processing.bpmn.BpmnStreamProcessor.processRecord(BpmnStreamProcessor.java:130) ~[zeebe-workflow-engine-1.1.0.jar:1.1.0] at io.camunda.zeebe.engine.processing.streamprocessor.TypedRecordProcessor.processRecord(TypedRecordProcessor.java:54) ~[zeebe-workflow-engine-1.1.0.jar:1.1.0] at io.camunda.zeebe.engine.processing.streamprocessor.ProcessingStateMachine.lambda$processInTransaction$3(ProcessingStateMachine.java:295) ~[zeebe-workflow-engine-1.1.0.jar:1.1.0] at io.camunda.zeebe.db.impl.rocksdb.transaction.ZeebeTransaction.run(ZeebeTransaction.java:84) ~[zeebe-db-1.1.0.jar:1.1.0] at io.camunda.zeebe.engine.processing.streamprocessor.ProcessingStateMachine.processInTransaction(ProcessingStateMachine.java:285) ~[zeebe-workflow-engine-1.1.0.jar:1.1.0] at io.camunda.zeebe.engine.processing.streamprocessor.ProcessingStateMachine.processEvent(ProcessingStateMachine.java:253) ~[zeebe-workflow-engine-1.1.0.jar:1.1.0] at io.camunda.zeebe.engine.processing.streamprocessor.ProcessingStateMachine.tryToReadNextEvent(ProcessingStateMachine.java:219) ~[zeebe-workflow-engine-1.1.0.jar:1.1.0] at io.camunda.zeebe.engine.processing.streamprocessor.ProcessingStateMachine.readNextEvent(ProcessingStateMachine.java:210) ~[zeebe-workflow-engine-1.1.0.jar:1.1.0] at io.camunda.zeebe.util.sched.ActorJob.invoke(ActorJob.java:73) [zeebe-util-1.1.0.jar:1.1.0] at io.camunda.zeebe.util.sched.ActorJob.execute(ActorJob.java:39) [zeebe-util-1.1.0.jar:1.1.0] at io.camunda.zeebe.util.sched.ActorTask.execute(ActorTask.java:122) [zeebe-util-1.1.0.jar:1.1.0] at io.camunda.zeebe.util.sched.ActorThread.executeCurrentTask(ActorThread.java:94) [zeebe-util-1.1.0.jar:1.1.0] at io.camunda.zeebe.util.sched.ActorThread.doWork(ActorThread.java:78) [zeebe-util-1.1.0.jar:1.1.0] at io.camunda.zeebe.util.sched.ActorThread.run(ActorThread.java:191) [zeebe-util-1.1.0.jar:1.1.0] ```

Error Reporing Tool

Log Entry

Grafana Dashboard

Camunda Cloud Cluster-Id: 29c3ebf3-393e-447b-b739-5a56e64e74b5

Environment:

korthout commented 1 year ago

Marking priority as later because improving timer behavior is not our main priority right now.

Please comment if you think this should have a higher priority.

korthout commented 1 year ago

Also, marking severity as high because there is no known workaround

saig0 commented 12 months ago

I saw this failure 14 times in your benchmark zeebe:medic-y-2023-cw-26-6750100-benchmark-mixed for the process msg_one_task.

korthout commented 12 months ago

@saig0 Can you elaborate? Where did you see this failure?

saig0 commented 11 months ago

@korthout the failure was reported in the error reporting tool here. It was caused by our Zeebe benchmark on running the msg_one_task.bpmn process.

Stack trace:

io.camunda.zeebe.engine.processing.bpmn.BpmnProcessingException: Expected to activate the none start event of the process but not found. [context: {intent=ELEMENT_ACTIVATED, elementId=msg_one_task, bpmnElementType=PROCESS, elementInstanceKey=2251799826058417, flowScopeKey=-1, processInstanceKey=2251799826058417, parentProcessInstanceKey=-1, parentElementInstanceKey=-1, bpmnProcessId=msg_one_task, processVersion=1, processDefinitionKey=2251799813686384}]
    at io.camunda.zeebe.engine.processing.bpmn.container.ProcessProcessor.activateNoneStartEvent(ProcessProcessor.java:119) ~[zeebe-workflow-engine-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.engine.processing.bpmn.container.ProcessProcessor.lambda$activateStartEvent$5(ProcessProcessor.java:109) ~[zeebe-workflow-engine-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at java.util.Optional.ifPresentOrElse(Unknown Source) ~[?:?]
    at io.camunda.zeebe.engine.processing.bpmn.container.ProcessProcessor.activateStartEvent(ProcessProcessor.java:106) ~[zeebe-workflow-engine-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.engine.processing.bpmn.container.ProcessProcessor.onActivate(ProcessProcessor.java:59) ~[zeebe-workflow-engine-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.engine.processing.bpmn.container.ProcessProcessor.onActivate(ProcessProcessor.java:27) ~[zeebe-workflow-engine-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.engine.processing.bpmn.BpmnStreamProcessor.lambda$processEvent$2(BpmnStreamProcessor.java:102) ~[zeebe-workflow-engine-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.util.Either$Right.ifRightOrLeft(Either.java:381) ~[zeebe-util-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.engine.processing.bpmn.BpmnStreamProcessor.processEvent(BpmnStreamProcessor.java:101) ~[zeebe-workflow-engine-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.engine.processing.bpmn.BpmnStreamProcessor.lambda$processRecord$0(BpmnStreamProcessor.java:84) ~[zeebe-workflow-engine-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.util.Either$Right.ifRightOrLeft(Either.java:381) ~[zeebe-util-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.engine.processing.bpmn.BpmnStreamProcessor.processRecord(BpmnStreamProcessor.java:81) ~[zeebe-workflow-engine-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.engine.Engine.process(Engine.java:142) ~[zeebe-workflow-engine-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.stream.impl.ProcessingStateMachine.batchProcessing(ProcessingStateMachine.java:347) ~[zeebe-stream-platform-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.stream.impl.ProcessingStateMachine.lambda$processCommand$2(ProcessingStateMachine.java:268) ~[zeebe-stream-platform-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.db.impl.rocksdb.transaction.ZeebeTransaction.run(ZeebeTransaction.java:98) ~[zeebe-db-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.stream.impl.ProcessingStateMachine.processCommand(ProcessingStateMachine.java:268) ~[zeebe-stream-platform-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.stream.impl.ProcessingStateMachine.tryToReadNextRecord(ProcessingStateMachine.java:227) ~[zeebe-stream-platform-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.stream.impl.ProcessingStateMachine.readNextRecord(ProcessingStateMachine.java:203) ~[zeebe-stream-platform-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.scheduler.ActorJob.invoke(ActorJob.java:92) [zeebe-scheduler-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.scheduler.ActorJob.execute(ActorJob.java:45) [zeebe-scheduler-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.scheduler.ActorTask.execute(ActorTask.java:119) [zeebe-scheduler-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.scheduler.ActorThread.executeCurrentTask(ActorThread.java:106) [zeebe-scheduler-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.scheduler.ActorThread.doWork(ActorThread.java:87) [zeebe-scheduler-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT]
    at io.camunda.zeebe.scheduler.ActorThread.run(ActorThread.java:203) [zeebe-scheduler-8.3.0-SNAPSHOT.jar:8.3.0-SNAPSHOT] 
korthout commented 11 months ago

Encountered on production (error). I'm increasing the priority to upcoming, so we can plan fixing it for the next iteration or the one after.

I've stored the logs here.

korthout commented 10 months ago

Depending on the maxCommandsInBatch setting, the publish message command is rejected or not:

korthout commented 10 months ago

Root cause:

https://github.com/camunda/zeebe/blob/c613d577c00da92baf3a800820633bc5a8d621f3/engine/src/main/java/io/camunda/zeebe/engine/processing/bpmn/container/ProcessProcessor.java#L100-L109

korthout commented 10 months ago

How this situation can actually occur is unclear. The only EventTriggers that are created in the processDefinitionKey's eventscope are the message/signal/timer start events.

aleksander-dytko commented 1 month ago

ZPA Planning: