sartography / spiff-arena

SpiffWorkflow is a software development platform for building, running, and monitoring executable diagrams
https://www.spiffworkflow.org/
GNU Lesser General Public License v2.1
62 stars 42 forks source link

Handling of BPMN Error Events #2037

Open essweine opened 1 month ago

essweine commented 1 month ago

BPMN Error Events should be handled like exceptions: they should propogate up the call activity/subprocess stack and if are not caught at the top level, caught by the engine. The instance would be terminated by default, but we might offer the option of suspending the instance instead.

To get this working as expected, we would need to

calexh-sar commented 2 weeks ago

@madhurrya would you please get with Jon to discuss testing this issue.

madhurrya commented 2 weeks ago

@calexh-sar Sure will do. Is it 'ready for QA'?

madhurrya commented 2 weeks ago

@burnettk please let me know when this PR (and the library updates) are deployed to somewhere I can test. https://github.com/sartography/spiff-arena/pull/2058

madhurrya commented 2 weeks ago
  1. Call activity related issue

This model shows the unhandled Error as expected. https://mysql.spiffcrm.com/process-models/misc:qa:bpmn-model-testing:unit-test-call-activity-with-error-end-event

But this doesn't https://mysql.spiffcrm.com/process-models/misc:call-activity-error-end-event:level-2 and this too doesn't https://mysql.spiffcrm.com/process-models/misc:qa:bpmn-model-testing:unit-test-call-activity-with-error-boundary-event

I am not sure why. Is it because the first one is inside a sub process and these two are not? But I don't think the error end events has to be inside a sub process always.

madhurrya commented 2 weeks ago
  1. Expanded sub process related issue This models shows the unhandled Error as expected. https://mysql.spiffcrm.com/process-models/misc:qa:bpmn-model-testing:unit-test-error-end-event-3

But this doesn't https://mysql.spiffcrm.com/process-models/misc:qa:bpmn-model-testing:unit-test-expanded-sub-process-with-error-end-event

It seems the End event instruction is causing this issue. If I remove it, it seems to show the error.

Update : it seems the same happens with collapsed sub processes also if there are end event instructions

madhurrya commented 2 weeks ago
  1. Call activity related issue

In both these models I have called the same error end event model. This one shows the unhandled Error https://mysql.spiffcrm.com/editor/process-models/misc:qa:bpmn-model-testing:unit-test-call-activity-with-error-end-event/files/unit-test-call-activity-with-error-end-event.bpmn

But this one doesn't https://mysql.spiffcrm.com/editor/process-models/misc:qa:call-activity-tests:call-activity-level-2/files/call-activity-level-2.bpmn

madhurrya commented 1 week ago

@jbirddog the above issues are fixed now. All of them show the unhandled Error now.

But noticed this behavior.

The ones that were working as expected earlier, when we run them after displaying the error those instances completes. e.g. https://mysql.spiffcrm.com/i/477 image

https://mysql.spiffcrm.com/i/480 image

https://mysql.spiffcrm.com/i/482 image

But the ones that were not working last time, they show the error message now. But those instances don't get completed. They seem to be stuck at the next task in yellow. e.g https://mysql.spiffcrm.com/i/478 image

https://mysql.spiffcrm.com/i/479 image

https://mysql.spiffcrm.com/i/481 image

https://mysql.spiffcrm.com/i/483 image

It looks odd and I feel this has to be fixed.

madhurrya commented 1 week ago

@jbirddog When I run a Multi instance sub process with an error endevent, it shows the error at the first instance and it doesn't seem to run the other instances. It seems to be stuck there like the above instances https://mysql.spiffcrm.com/i/493 image

When I ran that model before the current fix, it ran all the instances and showed the error at the end.

I am not sure what the exact behavior should be for multi instances.

jbirddog commented 1 week ago

This is due to the bug fix - before we were waiting until the end to see if there were any unhandled events, but this is not really correct. For instance if there was an unhandled event then 10 human tasks, we shouldn't wait until the end to try and show it. Now with the fix as soon as Spiff tells us there is an unhandled event in the workflow we error out. This is much closer to what you would expect from an uncaught exception in traditional programming.

madhurrya commented 1 week ago

@jbirddog Ok thanks for the explanation. But I am not clear why it completes the first 3 instances I have shared above and the next 3 similar to them are in yellow. If we stop at the point it failed, then should the next task become active?