spring-projects / spring-statemachine

Spring Statemachine is a framework for application developers to use state machine concepts with Spring.

stateMachine.stop() does not stop sub state machine #478

Open ymyfrank opened 6 years ago

ymyfrank commented 6 years ago

Release: 1.2.7

I use Papyrus neo to design the state machine definition.
The expression language used is SpEL.
State machine instances are pooled.
Jobs run concurrently (5 jobs per second), each represented by a state machine context.
There are 1 main state machine and 4 sub state machines defined in the UML model.

Problem found: an action defined on a state in a sub state machine is sometimes not executed.

Root cause, from my point of view: stateMachine.stop() does not stop the sub state machines (when the state machine restores a job/state machine context).

Evidence: I have to hack the code as shown below to solve the problem (stopping all sub state machines before restoring the job/state machine context):

  1. Declare a field holding a function that stops a state machine:

        // Casts each StateMachineAccess back to a StateMachine and stops it.
        private StateMachineFunction<StateMachineAccess<String, String>> stopStateMachineFunction =
                new StateMachineFunction<StateMachineAccess<String, String>>() {
            @Override
            public void apply(StateMachineAccess<String, String> function) {
                StateMachine<?, ?> sm = (StateMachine<?, ?>) function;
                sm.stop();
            }
        };

  2. Stop all sub state machines before restoring the state machine context:

        // get pooled statemachine instance
        stateMachine = (StateMachine<String, String>) poolTargetSource.getTarget();

        // apply the stop function to every region/sub state machine
        stateMachine.getStateMachineAccessor().doWithAllRegions(stopStateMachineFunction);

This issue was found during our performance testing.
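To make the suspected root cause easier to discuss, here is a minimal, library-independent sketch. It is only a toy model: `ToyMachine`, `stopAll` and `anyRunning` are hypothetical names and are not Spring Statemachine classes or methods. It assumes, as this report suspects, that `stop()` affects only the machine it is called on, and shows how a depth-first stop (the same idea as the `doWithAllRegions` workaround above) leaves no nested machine running.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy model of nested state machines; NOT a Spring Statemachine class. */
final class ToyMachine {
    private final List<ToyMachine> subMachines = new ArrayList<>();
    private boolean running;

    /** Registers a nested machine and returns this machine for chaining. */
    ToyMachine addSubMachine(ToyMachine sub) {
        subMachines.add(sub);
        return this;
    }

    /** Starts this machine and every nested machine. */
    void start() {
        running = true;
        for (ToyMachine sub : subMachines) {
            sub.start();
        }
    }

    /** Models the behaviour suspected in the report: only this machine is stopped. */
    void stop() {
        running = false;
    }

    /** Workaround idea: stop nested machines depth-first, then this machine. */
    void stopAll() {
        for (ToyMachine sub : subMachines) {
            sub.stopAll();
        }
        stop();
    }

    boolean isRunning() {
        return running;
    }

    /** Returns true if this machine or any nested machine is still running. */
    boolean anyRunning() {
        if (running) {
            return true;
        }
        for (ToyMachine sub : subMachines) {
            if (sub.anyRunning()) {
                return true;
            }
        }
        return false;
    }
}
```

In this model, calling `stop()` on the main machine leaves its sub machine running, so a pooled instance handed to the next job could still carry stale sub machine state. Calling `stopAll()` before restoring a context avoids that. Whether the real library behaves this way is exactly what this issue asks to have confirmed.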

ymyfrank commented 6 years ago

Attached is the state machine UML we used: model.zip

ymyfrank commented 6 years ago

I would like this issue to be fixed so that I can remove that workaround code.

ymyfrank commented 6 years ago

I cannot send you all my test scripts or the deployed target system under test, because they are quite complex. Instead, here are my reproduction steps; maybe they can help.

Steps to reproduce with the attached state machine definition (model.zip):

  1. Run 100 jobs in total, each with a unique context stored in memory/Redis.
  2. Have each job perform only its first 2 steps (for simplicity, do not run all the steps needed to finish the job).
  3. Run 5 jobs per second.

Symptom: the entry action of the first state "GetSensusNodes" in the sub state machine "StateMachineCensus" is sometimes not called. If the problem does not reproduce, rerun the test above.
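The load pattern in these steps can be approximated with a small driver built on the JDK only. This is a hedged sketch, not the reporter's actual test script: `JobLoadDriver`, `run` and the pool size of 4 are assumptions made for illustration, and the job body (acquiring a pooled state machine, restoring the context, running the first 2 steps) has to be supplied by the caller.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntConsumer;

/** Hypothetical load driver for the reproduction steps; names are illustrative. */
final class JobLoadDriver {

    /**
     * Schedules totalJobs jobs at a fixed rate of jobsPerSecond and waits for all of them.
     * The job receives its index (0-based). Returns how many jobs threw a RuntimeException.
     */
    static int run(int totalJobs, int jobsPerSecond, IntConsumer job) {
        // Pool size 4 is an arbitrary assumption so that jobs can overlap.
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(4);
        CountDownLatch done = new CountDownLatch(totalJobs);
        AtomicInteger failures = new AtomicInteger();
        long periodMicros = 1_000_000L / jobsPerSecond;
        try {
            for (int i = 0; i < totalJobs; i++) {
                final int jobId = i;
                // Job i is started i / jobsPerSecond seconds after the run begins.
                scheduler.schedule(() -> {
                    try {
                        job.accept(jobId);
                    } catch (RuntimeException e) {
                        failures.incrementAndGet();
                    } finally {
                        done.countDown();
                    }
                }, i * periodMicros, TimeUnit.MICROSECONDS);
            }
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for jobs", e);
        } finally {
            scheduler.shutdown();
        }
        return failures.get();
    }
}
```

A run matching the steps above would look like `JobLoadDriver.run(100, 5, jobId -> runFirstTwoSteps(jobId))`, where `runFirstTwoSteps` is a hypothetical method that throws if the entry action of "GetSensusNodes" was not called. A non-zero return value would then indicate that the problem was reproduced.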