qmuntal / stateless

Go library for creating finite state machines
BSD 2-Clause "Simplified" License
942 stars 49 forks source link

Panic when firing trigger #12

Closed rickardgranberg closed 4 years ago

rickardgranberg commented 4 years ago

We're seeing an intermittent problem with a panic when a trigger is fired from a callback (in a separate goroutine) The call stack is as follows:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x6c78a1]

goroutine 711 [running]:
container/list.(*List).insert(...)
        /usr/local/go/src/container/list/list.go:96
container/list.(*List).insertValue(...)
        /usr/local/go/src/container/list/list.go:104
container/list.(*List).PushBack(...)
        /usr/local/go/src/container/list/list.go:155
github.com/qmuntal/stateless.(*StateMachine).internalFireQueued(0xc00004c4e0, 0xebfb40, 0xc00019e010, 0xc589e0, 0xe9cd30, 0x0, 0x0, 0x0, 0x0, 0x0)
        /go/pkg/mod/github.com/qmuntal/stateless@v1.1.3/statemachine.go:296 +0x4c1
github.com/qmuntal/stateless.(*StateMachine).internalFire(0xc00004c4e0, 0xebfb40, 0xc00019e010, 0xc589e0, 0xe9cd30, 0x0, 0x0, 0x0, 0x0, 0x0)
        /go/pkg/mod/github.com/qmuntal/stateless@v1.1.3/statemachine.go:289 +0x89
github.com/qmuntal/stateless.(*StateMachine).FireCtx(...)
        /go/pkg/mod/github.com/qmuntal/stateless@v1.1.3/statemachine.go:227
github.com/qmuntal/stateless.(*StateMachine).Fire(...)
        /go/pkg/mod/github.com/qmuntal/stateless@v1.1.3/statemachine.go:220

... our code omitted...

We've been seeing this on and off when we run our unit tests so I decided to investigate. I believe theres a concurrency issue in the internalFireQueued function.

Here's my analysis: Our code is running some code in the OnEntry func as it has just transitioned to a new state. The callback mentioned above fires a new trigger, which is adding the next trigger to the sm.eventQueue in https://github.com/qmuntal/stateless/blob/3ead642f6c11905eb551a5076e4b40169fba3626/statemachine.go#L296

At the same time as the new trigger is being added, the OnEntry func returns, and the internalFireQueued resumes execution on https://github.com/qmuntal/stateless/blob/3ead642f6c11905eb551a5076e4b40169fba3626/statemachine.go#L308

The problematic lines of code at this point are:

https://github.com/qmuntal/stateless/blob/3ead642f6c11905eb551a5076e4b40169fba3626/statemachine.go#L311 https://github.com/qmuntal/stateless/blob/3ead642f6c11905eb551a5076e4b40169fba3626/statemachine.go#L312 https://github.com/qmuntal/stateless/blob/3ead642f6c11905eb551a5076e4b40169fba3626/statemachine.go#L318

All of these is accessing the sm.eventQueue which is being mutated by the callback on line https://github.com/qmuntal/stateless/blob/3ead642f6c11905eb551a5076e4b40169fba3626/statemachine.go#L296

Any access to the sm.eventQueue must be done within the scope of sm.firingMutex to ensure "thread" safety.

rickardgranberg commented 4 years ago

With the changes in PR #13, we're not seeing the issue any more

qmuntal commented 4 years ago

Fixed in v1.1.4