qmuntal / stateless

Go library for creating finite state machines
BSD 2-Clause "Simplified" License
898 stars 47 forks source link

Optimizing findHandler #64

Closed k-raval closed 10 months ago

k-raval commented 10 months ago

Hi,

We are extensively using this FSM library in our codebase and thanks for the same.

While profiling the code, we found that the findHandler function allocates lot of objects on heap and which later get recycled by garbage collector, however that takes CPU cycles and affects the overall performance.

(pprof) list findHandler
Total: 345800393
ROUTINE ======================== github.com/qmuntal/stateless.(*stateRepresentation).findHandler in mobility/vendor/github.com/qmuntal/stateless/states.go
  16884695   16884695 (flat, cum)  4.88% of Total
         .          .     76:       possibleBehaviours []triggerBehaviour
         .          .     77:   )
         .          .     78:   if possibleBehaviours, ok = sr.TriggerBehaviours[trigger]; !ok {
         .          .     79:       return
         .          .     80:   }
   5598416    5598416     81:   allResults := make([]triggerBehaviourResult, 0, len(possibleBehaviours))
         .          .     82:   for _, behaviour := range possibleBehaviours {
         .          .     83:       allResults = append(allResults, triggerBehaviourResult{
         .          .     84:           Handler:              behaviour,
         .          .     85:           UnmetGuardConditions: behaviour.UnmetGuardConditions(ctx, args...),
         .          .     86:       })
         .          .     87:   }
   5624428    5624428     88:   metResults := make([]triggerBehaviourResult, 0, len(allResults))
   5661851    5661851     89:   unmetResults := make([]triggerBehaviourResult, 0, len(allResults))
         .          .     90:   for _, result := range allResults {
         .          .     91:       if len(result.UnmetGuardConditions) == 0 {
         .          .     92:           metResults = append(metResults, result)
         .          .     93:       } else {
         .          .     94:           unmetResults = append(unmetResults, result)

I think for each trigger, all the possible behaviours are known during configuration (i.e. call to Configure/PermitDynamic/....). So can we not populate what is required once and use it every time during findHandler.

The other two slices (metResults, unmetResults) can also be pre-allocated/pre-populated during configuration.

Let us know as it will save a lot of memory allocations and thereby improve performance.

Thanks.

qmuntal commented 10 months ago

Thanks for reporting. I'm always happy to improve perf bottlenecks. I'll give it a shot.

k-raval commented 10 months ago

Thanks qmuntal for quick response and quick fix as well.

Your fix is working as expected. Now the 'findHandler' function is not in the top allocator list.

As soon as you merge and create a new release tag, we will be able to take it in.

qmuntal commented 10 months ago

Fixed in https://github.com/qmuntal/stateless/releases/tag/v1.6.6.