Open Spikhalskiy opened 2 years ago
It was implemented this way by design in #236. The original intent needs to be revisited, though, because not resetting the sticky queue when workflows are evicted from already overwhelmed workers doesn't make much sense.
The situation improved with this Server change: https://github.com/temporalio/temporal/pull/2811. Now, if the sticky queue is obviously abandoned, the Server will not wait 5 seconds trying to dispatch the workflow task to the sticky queue.
Right now we reset the sticky queue if an exception happens during workflow execution. This reset is not needed, since there is nothing wrong with re-executing on the same worker. At the same time, we do not reset the sticky queue when a workflow is evicted from the cache because the SDK is at its workflow thread limit. This puts extra pressure on already overloaded workers and can lead to increased delays.
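To make the difference concrete, here is a minimal sketch of the two policies as I read them from the description above. Everything in it is hypothetical: `StickyResetPolicy`, `EvictionReason`, and both methods are illustrative names, not part of the Temporal Java SDK, and the real eviction paths may distinguish more cases than these two.

```java
// Hypothetical sketch only; none of these names exist in the Temporal Java SDK.
final class StickyResetPolicy {

    /** Illustrative reasons a workflow run leaves the worker cache. */
    enum EvictionReason {
        EXECUTION_EXCEPTION,   // an exception happened during workflow execution
        THREAD_LIMIT_REACHED   // the worker hit its workflow thread limit
    }

    /** Behavior described as current: reset only after an execution exception. */
    static boolean currentShouldReset(EvictionReason reason) {
        return reason == EvictionReason.EXECUTION_EXCEPTION;
    }

    /**
     * Behavior this issue argues for (an assumption about the intended fix):
     * reset when the worker is overloaded, so the next workflow task can go
     * to another worker, and keep stickiness after a plain exception.
     */
    static boolean proposedShouldReset(EvictionReason reason) {
        return reason == EvictionReason.THREAD_LIMIT_REACHED;
    }

    private StickyResetPolicy() {}
}
```

With this sketch, the two policies give opposite answers for both eviction reasons, which is the mismatch the issue is about.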