temporalio / sdk-java

Temporal Java SDK
https://temporal.io
Apache License 2.0

Workers don't reset sticky queue when workflow execution is evicted from the cache #883

Open Spikhalskiy opened 2 years ago

Spikhalskiy commented 2 years ago

Right now we reset the sticky queue if an exception happens during workflow execution. That reset is not really needed, since there is nothing wrong with re-executing on the same worker. At the same time, we do not reset the sticky queue when a workflow gets evicted from the cache because the SDK is at the workflow thread limit. The Server keeps dispatching workflow tasks for the evicted execution to a worker that is already overloaded, which adds pressure on that worker and can lead to increased delays.
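To make the proposed behavior concrete, here is a minimal sketch of a cache that resets the sticky queue on eviction. It is a toy model, not the SDK's actual cache: `StickyResetSketch`, `StickyQueueResetter`, and `WorkflowCache` are hypothetical names, the capacity stands in for the workflow thread limit, and the callback stands in for whatever call would tell the Server to stop using the sticky queue for that run.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class StickyResetSketch {

    /** Hypothetical hook: asks the server to stop routing this run to the sticky queue. */
    public interface StickyQueueResetter {
        void resetStickyQueue(String runId);
    }

    /** Toy bounded cache; capacity stands in for the workflow thread limit. */
    public static class WorkflowCache {
        private final int capacity;
        private final StickyQueueResetter resetter;
        private final Deque<String> cachedRuns = new ArrayDeque<>();

        public WorkflowCache(int capacity, StickyQueueResetter resetter) {
            if (capacity < 1) {
                throw new IllegalArgumentException("capacity must be at least 1");
            }
            this.capacity = capacity;
            this.resetter = resetter;
        }

        public void put(String runId) {
            // If the run is already cached, drop it so it is re-added as most recent.
            cachedRuns.remove(runId);
            // Evict the oldest runs until there is room for the new one.
            while (cachedRuns.size() >= capacity) {
                String evicted = cachedRuns.removeFirst();
                // The step this issue asks for: reset the sticky queue on eviction,
                // so the next workflow task can go to any worker.
                resetter.resetStickyQueue(evicted);
            }
            cachedRuns.addLast(runId);
        }

        public boolean contains(String runId) {
            return cachedRuns.contains(runId);
        }
    }

    public static void main(String[] args) {
        List<String> resets = new ArrayList<>();
        WorkflowCache cache = new WorkflowCache(2, resets::add);
        cache.put("run-1");
        cache.put("run-2");
        cache.put("run-3"); // cache is full, so run-1 is evicted and reset
        System.out.println(resets); // prints [run-1]
    }
}
```

In this sketch the reset is issued synchronously inside `put`. A real implementation would probably want to issue it off the hot path, since the worker is already under load when evictions happen.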

Spikhalskiy commented 2 years ago

It was implemented like this by design in #236. The original intentions need to be revisited, though, because not resetting the sticky queue on evictions from already overwhelmed workers doesn't make much sense.

Spikhalskiy commented 2 years ago

The situation improved with this Server change: https://github.com/temporalio/temporal/pull/2811. Now, if the sticky queue is obviously abandoned, the Server will not wait 5 seconds trying to dispatch the workflow task into the sticky queue.