temporalio / sdk-core

Core Temporal SDK that can be used as a base for language specific Temporal SDKs
MIT License

[Bug] Core should not keep retrying local activities indefinitely after worker shutdown has been requested #709

Open bergundy opened 5 months ago

bergundy commented 5 months ago

Today, Core automatically retries local activities even after worker shutdown has been requested. In addition, all local activities are cancelled once the shutdown grace period has expired.

Together, these two behaviors can very easily put the worker in a loop in which activity attempts are exhausted very quickly, preventing the worker from shutting down.

It would be better to stop scheduling activity retries once worker shutdown has been requested. Ideally, progress wouldn't be lost and Core would store progress information in a marker. As a first step, though, failing the workflow task and letting the server schedule a retry on the next available worker is preferable to the current behavior.
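
To make the proposal concrete, here is a rough sketch of the retry decision with a shutdown check added. It is not Core's actual code: `ShutdownFlag`, `RetryDecision`, and `decide_retry` are hypothetical names invented for illustration, and the exponential backoff is just an assumed policy. The only point is that the shutdown check runs before any retry is scheduled.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::Duration;

/// Hypothetical outcome of handling a failed local activity attempt.
#[derive(Debug, PartialEq)]
pub enum RetryDecision {
    /// Schedule another attempt after the given backoff.
    RetryAfter(Duration),
    /// The attempt budget is used up: report the activity as failed.
    Exhausted,
    /// Shutdown was requested: fail the workflow task so the server
    /// can schedule a retry on the next available worker.
    FailWorkflowTask,
}

/// Hypothetical flag that is set once worker shutdown has been requested.
pub struct ShutdownFlag(AtomicBool);

impl ShutdownFlag {
    pub fn new() -> Self {
        Self(AtomicBool::new(false))
    }

    pub fn request(&self) {
        self.0.store(true, Ordering::SeqCst);
    }

    pub fn is_requested(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// Decide what to do after attempt number `attempt` (1-based) has failed.
pub fn decide_retry(
    shutdown: &ShutdownFlag,
    attempt: u32,
    max_attempts: u32,
    initial_backoff: Duration,
) -> RetryDecision {
    // Check shutdown first so no attempts are burned after it is requested.
    if shutdown.is_requested() {
        return RetryDecision::FailWorkflowTask;
    }
    if attempt >= max_attempts {
        return RetryDecision::Exhausted;
    }
    // Assumed policy: backoff doubles with each failed attempt.
    let factor = 2u32.saturating_pow(attempt.saturating_sub(1));
    RetryDecision::RetryAfter(initial_backoff.saturating_mul(factor))
}
```

With this ordering, a failure that happens after shutdown has been requested never consumes another attempt, so the fast exhaustion loop described above cannot start. Storing progress in a marker would be a separate, later step.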