dhiaayachi / temporal

Temporal service
https://docs.temporal.io
MIT License

Do not let API call timeout if workflow can't be locked #383

Open dhiaayachi opened 1 month ago

dhiaayachi commented 1 month ago

Currently, the entire context timeout can be spent trying to lock the workflow before any operation is performed. If the workflow is very busy and can't be locked within the given context timeout, the caller sees a context deadline exceeded error and has no clue why the API call timed out.

We should return early with a special error type (or maybe just resource exhausted with a workflow busy cause?) if the workflow cannot be locked.

Then the user latency calculation can also function properly across API calls.

dhiaayachi commented 4 weeks ago

Thank you for reporting this issue!

We understand the concern about the lack of clarity when a workflow fails to lock due to being too busy. Returning a more specific error code like "Resource Exhausted" with a "workflow busy" cause would be helpful for users to identify and troubleshoot the problem.

Currently, we don't have a specific error type for this scenario. However, you could potentially implement a workaround by introducing a custom error type within your workflow code. This error type could be raised if locking fails due to context timeout and would provide more specific information to the caller.

We will consider adding a dedicated error type for this scenario in future releases.

dhiaayachi commented 4 weeks ago

Thank you for reporting this issue! This is definitely something we want to address.

We understand the current behavior can make it challenging to troubleshoot workflow lock timeouts. We are actively exploring options to improve the error messaging in this scenario.

In the meantime, you can try these workarounds:

We'll keep you updated on the progress of addressing this issue.