[Bug] Long poll requests timing out after 20 seconds

temporalio / xk6-temporal

k6 Extension for testing/benchmarking Temporal

MIT License

[Bug] Long poll requests timing out after 20 seconds #4

Closed robholland closed 2 years ago

robholland commented 2 years ago

~~Long poll requests definitely last longer than this under load so something is capping the metric, guessing something related to Trend/Time metrics in k6 -> prometheus.~~

It seems the SDK is timing out GetWorkflowExecutionHistory requests after 20 seconds. I expected the limit to be 65 seconds, so I'm not sure what's going on here.

robholland commented 2 years ago

Upon further investigation it seems that this is a timeout being set on the requests by the SDK rather than a metric issue. I've not yet been able to find where this is coming from.

robholland commented 2 years ago

Specifically, I see that GetWorkflowExecutionHistory requests never last longer than 20 seconds.

robholland commented 2 years ago

The 20 second timeout is expected behaviour for get history requests. The performance stalls I saw were caused by low task poller counts, which meant the workers would often time out while waiting on empty task queue partitions.
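
For reference, poller counts on a Go SDK worker are set through `worker.Options`. The fragment below is a configuration sketch rather than the exact setup used for these benchmarks: the task queue name `benchmark` and the value 16 are illustrative assumptions, and the right numbers depend on how many partitions the task queue has and on the load. It needs the `go.temporal.io/sdk` module and a reachable Temporal server to run.

```go
package main

import (
	"log"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
)

func main() {
	// Connects to the default local frontend address; adjust client.Options as needed.
	c, err := client.Dial(client.Options{})
	if err != nil {
		log.Fatalf("unable to create Temporal client: %v", err)
	}
	defer c.Close()

	// "benchmark" is a hypothetical task queue name, and 16 is an illustrative
	// value, not a recommendation. More pollers make it less likely that every
	// poller is parked on an empty partition while tasks wait on another one.
	w := worker.New(c, "benchmark", worker.Options{
		MaxConcurrentWorkflowTaskPollers: 16,
		MaxConcurrentActivityTaskPollers: 16,
	})

	// Workflows and activities would be registered here before starting.
	if err := w.Run(worker.InterruptCh()); err != nil {
		log.Fatalf("worker exited: %v", err)
	}
}
```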