uber-go / cadence-client

Framework for authoring workflows and activities running on top of the Cadence orchestration engine.
https://cadenceworkflow.io
MIT License

Handle panics while polling for tasks #1352

Closed natemort closed 2 days ago

natemort commented 3 days ago

The goroutines started in doPoll execute the logic that makes the poll RPCs, along with any layers wrapping them. Notably, the proto conversion logic panics on unexpected values, and if these panics go unhandled the application crashes.
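As an illustration of the general technique (not the code in this PR), here is a minimal sketch of converting a panic inside a poll call into an ordinary error. The names `pollFunc` and `safePoll` are hypothetical and do not come from cadence-client.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// pollFunc is a hypothetical stand-in for one poll RPC plus its wrapping
// layers (for example, proto conversion). It is not a cadence-client type.
type pollFunc func() (interface{}, error)

// safePoll runs poll and converts any panic into an error, so the goroutine
// that called it keeps running and the caller can decide whether to retry.
func safePoll(poll pollFunc) (result interface{}, err error) {
	defer func() {
		if r := recover(); r != nil {
			// Discard any partial result and report the panic value
			// together with the stack trace captured during recovery.
			result = nil
			err = fmt.Errorf("panic while polling: %v\n%s", r, debug.Stack())
		}
	}()
	return poll()
}
```

Note that `recover` only takes effect inside a deferred function running on the goroutine that panicked, so a wrapper like this has to run inside each polling goroutine rather than in the code that starts them.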

What changed?

Why?

How did you test it?

Potential risks

These panics indicate a fundamental mismatch between client and server state, and once encountered the situation is unlikely to resolve without a change to the server or the client. It seems safer to retry the panicking requests indefinitely, in the hope that a server-side change resolves the issue, than to crash the worker. However, workers in this state will likely generate a higher RPS than they would otherwise. Cadence's rate limiting support should be able to mitigate that risk.

CLAassistant commented 3 days ago

CLA assistant check
All committers have signed the CLA.