Open cgillum opened 2 months ago
Could another workaround be to change the default serialization class and/or parameters?
By default, all DTFx async methods must support cancellation token as the last parameter. From what I noticed, it is a current implementation gap, i.e. bug.
Could another workaround be to change the default serialization class and/or parameters?
Yes, but this would be a breaking change under normal circumstances. It might be possible to add some customization to look for this case, specifically, and avoid serializing CancellationToken
if we see it, though I'm not sure how exactly it would need to be implemented.
By default, all DTFx async methods must support cancellation token as the last parameter. From what I noticed, it is a current implementation gap, i.e. bug.
If this was in reference to my point about removing CancellationToken
parameters from wrapped interfaces, I was referring to user-provided interfaces, not DTFx interfaces.
By default, all DTFx async methods must support cancellation token as the last parameter. From what I noticed, it is a current implementation gap, i.e. bug.
Do we have an ETA for when this will be fixed? This is hitting us bad in our production workloads and we are forced to restart orchestrators intermittently to work around this issue.
@mkraghavan what fix are you referring to? The process crash mentioned in this issue can be worked around by removing CancellationToken
from your list of parameters. Also, the sentence you're quoting is not related and not something that is going to be changed anytime soon.
There have been several reports of CLR crashes and process hangs when using DTFx, specifically when a call to an activity function, continue-as-new, etc. passes in a parameter that contains a reference to
CancellationToken
. This also happens whenOrchestrationContext.CreateClient<T>
is used and methods in T interface withCancellationToken
in their signatures.The root issue is that the serialization code used by DTFx will serialize
CancellationToken
objects in a way that can corrupt CLR memory, resulting in unexpected behavior.The mitigation is to remove
CancellationToken
parameters from interfaces that are wrapped usingOrchestrationContext.CreateClient<T>
or from argument values that are passed toScheduleTask*
,ContinueAsNew
, etc.