Closed: moldovangeorge closed this 2 years ago
I’ll need to investigate, but I think this might be expected for canceled timers. Canceling a timer doesn’t actually delete anything from the Durable store, it just tells the orchestration to not wait for it when transitioning into a completed state. I’ll be interested to know if they stay there forever or if they get deleted after their scheduled fire-time…
Looking at the code, I don't see how these events would ever be deleted (which is confirmed by our experience so far):

    -- We return the list of deleted messages so that the caller can issue a
    -- warning about missing messages
    DELETE E
    OUTPUT DELETED.InstanceID, DELETED.SequenceNumber
    FROM dt.NewEvents E WITH (FORCESEEK(PK_NewEvents(TaskHub, InstanceID, SequenceNumber)))
        INNER JOIN @DeletedEvents D ON
            D.InstanceID = E.InstanceID AND
            D.SequenceNumber = E.SequenceNumber AND
            E.TaskHub = @TaskHub

    -- Lock the first active instance that has pending messages.
    -- Delayed events from durable timers will have a non-null VisibleTime value.
    -- Non-active instances will never have their messages or history read.
    UPDATE TOP (1) Instances WITH (READPAST)
    SET
        [LockedBy] = @LockedBy,
        [LockExpiration] = @LockExpiration,
        @instanceID = I.[InstanceID],
        @parentInstanceID = I.[ParentInstanceID],
        @version = I.[Version]
    FROM
        dt.Instances I WITH (READPAST) INNER JOIN NewEvents E WITH (READPAST) ON
            E.[TaskHub] = @TaskHub AND
            E.[InstanceID] = I.[InstanceID]
    WHERE
        I.TaskHub = @TaskHub AND
        I.[RuntimeStatus] IN ('Pending', 'Running') AND
        (I.[LockExpiration] IS NULL OR I.[LockExpiration] < @now) AND
        (E.[VisibleTime] IS NULL OR E.[VisibleTime] < @now)

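So I think that any events still linked to a finished orchestration will remain un-deleted forever: the lock query only picks up instances whose RuntimeStatus is 'Pending' or 'Running', so messages belonging to finished instances are never read and therefore never end up in the list passed to the DELETE. Would a new purge procedure for cleaning up these orphaned events at deployment time be a good add-on?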
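To make the proposal concrete, here is a minimal sketch of what such a purge could look like. It is not the provider's code. It uses Python's built-in `sqlite3` as a stand-in for SQL Server, a heavily simplified version of the `Instances` and `NewEvents` tables (only the columns that appear in the snippets quoted above), and hypothetical helper names (`purge_orphan_events`, `build_demo_db`). It assumes that any event whose instance is no longer 'Pending' or 'Running' is safe to delete, which would need to be confirmed against the real schema and instance lifecycle.

```python
import sqlite3


def purge_orphan_events(conn, task_hub):
    """Delete NewEvents rows whose instance is no longer Pending or Running.

    Hypothetical helper: mirrors the RuntimeStatus filter of the lock query,
    but inverted, so it targets exactly the rows that query can never reach.
    Returns the number of deleted rows.
    """
    cur = conn.execute(
        """
        DELETE FROM NewEvents
        WHERE TaskHub = ?
          AND EXISTS (
              SELECT 1
              FROM Instances I
              WHERE I.TaskHub = NewEvents.TaskHub
                AND I.InstanceID = NewEvents.InstanceID
                AND I.RuntimeStatus NOT IN ('Pending', 'Running')
          )
        """,
        (task_hub,),
    )
    conn.commit()
    return cur.rowcount


def build_demo_db():
    """Create an in-memory database with a simplified, assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Instances (
            TaskHub TEXT, InstanceID TEXT, RuntimeStatus TEXT,
            PRIMARY KEY (TaskHub, InstanceID));
        CREATE TABLE NewEvents (
            TaskHub TEXT, InstanceID TEXT, SequenceNumber INTEGER,
            VisibleTime TEXT,
            PRIMARY KEY (TaskHub, InstanceID, SequenceNumber));
        """
    )
    conn.executemany(
        "INSERT INTO Instances VALUES (?, ?, ?)",
        [
            ("hub", "done-1", "Completed"),
            ("hub", "failed-1", "Failed"),
            ("hub", "live-1", "Running"),
        ],
    )
    # One leftover timer event per instance, scheduled in the future.
    conn.executemany(
        "INSERT INTO NewEvents VALUES (?, ?, ?, ?)",
        [
            ("hub", "done-1", 1, "2030-01-01T00:00:00"),
            ("hub", "failed-1", 1, "2030-01-01T00:00:00"),
            ("hub", "live-1", 1, "2030-01-01T00:00:00"),
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    print("deleted:", purge_orphan_events(conn, "hub"))
    print("remaining:", conn.execute("SELECT InstanceID FROM NewEvents").fetchall())
```

In this toy data set the events of the completed and failed instances are removed, while the event of the running instance is left alone. A real T-SQL version would presumably be a `DELETE ... FROM dt.NewEvents` joined to `dt.Instances` with the same status filter.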
Our application uses timers heavily: we have multiple async flows with a deadline, and we use timers to enforce that deadline while waiting for a certain service to call back and finish an action. We follow the best practices for working with timers and cancel a timer when we are no longer waiting for it (that is, when we receive the callback event before the deadline). One standard way of using timers looks something like this:
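As a generic illustration of the pattern described here (race a deadline timer against a callback event, then cancel whichever loses), the following sketch uses plain Python `asyncio`. It is not the Durable Task API and not the application's actual code. The function names are hypothetical, and the timer here lives only in memory, whereas a durable timer is persisted as a delayed message in the store.

```python
import asyncio


async def wait_for_callback_or_deadline(callback, deadline_seconds):
    """Wait for `callback` (an asyncio.Event) or a deadline, whichever comes first.

    Returns "callback" if the event was set in time, otherwise "timeout".
    """
    timer = asyncio.create_task(asyncio.sleep(deadline_seconds))
    waiter = asyncio.create_task(callback.wait())

    done, pending = await asyncio.wait(
        {timer, waiter}, return_when=asyncio.FIRST_COMPLETED
    )

    # Cancel the loser: the timer if the callback arrived first,
    # or the callback wait if the deadline elapsed first.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    return "callback" if waiter in done else "timeout"


async def demo():
    # Case 1: the callback arrives well before the deadline.
    fast = asyncio.Event()
    asyncio.get_running_loop().call_later(0.01, fast.set)
    first = await wait_for_callback_or_deadline(fast, 1.0)

    # Case 2: the callback never arrives, so the deadline wins.
    never = asyncio.Event()
    second = await wait_for_callback_or_deadline(never, 0.01)

    return [first, second]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the in-memory version, cancelling the timer releases it completely. The question in this thread is what happens to the persisted counterpart of that cancelled timer.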
After a run of a workflow, the timers remain un-deleted in the NewEvents table, even though the instances related to them are finished (Completed, Failed, etc). Is this by design, or is it something in the way we are using timers that generates this behavior?