Open ramonsmits opened 7 years ago
@ramonsmits I think it makes sense to have a shorter retention period for audits because they are, by definition, less critical. That said, the max retention for errors should probably be longer.
It's a shame John is no longer on the team, since he came up with those numbers and I don't remember the logic behind them.
Regarding keeping errors longer: the retention policy only applies to errors that have been resolved. So essentially, either the user has successfully retried the message, in which case we now have an audit record, or the user has marked it as ignored. In either case I'd say it's noise that shouldn't be kept long term.
@gbiellem ok, then +1 for keeping it as-is.
I do agree about the min value though - I can't think of a good reason for the ten days
Yeah, the reasoning behind the min was to do with the risk of getting it wrong. Not all systems have auditing switched on, so we can't say definitively that a retried message has been processed. With that in mind, the error retention policy does delete errors in the RetryIssued state.
Just because a Retry has been issued doesn't mean it went anywhere. Retried messages can be stuck in the outgoing queue on a machine, the endpoint they were retried to might have been decommissioned, or they can be sitting in a DLQ somewhere. It can take some time to figure that out, and the system has a "re-retry" option (and a redirects option).
The 10 day minimum was to enforce a window for people to discover that retried messages were stuck somewhere and do something about it. Any shorter and you run the risk that a message retried on a Friday is gone by the time someone comes in after the long weekend and realizes something is off.
Also, that's 10 days from when the error is retried, not 10 days from when the error occurred.
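To make the "clock starts at the retry" point concrete, here is a minimal sketch in Python. It is not the actual retention code: the function name, the `Resolved` and `Ignored` status names, and the `>=` comparison are assumptions for illustration. Only `RetryIssued` and the 10 day minimum come from this thread.

```python
from datetime import datetime, timedelta

# Only "RetryIssued" is named in this thread; the other two status names
# are hypothetical stand-ins for "successfully retried" and "marked as ignored".
EXPIRABLE_STATUSES = {"Resolved", "Ignored", "RetryIssued"}


def is_eligible_for_cleanup(
    status: str,
    retried_or_resolved_at: datetime,
    now: datetime,
    retention: timedelta = timedelta(days=10),
) -> bool:
    """Return True if an error may be deleted by the retention policy.

    The window is measured from when the error was retried/resolved,
    not from when the error originally occurred.
    """
    if status not in EXPIRABLE_STATUSES:
        # Unresolved errors are never touched by the retention policy.
        return False
    return now - retried_or_resolved_at >= retention
```

With the 10 day default, a message retried on a Friday evening is still there the following Tuesday morning. If `retention` were allowed to drop to 3 days, the same message would already be eligible for deletion by then, which is the long-weekend risk described above.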
Personally I'd rather see retention based on conversations, i.e. if a conversation hasn't received any new messages in the last 90 days, delete all messages associated with it.
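A rough sketch of what conversation-based retention could look like, in Python. Nothing like this exists in the product as far as this thread says: the function name and the `(conversation_id, message_id, received_at)` tuple shape are made up for illustration, and only the 90 day idle period comes from the comment above.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

# Hypothetical record shape: (conversation_id, message_id, received_at)
Message = Tuple[str, str, datetime]


def messages_to_delete(
    messages: Iterable[Message],
    now: datetime,
    idle: timedelta = timedelta(days=90),
) -> List[str]:
    """Return the ids of all messages in conversations idle for `idle` or longer."""
    messages = list(messages)

    # Find the most recent message time per conversation.
    latest = {}
    for conversation_id, _, received_at in messages:
        if conversation_id not in latest or received_at > latest[conversation_id]:
            latest[conversation_id] = received_at

    # A conversation expires as a whole, based on its newest message.
    expired = {c for c, last_seen in latest.items() if now - last_seen >= idle}

    return sorted(
        message_id
        for conversation_id, message_id, _ in messages
        if conversation_id in expired
    )
```

The main difference from per-message retention is that an old message stays around for as long as its conversation is still active, and the whole conversation goes away together once it has been quiet for the idle period.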
For error retention the MAX value is 45 days, but for the audit period it's 1 year, and the MIN value is 10 days compared to 1 hour for audits.
To me it makes sense that: