This adds a feature flag, `reliable_report`, that optionally enables Push
message reliability reporting. The report is done in two parts.
The first part uses a Redis-like storage system to note message states.
This will require a regularly run "cleanup" script to sweep for expired
messages and adjust the current counts, as well as log those states to some
sequential, logging-friendly storage (e.g. common logging or streamed to
a file). The cleanup script should be a singleton to prevent possible
race conditions.
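As a rough illustration of the first part, here is a minimal sketch that uses an in-memory dictionary as a stand-in for the Redis-like store. The class name `StateStore`, its methods, and the state names are all hypothetical and are not taken from this patch; the real storage layout may differ.

```python
from collections import Counter


class StateStore:
    """In-memory stand-in for a Redis-like store (hypothetical API)."""

    def __init__(self):
        self.counts = Counter()  # state -> number of messages currently in it
        self.messages = {}       # message id -> (state, expiry timestamp)

    def record(self, msg_id, state, expiry):
        """Move a message to `state`, keeping the per-state counts in step."""
        previous = self.messages.get(msg_id)
        if previous is not None:
            self.counts[previous[0]] -= 1
        self.messages[msg_id] = (state, expiry)
        self.counts[state] += 1

    def cleanup(self, now, log):
        """Sweep expired messages, adjust counts, and append a snapshot to `log`."""
        expired = [m for m, (_, exp) in self.messages.items() if exp <= now]
        for msg_id in expired:
            state, _ = self.messages.pop(msg_id)
            self.counts[state] -= 1
        # The snapshot is what would go to sequential, logging-friendly storage.
        snapshot = {s: n for s, n in self.counts.items() if n > 0}
        log.append((now, snapshot))
        return len(expired)
```

Because `cleanup` reads the message table and then mutates the counts, two sweeps running at once could decrement the same message twice, which is why a single cleanup instance is the safer arrangement.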
The second part will write a record of the state transition
times for tracked messages to a storage system that is indexed by the
`tracking_id`. This will allow for more in-depth analysis by external
tooling.
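The second part could look something like the sketch below, which keeps one row per tracking ID and one timestamp per state. `TransitionLog`, its methods, and the state names are assumptions made for illustration, not the schema used by this patch.

```python
from collections import defaultdict


class TransitionLog:
    """Hypothetical record of state transition times, keyed by tracking id."""

    def __init__(self):
        self.rows = defaultdict(dict)  # tracking id -> {state: timestamp}

    def record(self, tracking_id, state, timestamp):
        """Store the first time a message entered `state`."""
        self.rows[tracking_id].setdefault(state, timestamp)

    def history(self, tracking_id):
        """Return (state, timestamp) pairs in chronological order."""
        row = self.rows.get(tracking_id, {})
        return sorted(row.items(), key=lambda item: item[1])
```

Keeping only the first timestamp per state is one possible design choice; a store that needs to capture repeated transitions would keep a list instead.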
The idea is that reporting will consist of two parts:
one part that shows the active states of messages (with a log of prior
states to show trends over time), and an optional "in-depth" record
that could be used to show things like length of time in storage,
overall success rates, survivability rates, etc.
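To make the "in-depth" analysis more concrete, the sketch below shows how external tooling might derive two of those figures from per-message transition records. The record shape (a dict of state name to timestamp in seconds) and the state names `stored`, `retrieved`, and `delivered` are assumptions for the example only.

```python
def time_in_storage(history):
    """Seconds between 'stored' and 'retrieved', or None if either is missing."""
    if "stored" in history and "retrieved" in history:
        return history["retrieved"] - history["stored"]
    return None


def success_rate(records):
    """Fraction of tracked messages that reached the 'delivered' state."""
    if not records:
        return 0.0
    delivered = sum(1 for history in records.values() if "delivered" in history)
    return delivered / len(records)


# Hypothetical sample data: tracking id -> {state: timestamp}.
records = {
    "a": {"received": 0, "stored": 5, "retrieved": 65, "delivered": 66},
    "b": {"received": 10, "stored": 12},
}
```

With this sample data, message `a` spent 60 seconds in storage and half of the tracked messages were delivered.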
This patch also:
* fixes a few typos
* changes several methods that should consume Notifications so that they actually consume them
* converts from `tracking_id` to `reliability_id`
* converts instances of specialized `Metrics` to generic Cadence (to make calls more consistent)
* adds the `RELIABLE_REPORT` flag to testing

Closes: SYNC-4327