We have an issue where expired OTKs are counted multiple times when the workers attempt to delete expired OTKs at midnight.
Each service runs its own worker, so we hit a race condition: every worker counts the expired keys and then tries to delete them.
The algorithm each worker follows is:
1. Count keys for metrics
2. Delete keys
3. If deletion was successful, save metrics
What happens is that the counts occur at the same time, and then one deletion runs before the other. Both deletions succeed: one deletes the counted keys and the other deletes nothing. Because both succeed, both counts are persisted, which causes OTKExpired to be overcounted.
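
The interleaving described above can be replayed deterministically with a small sketch. Everything here is hypothetical and only for illustration: `KeyStore`, `count_expired`, `delete_expired`, and `run_interleaved` are stand-in names, and an in-memory set stands in for the real key storage. It assumes that deleting zero keys is still reported as a successful deletion, as the description implies.

```python
class KeyStore:
    """Hypothetical in-memory stand-in for the OTK storage."""

    def __init__(self, expired_keys):
        self.expired = set(expired_keys)

    def count_expired(self):
        return len(self.expired)

    def delete_expired(self):
        """Delete all expired keys. Succeeds even when nothing is left to delete."""
        self.expired.clear()
        return True


def run_interleaved(store, workers=2):
    """Replay the problematic ordering: every worker counts before any worker deletes."""
    # Count keys for metrics, on all workers at the same time.
    counts = [store.count_expired() for _ in range(workers)]

    otk_expired = 0
    for counted in counts:
        # Delete keys. Only the first worker actually removes anything.
        success = store.delete_expired()
        # If deletion was successful, save metrics.
        if success:
            otk_expired += counted
    return otk_expired


store = KeyStore(f"otk-{i}" for i in range(5))
reported = run_interleaved(store)
print(f"expired keys: 5, OTKExpired reported: {reported}")
```

With five expired keys and two workers, the sketch persists the count of five twice, so the reported metric is double the number of keys that were actually deleted.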