Open AndreiPetrusMihai opened 1 month ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 55.27%. Comparing base (f485671) to head (def4988). Report is 3 commits behind head on master.
:exclamation: Current head def4988 differs from pull request most recent head 15a938c. Please upload reports for the commit 15a938c to get more accurate results.
Hey @pasha-codefresh, could you maybe take a look at this PR when you have some spare time? Not sure who else to ping. Thanks!
The issue:
At the moment, a message with a policy of `Update` can post a new message on Slack. This happens when the message gets sent for a `groupingKey` which doesn't have a recorded timestamp, which happens when there was no previous message posted for the respective grouping key.
This is not correct and it is quite misleading, since one would expect a policy of `Update` to never be able to actually post new messages. It also means that there is no practical difference between the `PostAndUpdate` and `Update` policies when it comes to sending a message for a `groupingKey` which didn't have any previous message recorded.
It's important to know that just because a timestamp wasn't recorded, it doesn't mean that a message with a certain `groupingKey` wasn't previously sent. The dictionary of `groupingKey: timestamp` is kept in-memory, so upon a complete engine restart, these records would get lost.
This could be considered a breaking change if someone relied on the `Update` policy to post new messages. It could also be considered a fix if the correct behavior of `Update` is to never post a new message.
The use-case/scenario with which this behavior was found:
We have multiple Argo apps and we want to receive notifications when an error occurs. This would mean notifications for failed syncs, maybe degraded apps, etc.
At the moment this is doable, but it is hard to keep track of which apps were fixed and which weren't, since the error messages are static. Even if the error for an app is now fixed, the error notification still stays in the Slack channel, unchanged.
As a way to improve this experience, we want to do the following:
- Send the error notifications with a policy of `PostAndUpdate`. Of course, this would have a `groupingKey` which is related to a certain revision.
- Send the successful sync notifications with a policy of `Update`. They would have the same `groupingKey` as the error message.

Having these 2 notifications would basically mean that errors would get posted to the channel, and once fixed, the error messages could be updated to reflect that the issue has been solved. This makes it much easier to follow and keep note of errors that still need fixing.
At the moment this doesn't work correctly. The successful sync messages do update existing error messages, but they also get posted when there is no corresponding error message for them.
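To make the intended behavior concrete, here is a minimal sketch of the decision this PR argues for. It is not the actual implementation: the `DeliveryPolicy` and `Action` types, the `decide` function, and the shape of the timestamp map are all hypothetical names chosen for illustration, and the example keys and timestamp are made up.

```go
package main

import "fmt"

// DeliveryPolicy uses the policy names from this PR; the type itself is hypothetical.
type DeliveryPolicy string

const (
	PostAndUpdate DeliveryPolicy = "PostAndUpdate"
	Update        DeliveryPolicy = "Update"
)

// Action is a hypothetical result type describing what happens to a message.
type Action string

const (
	ActionPost   Action = "post"
	ActionUpdate Action = "update"
	ActionSkip   Action = "skip"
)

// decide sketches the proposed behavior. timestamps stands in for the
// in-memory groupingKey -> timestamp dictionary described above.
func decide(policy DeliveryPolicy, timestamps map[string]string, groupingKey string) Action {
	_, recorded := timestamps[groupingKey]
	switch policy {
	case PostAndUpdate:
		if recorded {
			return ActionUpdate
		}
		return ActionPost
	case Update:
		if recorded {
			return ActionUpdate
		}
		// Proposed change: with no recorded timestamp, Update does not post.
		return ActionSkip
	default:
		// Other policies are out of scope for this sketch.
		return ActionSkip
	}
}

func main() {
	// An error message was previously posted for "rev-abc" only.
	timestamps := map[string]string{"rev-abc": "1700000000.000100"}

	fmt.Println(decide(Update, timestamps, "rev-abc"))        // update
	fmt.Println(decide(Update, timestamps, "rev-xyz"))        // skip
	fmt.Println(decide(PostAndUpdate, timestamps, "rev-xyz")) // post
}
```

In the scenario above, a successful sync for a revision that never produced an error message corresponds to the `rev-xyz` case: under the proposed behavior nothing is posted. Note that after an engine restart the map would be empty, so an `Update` message for a previously posted error would also be skipped.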