Seagate / cortx-ha

CORTX ha (High-Availability) is responsible for ensuring that the CORTX solution remains available when a hardware component or software service fails. It handles the failover/failback control flow for affected services and stabilizes them across the CORTX cluster.
https://github.com/Seagate/cortx
GNU Affero General Public License v3.0

CORTX-28560 & CORTX-29057 : Fix duplicate events published from k8s_monitor & Event parser issue #645

mariyappanp closed this 2 years ago

mariyappanp commented 2 years ago

Problem Statement

CORTX-28560: k8s_monitor republishes a previously sent alert event even when there is no status change on the monitored objects.
CORTX-29057: The Event parser causes an unpacking error in the client code when it returns None.
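
The CORTX-29057 symptom matches the standard Python unpacking failure: a caller that unpacks several values raises a TypeError when the callee returns a bare None. The sketch below reproduces that general failure mode. The names `parse_event` and `handle` and the event fields are hypothetical, chosen for illustration, and are not taken from the cortx-ha code.

```python
def parse_event(raw):
    """Hypothetical parser: returns (resource_type, status) or a bare None."""
    if raw.get("kind") != "Pod":
        return None  # the caller cannot unpack this into two names
    return raw["kind"], raw["status"]


def handle(raw):
    # The client code expects exactly two values from the parser.
    resource_type, status = parse_event(raw)
    return resource_type, status


error_message = None
try:
    handle({"kind": "Node", "status": "online"})
except TypeError as err:
    # Unpacking None raises TypeError before either name is bound.
    error_message = str(err)

print(error_message)
```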

Design

CORTX-28560: Publish only new alerts by checking whether the incoming alert has already been sent.
CORTX-29057: Return the number of values that the client code expects.
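
A minimal sketch of the two ideas in this design, assuming alerts can be keyed by a resource name and compared by status. `AlertFilter`, `parse_event`, and the event fields are hypothetical names for illustration only; the actual k8s_monitor and Event parser code in this PR may be structured differently.

```python
from typing import Dict, Optional, Tuple


class AlertFilter:
    """Hypothetical helper: remembers the last status published per resource."""

    def __init__(self) -> None:
        self._last_status: Dict[str, str] = {}

    def is_new(self, resource_id: str, status: str) -> bool:
        """Return True only when the status differs from the last one sent."""
        if self._last_status.get(resource_id) == status:
            return False  # same alert was already sent, so skip it
        self._last_status[resource_id] = status
        return True


def parse_event(raw: dict) -> Tuple[Optional[str], Optional[str]]:
    """Always return two values so that the caller's unpacking cannot fail."""
    if raw.get("kind") != "Pod":
        return None, None
    return raw.get("name"), raw.get("status")


alert_filter = AlertFilter()
published = []
events = [
    {"kind": "Pod", "name": "pod-1", "status": "offline"},
    {"kind": "Pod", "name": "pod-1", "status": "offline"},   # duplicate
    {"kind": "Node", "name": "node-1", "status": "online"},  # not parsed
    {"kind": "Pod", "name": "pod-1", "status": "online"},
]
for raw in events:
    name, status = parse_event(raw)
    if name is None or status is None:
        continue  # nothing usable was parsed
    if alert_filter.is_new(name, status):
        published.append((name, status))

print(published)
```

Keeping only the last status per resource, rather than a set of every alert ever sent, is a choice made for this sketch: it lets a later genuine transition back to an earlier status be published again.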

Coding

Testing

Review Checklist

Documentation

Checklist for Author

ArchanaLimaye commented 2 years ago

Please add testing details

mariyappanp commented 2 years ago

Please add testing details

Logs are verified. The log file is attached to the bug: https://jts.seagate.com/secure/attachment/505658/DEV_fix_log_23FEB2022.txt

mariyappanp commented 2 years ago

Please check Madhuri's comments; otherwise, the logic looks good. Q. What happens if both the offline and online events are repeated for a particular pod? Have you tested restarting the same pod multiple times within a few minutes, or can you test the same with a mock?

Yes, acknowledged.
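
The repeated offline/online scenario raised in the question could be exercised with a mock along the following lines. This is a sketch under stated assumptions: `DedupPublisher` and the producer's `publish` method are hypothetical stand-ins and not the cortx-ha API; only `unittest.mock.Mock` is a real library call.

```python
from unittest.mock import Mock


class DedupPublisher:
    """Hypothetical wrapper: forwards an alert only when the status changes."""

    def __init__(self, producer):
        self._producer = producer
        self._last = {}  # pod name -> last status published

    def publish(self, pod, status):
        if self._last.get(pod) == status:
            return  # repeat of the last published status, so drop it
        self._last[pod] = status
        self._producer.publish({"pod": pod, "status": status})


producer = Mock()
publisher = DedupPublisher(producer)

# Simulate one pod restarted twice, with every event delivered twice.
for status in ["offline", "offline", "online", "online",
               "offline", "offline", "online", "online"]:
    publisher.publish("pod-1", status)

# Collect the statuses that actually reached the mocked producer.
sent = [call[0][0]["status"] for call in producer.publish.call_args_list]
print(sent)
```

In this sketch each real transition reaches the producer once and the repeated deliveries are dropped, which is the behavior the question asks to be verified.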