Seagate / cortx-ha

CORTX ha (High-Availability) ensures that the CORTX solution remains available when a hardware component or software service fails. It handles the failover/failback control flow for affected services and stabilizes them across the CORTX cluster.
https://github.com/Seagate/cortx
GNU Affero General Public License v3.0

EOS-26085: Redirect log to console #640

Closed ajay-paratmandali closed 2 years ago

ajay-paratmandali commented 2 years ago

Signed-off-by: Ajay Paratmandali <ajay.paratmandali@seagate.com>

Problem Statement

Design

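    import sys

    # driver_process is assumed to be a subprocess.Popen started with stdout=subprocess.PIPE
    while True:
        output = driver_process.stdout.readline()
        if output:
            # Forward each line of driver output to the console
            print(output.strip().decode("utf-8"))
        elif driver_process.poll() is not None:
            # No more output and the driver has exited
            break
    # Propagate the driver's exit code
    sys.exit(driver_process.poll())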
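
For reviewers who want to try the redirect loop in isolation, here is a minimal, self-contained sketch of the same idea. The function name `relay_output` and the child command are hypothetical stand-ins, not code from this PR. It assumes the driver is launched with `subprocess.Popen` and that stderr should be merged into stdout.

```python
import subprocess
import sys


def relay_output(cmd):
    """Run cmd, echo each line of its output to the console, return its exit code."""
    # Assumption: stderr is merged into stdout so both reach the console
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # readline() returns b"" only at EOF, so this drains all output before stopping
    for raw_line in iter(proc.stdout.readline, b""):
        print(raw_line.decode("utf-8", errors="replace").rstrip(), flush=True)
    proc.stdout.close()
    # wait() reaps the child and yields its exit status
    return proc.wait()


# Hypothetical child standing in for the real driver script
exit_code = relay_output(
    [sys.executable, "-c", "import sys; print('driver started'); sys.exit(3)"]
)
```

Reading until EOF rather than checking `poll()` after each line avoids dropping output that is still buffered when the child exits. A wrapper can then pass `exit_code` to `sys.exit` so the container reports the driver's status.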

Coding

Testing

Review Checklist

Documentation

Checklist for Author

mukhtar-inamdar commented 2 years ago

Please update the PR with the test results, both for the case tested so far and for the SIGTERM test.

ajay-paratmandali commented 2 years ago

Test cases:

Test 1: Send a restart signal to the process (send SIGTERM to the pod)

NAME                        READY   STATUS    RESTARTS   AGE
cortx-ha-6d4fc596d8-5mhc4   3/3     Running   1          4m4s

Result: the container in the pod is restarted.

Test 2: Redirect log. Check whether the logs are redirected to the console.

[root@ssc-vm-g4-rhev4-0574 ~]# kubectl logs cortx-ha-6d4fc596d8-rtgzk cortx-ha-fault-tolerance
The driver process with pid 8, and args ['/usr/bin/python3', '/usr/lib/python3.6/site-packages/ha/fault_tolrance_driver.py', '--start'] started successfully.
2022-02-11 08:16:50 fault_tolerance [8]: INFO [init] Starervice fault_tolerance
2022-02-11 08:16:50 fault_tolerance [8]: INFO [init] MessageBus initialized as kafka
2022-02-11 08:16:50 fault_tolerance [8]: INFO [init] MessageBus initialized as kafka
2022-02-11 08:16:50 fault_tolerance [8]: INFO [<module>] Starting the Fault Tolerance Monitor with PID 8...
2022-02-11 08:16:50 fault_tolerance [8]: INFO [start] Starting the daemon for cluster_event-consumer-thread
2022-02-11 08:16:50 fault_tolerance [8]: INFO [start] The daemon cluster_event-consumer-thread started succ
2022-02-11 08:16:50 fault_tolerance [8]: INFO [start] Starting the daemon for cluster_stop-consumer-thread.
2022-02-11 08:16:50 fault_tolerance [8]: INFO [start] The daemon cluster_stop-consumer-thread started succe
2022-02-11 08:16:50 fault_tolerance [8]: INFO [join] waiting for cluster_event-consumer-thread to exit...
2022-02-11 08:16:50 fault_tolerance [8]: INFO [process_message] Received the message from message bus: b"{''node', '_resource_name': 'f59a55016af943409063de03c691ecc3', '_event_type': 'online', '_k8s_container': Noid': 'cortx-data-ssc-vm-g4-rhev4-0574-6b755bb4bf-pbnxk', '_node': 'ssc-vm-g4-rhev4-0574.colo.seagate.com', e, '_timestamp': '1644470592'}"
2022-02-11 08:16:50 fault_tolerance [8]: INFO [init_evaluators] Initialize all the health evaluator element
2022-02-11 08:16:50 fault_tolerance [8]: INFO [init_evaluators] HealthEvaluator ha.core.system_health.healtter_health_evaluator.ClusterHealthEvaluator is initalized...

ArchanaLimaye commented 2 years ago

@ajay-paratmandali Please update the Design section with the details

ArchanaLimaye commented 2 years ago

"Unit and System Tests are added" is checked, but I don't see any new test files. If existing files cover this change, please mention them.

Also list all tests that were done and their observations.