Seagate / cortx-hare

CORTX Hare configures Motr object store, starts/stops Motr services, and notifies Motr of service and device faults.
https://github.com/Seagate/cortx
Apache License 2.0
13 stars, 80 forks

Cortx-29485 SNS Repair: hctl repair start command is not triggering start request #2068

d-nayak closed 2 years ago

d-nayak commented 2 years ago

Introduces Bash variables for the HAX node name and HAX port in the sns-repair and sns-rebalance scripts.
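
The diff itself is not shown in this thread, so the following is only a minimal sketch of the pattern the description names. The variable names `HAX_NODE` and `HAX_PORT`, the defaults, the `http` scheme, and the `hax_url` helper are all assumptions for illustration, not code from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical names and defaults; the PR's actual values are not shown here.
# Both variables can be overridden from the environment before the script runs.
HAX_NODE="${HAX_NODE:-${HOSTNAME:-localhost}}"  # node where hax listens (assumed default)
HAX_PORT="${HAX_PORT:-8008}"                    # placeholder port, not confirmed by the PR

# Build a URL for a request path supplied by the caller.
# The http scheme is an assumption made for this sketch.
hax_url() {
    local path="${1#/}"   # drop one leading slash so the URL has exactly one
    printf 'http://%s:%s/%s\n' "$HAX_NODE" "$HAX_PORT" "$path"
}
```

With the node and port held in variables, a script can be pointed at another node without editing it, for example by exporting `HAX_NODE` before the call and then using `hax_url some/request/path`.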

cla-bot[bot] commented 2 years ago

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: root. This is most likely caused by a git client misconfiguration; please make sure to:

  1. Check whether your git client is configured with an email to sign commits: git config --list | grep email
  2. If not, set it up using: git config --global user.email email@example.com
  3. Make sure that the git commit email is configured in your GitHub account settings; see https://github.com/settings/emails

cortx-admin commented 2 years ago

Can one of the admins verify this patch?

vaibhavparatwar commented 2 years ago

@mssawant This looks like the same issue that @DmitryKuzmenko faced with CLA signing?

DmitryKuzmenko commented 2 years ago

@vaibhavparatwar I shared my experience with Deepak via Teams.

mssawant commented 2 years ago

@d-nayak, kindly fix up the multiple commits into one and fix the DCO check. Also, please avoid merging main into main; rebase instead.
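
One possible way to act on this request is sketched below. The helper name `squash_since` and its parameters are illustrative and do not come from the PR. It relies on the fact that a DCO check looks for a `Signed-off-by:` trailer, which `git commit --signoff` adds.

```shell
#!/usr/bin/env bash
# Squash every commit made since the fork point with BASE into a single
# signed-off commit. BASE and MSG are illustrative parameters.
squash_since() {
    local base="$1" msg="$2"
    local fork_point
    fork_point="$(git merge-base HEAD "$base")" || return 1
    # Keep the working tree and index, move the branch back to the fork point,
    # then record everything as one commit carrying a Signed-off-by trailer.
    git reset --soft "$fork_point" &&
        git commit --quiet --signoff -m "$msg"
}

# Typical follow-up, run by hand (remote and branch names are assumptions):
#   git fetch origin
#   squash_since origin/main "One-line summary of the change"
#   git rebase origin/main           # replay onto main instead of merging it in
#   git push --force-with-lease      # update the PR branch
```

An interactive `git rebase -i origin/main` with `fixup` entries reaches the same result; the soft reset is shown only because it needs no editor.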

mssawant commented 2 years ago

@d-nayak, kindly post the testing results.

vaibhavparatwar commented 2 years ago

@nkommuri, please add the results from your testing of this PR and also review the PR. cc @d-nayak

nkommuri commented 2 years ago

Tested this customer branch in a 3-node k8s cluster. Motr did receive the repair trigger....

[root@cortx-data-headless-svc-ssc-vm-g2-rhev4-3105 c0ae5405c1d7436eb322a872130631c1]# grep -i repair hare-hax.log | grep -v ModifyIndex
2022-05-05 02:52:28,197 [DEBUG] {ThreadPoolExecutor-0_22} Message #3 received: {"message_type": "M0_HA_MSG_NVEC", "payload": {"node": "cortx-data-headless-svc-ssc-vm-g2-rhev4-3106", "source_type": "drive", "device": "/dev/sdc", "state": "repair"}} (type: str)
2022-05-05 02:52:28,220 [DEBUG] {ThreadPoolExecutor-0_22} HA broadcast, node: cortx-data-headless-svc-ssc-vm-g2-rhev4-3106 device: /dev/sdc state: repair
2022-05-05 02:52:28,221 [DEBUG] {qconsumer-10} Got BroadcastHAStates(group=968, states=[HAState(fid=0x6b00000000000001:0xb2, status=REPAIR)], reply_to=<queue.Queue object at 0x7f2870089470>) message from planner
2022-05-05 02:52:28,222 [INFO] {qconsumer-10} HA states: [HAState(fid=0x6b00000000000001:0xb2, status=REPAIR)]
2022-05-05 02:52:28,223 [DEBUG] {qconsumer-10} Broadcasting HA states [HAState(fid=0x6b00000000000001:0xb2, status=REPAIR)] over ha_link
2022-05-05 02:52:28,321 [DEBUG] {qconsumer-10} Setting sdev=0x6400000000000001:0xb1 in KV with state=repairing
2022-05-05 03:07:56,768 [DEBUG] {MainThread} process_sns_operation: repair-start
2022-05-05 03:07:56,770 [DEBUG] {qconsumer-30} Got SnsRepairStart(group=1230, fid=0x6f00000000000001:0xc3) message from planner
2022-05-05 03:07:56,773 [INFO] {qconsumer-30} Requesting SNS repair start
2022-05-05 03:07:56,774 [DEBUG] {qconsumer-30} Initiating repair for pool 0x6f00000000000001:0xc3
2022-05-05 03:07:58,772 [DEBUG] {qconsumer-30} Repairing started for pool 0x6f00000000000001:0xc3
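
For repeated test runs, the same check can be wrapped in small helpers. This is a sketch only: the function names are made up for illustration, and the two message strings are copied from the log excerpt above, so they may differ in other Hare versions.

```shell
#!/usr/bin/env bash
# Print "timestamp LEVEL message" for repair-related lines in a hax log,
# using the same grep filter as the command shown in the test results.
sns_repair_events() {
    local log_file="${1:-hare-hax.log}"
    grep -i repair "$log_file" | grep -v ModifyIndex |
        sed -E 's/^([0-9-]+ [0-9:,]+) \[([A-Z]+)\] \{[^}]*\} /\1 \2 /'
}

# Succeed only if the log shows both the start request and the confirmation.
repair_start_confirmed() {
    local log_file="${1:-hare-hax.log}"
    grep -q 'Requesting SNS repair start' "$log_file" &&
        grep -q 'Repairing started for pool' "$log_file"
}
```

Running `repair_start_confirmed hare-hax.log && echo "repair start reached hax"` after `hctl repair start` gives a quick pass/fail signal, while `sns_repair_events` keeps the full sequence for the PR comment.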