Seagate / cortx-hare

CORTX Hare configures Motr object store, starts/stops Motr services, and notifies Motr of service and device faults.
https://github.com/Seagate/cortx
Apache License 2.0

CORTX-29311: Hare-mini-prov: include missing hare kvs to be deleted during cleanup #2024

Closed: supriyachavan4398 closed this 2 years ago

supriyachavan4398 commented 2 years ago

Some of the newly added Hare-Consul key values, e.g. byte-count, are not included in the list of keys to be deleted during the Hare mini-provisioner cleanup stage.

Solution: Remove the remaining Hare key values from the Consul KV store as part of the Hare mini-provisioner cleanup stage.

Signed-off-by: Supriya Yadav supriya.s.chavan@seagate.com
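The idea behind the fix can be sketched as follows. The key names are taken from the cleanup log later in this thread; `KeyDelete` and the in-memory `dict` standing in for the Consul KV store are simplified stand-ins for Hare's actual internals (`hax/util.py`), not the real implementation:

```python
from typing import NamedTuple

class KeyDelete(NamedTuple):
    """A key (or key prefix, when recurse=True) to remove during cleanup."""
    name: str
    recurse: bool

# Hare-owned keys to delete on cleanup; the last four entries
# (bytecount/, failvec, config_path, m0_client_types) are the ones
# this PR adds to the list.
kv_cleanup_list = [
    KeyDelete(name='epoch', recurse=False),
    KeyDelete(name='leader', recurse=False),
    KeyDelete(name='m0conf/', recurse=True),
    KeyDelete(name='processes/', recurse=True),
    KeyDelete(name='bytecount/', recurse=True),       # newly added
    KeyDelete(name='failvec', recurse=False),         # newly added
    KeyDelete(name='config_path', recurse=False),     # newly added
    KeyDelete(name='m0_client_types', recurse=True),  # newly added
]

def cleanup_kv(kv: dict) -> None:
    """Delete Hare-owned entries from a KV mapping.

    recurse=True removes every key under the given prefix, mirroring
    `consul kv delete -recurse`.
    """
    for kd in kv_cleanup_list:
        if kd.recurse:
            # Snapshot matching keys first to avoid mutating while iterating.
            for key in [k for k in kv if k.startswith(kd.name)]:
                del kv[key]
        else:
            kv.pop(kd.name, None)
```

With this list in place, the per-state byte-count entries (`bytecount/critical`, `bytecount/healthy`, etc.) no longer survive the cleanup stage.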

supriyachavan4398 commented 2 years ago

Created a custom build for the new changes: https://eos-jenkins.colo.seagate.com/job/GitHub-custom-ci-builds/job/generic/job/custom-ci/5269
Did a 3N deployment with the above image: https://eos-jenkins.colo.seagate.com/job/Cortx-Automation/job/RGW/job/setup-cortx-rgw-cluster/732

[root@ssc-vm-g4-rhev4-0710 ~]# kubectl get pods
NAME                                                 READY   STATUS    RESTARTS   AGE
consul-client-hkwxm                                  1/1     Running   0          79m
consul-client-mvgqn                                  1/1     Running   0          80m
consul-client-qs48t                                  1/1     Running   0          80m
consul-server-0                                      1/1     Running   0          78m
consul-server-1                                      1/1     Running   0          79m
consul-server-2                                      1/1     Running   0          80m
cortx-control-8bd96dcd6-2mwfb                        1/1     Running   0          68m
cortx-data-ssc-vm-g4-rhev4-0710-7bc98988dc-cnz8r     4/4     Running   0          67m
cortx-data-ssc-vm-g4-rhev4-0711-6c9fdcdd9b-srzr9     4/4     Running   0          67m
cortx-data-ssc-vm-g4-rhev4-0712-5d9845dcdf-dvhpx     4/4     Running   0          67m
cortx-ha-66579d7d68-hsn9q                            3/3     Running   0          64m
cortx-server-ssc-vm-g4-rhev4-0710-88b668484-qkv9h    2/2     Running   0          65m
cortx-server-ssc-vm-g4-rhev4-0711-69b99b4666-98mvm   2/2     Running   0          65m
cortx-server-ssc-vm-g4-rhev4-0712-6bb6f987d6-jj7lz   2/2     Running   0          65m
kafka-0                                              1/1     Running   0          72m
kafka-1                                              1/1     Running   0          72m
kafka-2                                              1/1     Running   0          72m
openldap-0                                           1/1     Running   0          80m
openldap-1                                           1/1     Running   0          79m
openldap-2                                           1/1     Running   0          77m
zookeeper-0                                          1/1     Running   0          76m
zookeeper-1                                          1/1     Running   0          76m
zookeeper-2                                          1/1     Running   0          76m
[root@ssc-vm-g4-rhev4-0710 ~]# kubectl exec -it cortx-data-ssc-vm-g4-rhev4-0710-7bc98988dc-cnz8r -c cortx-hax -- hctl status
Bytecount:
    critical : 0
    damaged : 0
    degraded : 0
    healthy : 0
Data pool:
    # fid name
    0x6f00000000000001:0x93 'storage-set-1__sns'
Profile:
    # fid name: pool(s)
    0x7000000000000001:0xd2 'Profile_the_pool': 'storage-set-1__sns' 'storage-set-1__dix' None
Services:
    cortx-data-headless-svc-ssc-vm-g4-rhev4-0710
    [started]  hax        0x7200000000000001:0x2b  inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@22001
    [started]  ioservice  0x7200000000000001:0x2e  inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@21001
    [started]  ioservice  0x7200000000000001:0x3b  inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@21002
    [started]  confd      0x7200000000000001:0x48  inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@21003
    cortx-data-headless-svc-ssc-vm-g4-rhev4-0712
    [started]  hax        0x7200000000000001:0x7   inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@22001
    [started]  ioservice  0x7200000000000001:0xa   inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@21001
    [started]  ioservice  0x7200000000000001:0x17  inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@21002
    [started]  confd      0x7200000000000001:0x24  inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@21003
    cortx-data-headless-svc-ssc-vm-g4-rhev4-0711  (RC)
    [started]  hax        0x7200000000000001:0x4f  inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@22001
    [started]  ioservice  0x7200000000000001:0x52  inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@21001
    [started]  ioservice  0x7200000000000001:0x5f  inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@21002
    [started]  confd      0x7200000000000001:0x6c  inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@21003
    cortx-server-headless-svc-ssc-vm-g4-rhev4-0711
    [started]  hax        0x7200000000000001:0x71  inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0711@22001
    [started]  rgw        0x7200000000000001:0x74  inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0711@21501
    cortx-server-headless-svc-ssc-vm-g4-rhev4-0710
    [started]  hax        0x7200000000000001:0x79  inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0710@22001
    [started]  rgw        0x7200000000000001:0x7c  inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0710@21501
    cortx-server-headless-svc-ssc-vm-g4-rhev4-0712
    [started]  hax        0x7200000000000001:0x81  inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0712@22001
    [started]  rgw        0x7200000000000001:0x84  inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0712@21501
[root@ssc-vm-g4-rhev4-0710 ~]#
mssawant commented 2 years ago

@supriyachavan4398, I see 6 commits, and most of them are unrelated. Can you please try rebasing your branch with the latest dg_bytecount?

supriyachavan4398 commented 2 years ago

> @supriyachavan4398, I see 6 commits, and most of them are unrelated. Can you please try rebasing your branch with the latest dg_bytecount?

Rebased it, thanks!

supriyachavan4398 commented 2 years ago

After rebasing, created a new custom build: https://eos-jenkins.colo.seagate.com/job/GitHub-custom-ci-builds/job/generic/job/custom-ci/5289/
Did a deployment with the below Docker images:

cortx-docker.colo.seagate.com/seagate/cortx-all:2.0.0-5289-custom-ci
cortx-docker.colo.seagate.com/seagate/cortx-rgw:2.0.0-5289-custom-ci
[root@ssc-vm-g4-rhev4-0710 ~]# kubectl get pods
NAME                                                 READY   STATUS    RESTARTS   AGE
consul-client-bqlnv                                  1/1     Running   0          11m
consul-client-nd5h8                                  1/1     Running   0          10m
consul-client-sp5n8                                  1/1     Running   0          10m
consul-server-0                                      1/1     Running   0          10m
consul-server-1                                      1/1     Running   0          10m
consul-server-2                                      1/1     Running   0          11m
cortx-control-bd7fd4546-vxd8m                        1/1     Running   0          5m13s
cortx-data-ssc-vm-g4-rhev4-0710-79ff547b99-n8n55     4/4     Running   0          4m23s
cortx-data-ssc-vm-g4-rhev4-0711-86c7ffcbf6-ws24g     4/4     Running   0          4m23s
cortx-data-ssc-vm-g4-rhev4-0712-676f66954f-c5zxq     4/4     Running   0          4m22s
cortx-ha-b9dd98cd4-5nfkg                             3/3     Running   0          2m12s
cortx-server-ssc-vm-g4-rhev4-0710-59b48c4df8-l7pgw   2/2     Running   0          3m6s
cortx-server-ssc-vm-g4-rhev4-0711-8644b66f4d-2hrwh   2/2     Running   0          3m6s
cortx-server-ssc-vm-g4-rhev4-0712-6998c9785b-c2v2x   2/2     Running   0          3m6s
kafka-0                                              1/1     Running   0          7m17s
kafka-1                                              1/1     Running   0          7m17s
kafka-2                                              1/1     Running   0          7m17s
openldap-0                                           1/1     Running   0          11m
openldap-1                                           1/1     Running   0          10m
openldap-2                                           1/1     Running   0          10m
zookeeper-0                                          1/1     Running   0          9m6s
zookeeper-1                                          1/1     Running   0          9m6s
zookeeper-2                                          1/1     Running   0          9m6s

[root@ssc-vm-g4-rhev4-0710 ~]# kubectl exec -it cortx-data-ssc-vm-g4-rhev4-0710-79ff547b99-n8n55 -c cortx-hax -- /bin/bash
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]# hctl status
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_COLLATE to default locale: No such file or directory
Bytecount:
    critical : 0
    damaged : 0
    degraded : 0
    healthy : 0
Data pool:
    # fid name
    0x6f00000000000001:0x93 'storage-set-1__sns'
Profile:
    # fid name: pool(s)
    0x7000000000000001:0xd2 'Profile_the_pool': 'storage-set-1__sns' 'storage-set-1__dix' None
Services:
    cortx-data-headless-svc-ssc-vm-g4-rhev4-0711
    [started]  hax                 0x7200000000000001:0x2b         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@22001
    [started]  ioservice           0x7200000000000001:0x2e         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@21001
    [started]  ioservice           0x7200000000000001:0x3b         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@21002
    [started]  confd               0x7200000000000001:0x48         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@21003
    cortx-data-headless-svc-ssc-vm-g4-rhev4-0710
    [started]  hax                 0x7200000000000001:0x7          inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@22001
    [started]  ioservice           0x7200000000000001:0xa          inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@21001
    [started]  ioservice           0x7200000000000001:0x17         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@21002
    [started]  confd               0x7200000000000001:0x24         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@21003
    cortx-data-headless-svc-ssc-vm-g4-rhev4-0712  (RC)
    [started]  hax                 0x7200000000000001:0x4f         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@22001
    [started]  ioservice           0x7200000000000001:0x52         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@21001
    [started]  ioservice           0x7200000000000001:0x5f         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@21002
    [started]  confd               0x7200000000000001:0x6c         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@21003
    cortx-server-headless-svc-ssc-vm-g4-rhev4-0712
    [started]  hax                 0x7200000000000001:0x71         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0712@22001
    [started]  rgw                 0x7200000000000001:0x74         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0712@21501
    cortx-server-headless-svc-ssc-vm-g4-rhev4-0710
    [started]  hax                 0x7200000000000001:0x79         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0710@22001
    [started]  rgw                 0x7200000000000001:0x7c         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0710@21501
    cortx-server-headless-svc-ssc-vm-g4-rhev4-0711
    [started]  hax                 0x7200000000000001:0x81         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0711@22001
    [started]  rgw                 0x7200000000000001:0x84         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0711@21501
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]#
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]# consul kv get -recurse bytecount
bytecount/critical:0
bytecount/damaged:0
bytecount/degraded:0
bytecount/healthy:0

Redeployed the above image 3 to 4 times; I am still not seeing any stale entries in Consul.

[root@ssc-vm-g4-rhev4-0710 ~]# kubectl get pods
NAME                                                 READY   STATUS    RESTARTS   AGE
consul-client-kvpx2                                  1/1     Running   0          23m
consul-client-slz62                                  1/1     Running   0          22m
consul-client-vb457                                  1/1     Running   0          23m
consul-server-0                                      1/1     Running   0          22m
consul-server-1                                      1/1     Running   0          22m
consul-server-2                                      1/1     Running   0          23m
cortx-control-5bf4bb7ff6-vxlq7                       1/1     Running   0          20m
cortx-data-ssc-vm-g4-rhev4-0710-844dbc55c5-4xngb     4/4     Running   0          19m
cortx-data-ssc-vm-g4-rhev4-0711-5cf96d85dd-x5qxt     4/4     Running   0          19m
cortx-data-ssc-vm-g4-rhev4-0712-5d9d976849-f6trt     4/4     Running   0          19m
cortx-ha-79f4c987b8-td5ls                            3/3     Running   0          17m
cortx-server-ssc-vm-g4-rhev4-0710-d4cd9c666-fkffk    2/2     Running   0          18m
cortx-server-ssc-vm-g4-rhev4-0711-65db965656-sj2wz   2/2     Running   0          18m
cortx-server-ssc-vm-g4-rhev4-0712-5f6cf65f5b-ql8bn   2/2     Running   0          18m
kafka-0                                              1/1     Running   0          22m
kafka-1                                              1/1     Running   0          22m
kafka-2                                              1/1     Running   0          22m
openldap-0                                           1/1     Running   0          23m
openldap-1                                           1/1     Running   0          23m
openldap-2                                           1/1     Running   0          23m
zookeeper-0                                          1/1     Running   0          23m
zookeeper-1                                          1/1     Running   0          23m
zookeeper-2                                          1/1     Running   0          23m
[root@ssc-vm-g4-rhev4-0710 ~]# kubectl exec -it cortx-data-ssc-vm-g4-rhev4-0710-844dbc55c5-4xngb -c cortx-hax -- hctl status
Bytecount:
    critical : 0
    damaged : 0
    degraded : 0
    healthy : 0
Data pool:
    # fid name
    0x6f00000000000001:0x93 'storage-set-1__sns'
Profile:
    # fid name: pool(s)
    0x7000000000000001:0xd2 'Profile_the_pool': 'storage-set-1__sns' 'storage-set-1__dix' None
Services:
    cortx-data-headless-svc-ssc-vm-g4-rhev4-0712
    [started]  hax                 0x7200000000000001:0x2b         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@22001
    [started]  ioservice           0x7200000000000001:0x2e         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@21001
    [started]  ioservice           0x7200000000000001:0x3b         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@21002
    [started]  confd               0x7200000000000001:0x48         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@21003
    cortx-data-headless-svc-ssc-vm-g4-rhev4-0711
    [started]  hax                 0x7200000000000001:0x7          inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@22001
    [started]  ioservice           0x7200000000000001:0xa          inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@21001
    [started]  ioservice           0x7200000000000001:0x17         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@21002
    [started]  confd               0x7200000000000001:0x24         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@21003
    cortx-data-headless-svc-ssc-vm-g4-rhev4-0710  (RC)
    [started]  hax                 0x7200000000000001:0x4f         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@22001
    [started]  ioservice           0x7200000000000001:0x52         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@21001
    [started]  ioservice           0x7200000000000001:0x5f         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@21002
    [started]  confd               0x7200000000000001:0x6c         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@21003
    cortx-server-headless-svc-ssc-vm-g4-rhev4-0710
    [started]  hax                 0x7200000000000001:0x71         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0710@22001
    [started]  rgw                 0x7200000000000001:0x74         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0710@21501
    cortx-server-headless-svc-ssc-vm-g4-rhev4-0712
    [started]  hax                 0x7200000000000001:0x79         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0712@22001
    [started]  rgw                 0x7200000000000001:0x7c         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0712@21501
    cortx-server-headless-svc-ssc-vm-g4-rhev4-0711
    [started]  hax                 0x7200000000000001:0x81         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0711@22001
    [started]  rgw                 0x7200000000000001:0x84         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0711@21501
[root@ssc-vm-g4-rhev4-0710 ~]# kubectl exec -it cortx-data-ssc-vm-g4-rhev4-0710-844dbc55c5-4xngb -c cortx-hax -- /bin/bash
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]# consul kv get -recurse bytecount
bytecount/critical:0
bytecount/damaged:0
bytecount/degraded:0
bytecount/healthy:0
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]#
[root@ssc-vm-g4-rhev4-0710 ~]# kubectl get pods
NAME                                                 READY   STATUS    RESTARTS   AGE
consul-client-6mmqw                                  1/1     Running   0          17m
consul-client-btfk9                                  1/1     Running   0          17m
consul-client-nzfc2                                  1/1     Running   0          16m
consul-server-0                                      1/1     Running   0          15m
consul-server-1                                      1/1     Running   0          16m
consul-server-2                                      1/1     Running   0          17m
cortx-control-5744b4765b-g56sf                       1/1     Running   0          14m
cortx-data-ssc-vm-g4-rhev4-0710-69484c8b59-t8v6l     4/4     Running   0          13m
cortx-data-ssc-vm-g4-rhev4-0711-69c5447db8-vtbh8     4/4     Running   0          13m
cortx-data-ssc-vm-g4-rhev4-0712-6bcf478b77-kthjk     4/4     Running   0          13m
cortx-ha-596f6f95f6-bs7wt                            3/3     Running   0          11m
cortx-server-ssc-vm-g4-rhev4-0710-d965bc484-qp98r    2/2     Running   0          12m
cortx-server-ssc-vm-g4-rhev4-0711-c844f5db-srnzh     2/2     Running   0          12m
cortx-server-ssc-vm-g4-rhev4-0712-748777c976-rxfzz   2/2     Running   0          12m
kafka-0                                              1/1     Running   0          16m
kafka-1                                              1/1     Running   0          16m
kafka-2                                              1/1     Running   0          16m
openldap-0                                           1/1     Running   0          17m
openldap-1                                           1/1     Running   0          17m
openldap-2                                           1/1     Running   0          17m
zookeeper-0                                          1/1     Running   0          16m
zookeeper-1                                          1/1     Running   0          16m
zookeeper-2                                          1/1     Running   0          16m
[root@ssc-vm-g4-rhev4-0710 ~]# kubectl exec -it cortx-data-ssc-vm-g4-rhev4-0710-69484c8b59-t8v6l -c cortx-hax -- hctl status
Bytecount:
    critical : 0
    damaged : 0
    degraded : 0
    healthy : 0
Data pool:
    # fid name
    0x6f00000000000001:0x93 'storage-set-1__sns'
Profile:
    # fid name: pool(s)
    0x7000000000000001:0xd2 'Profile_the_pool': 'storage-set-1__sns' 'storage-set-1__dix' None
Services:
    cortx-data-headless-svc-ssc-vm-g4-rhev4-0711
    [started]  hax                 0x7200000000000001:0x2b         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@22001
    [started]  ioservice           0x7200000000000001:0x2e         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@21001
    [started]  ioservice           0x7200000000000001:0x3b         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@21002
    [started]  confd               0x7200000000000001:0x48         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@21003
    cortx-data-headless-svc-ssc-vm-g4-rhev4-0712
    [started]  hax                 0x7200000000000001:0x7          inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@22001
    [started]  ioservice           0x7200000000000001:0xa          inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@21001
    [started]  ioservice           0x7200000000000001:0x17         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@21002
    [started]  confd               0x7200000000000001:0x24         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@21003
    cortx-data-headless-svc-ssc-vm-g4-rhev4-0710  (RC)
    [started]  hax                 0x7200000000000001:0x4f         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@22001
    [started]  ioservice           0x7200000000000001:0x52         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@21001
    [started]  ioservice           0x7200000000000001:0x5f         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@21002
    [started]  confd               0x7200000000000001:0x6c         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@21003
    cortx-server-headless-svc-ssc-vm-g4-rhev4-0710
    [started]  hax                 0x7200000000000001:0x71         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0710@22001
    [started]  rgw                 0x7200000000000001:0x74         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0710@21501
    cortx-server-headless-svc-ssc-vm-g4-rhev4-0711
    [started]  hax                 0x7200000000000001:0x79         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0711@22001
    [started]  rgw                 0x7200000000000001:0x7c         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0711@21501
    cortx-server-headless-svc-ssc-vm-g4-rhev4-0712
    [started]  hax                 0x7200000000000001:0x81         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0712@22001
    [started]  rgw                 0x7200000000000001:0x84         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0712@21501
[root@ssc-vm-g4-rhev4-0710 ~]# kubectl exec -it cortx-data-ssc-vm-g4-rhev4-0710-69484c8b59-t8v6l -c cortx-hax -- /bin/bash
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]# consul kv get -recurse bytecount
bytecount/critical:0
bytecount/damaged:0
bytecount/degraded:0
bytecount/healthy:0
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]# consul kv get -recurse bytecount
bytecount/critical:0
bytecount/damaged:0
bytecount/degraded:0
bytecount/healthy:0

cc. @mssawant, @Shreya-18, @SwapnilGaonkar7, @vaibhavparatwar

vaibhavparatwar commented 2 years ago

@SwapnilGaonkar7 @mssawant please review

SwapnilGaonkar7 commented 2 years ago

@supriyachavan4398 Is the output below from after the cleanup stage was run?

[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]# consul kv get -recurse bytecount
bytecount/critical:0
bytecount/damaged:0
bytecount/degraded:0
bytecount/healthy:0
supriyachavan4398 commented 2 years ago

> @supriyachavan4398 Is the output below from after the cleanup stage was run?

[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]# consul kv get -recurse bytecount
bytecount/critical:0
bytecount/damaged:0
bytecount/degraded:0
bytecount/healthy:0

No, that output is from after deployment only. I did multiple deployments with the same Docker image and then checked for stale entries in the Consul KV.

supriyachavan4398 commented 2 years ago

> @supriyachavan4398 Is the output below from after the cleanup stage was run?

[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]# consul kv get -recurse bytecount
bytecount/critical:0
bytecount/damaged:0
bytecount/degraded:0
bytecount/healthy:0

After the cleanup stage in the LC environment, all keys added by Hare were removed:

[root@ssc-vm-g4-rhev4-0710 ~]# kubectl exec -it cortx-data-ssc-vm-g4-rhev4-0710-5ff884bcf9-v49nf -c cortx-hax -- /bin/bash
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]# /opt/seagate/cortx/hare/bin/hare_setup cleanup --config yaml:///etc/cortx/cluster.conf --services all
2022-03-16 11:22:49,767 [INFO] Entering cleanup_disks_info at line 143 in file /opt/seagate/cortx/hare/lib64/python3.6/site-packages/hax/util.py
2022-03-16 11:22:49,768 [INFO] Entering Utils.get_local_hostname at line 88 in file /opt/seagate/cortx/hare/lib/python3.6/site-packages/hare_mp/utils.py
2022-03-16 11:22:49,768 [INFO] Entering Utils.get_hostname at line 76 in file /opt/seagate/cortx/hare/lib/python3.6/site-packages/hare_mp/utils.py
2022-03-16 11:22:49,768 [INFO] Leaving Utils.get_hostname
2022-03-16 11:22:49,768 [INFO] Leaving Utils.get_local_hostname
2022-03-16 11:22:49,812 [INFO] Leaving cleanup_disks_info
2022-03-16 11:22:49,814 [INFO] Entering cleanup_node_facts at line 143 in file /opt/seagate/cortx/hare/lib64/python3.6/site-packages/hax/util.py
2022-03-16 11:22:49,814 [INFO] Entering Utils.get_local_hostname at line 88 in file /opt/seagate/cortx/hare/lib/python3.6/site-packages/hare_mp/utils.py
2022-03-16 11:22:49,814 [INFO] Entering Utils.get_hostname at line 76 in file /opt/seagate/cortx/hare/lib/python3.6/site-packages/hare_mp/utils.py
2022-03-16 11:22:49,814 [INFO] Leaving Utils.get_hostname
2022-03-16 11:22:49,815 [INFO] Leaving Utils.get_local_hostname
2022-03-16 11:22:49,832 [INFO] Leaving cleanup_node_facts
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_COLLATE to default locale: No such file or directory
2022-03-16 11:22:51,481 [INFO] Cluster is running, shutting down
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_COLLATE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_COLLATE to default locale: No such file or directory
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
Cluster is not running
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_COLLATE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_COLLATE to default locale: No such file or directory
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
Cluster is not running
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_COLLATE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_COLLATE to default locale: No such file or directory
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
Cluster is not running
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_COLLATE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_COLLATE to default locale: No such file or directory
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
Cluster is not running
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_COLLATE to default locale: No such file or directory
^CTraceback (most recent call last):
  File "/opt/seagate/cortx/hare/bin/../libexec/hare-status", line 423, in <module>
    sys.exit(main())
  File "/opt/seagate/cortx/hare/bin/../libexec/hare-status", line 417, in main
    show_text_status(cns, cns_util, opts.devices)
  File "/opt/seagate/cortx/hare/lib64/python3.6/site-packages/hax/util.py", line 149, in wrapper
    return f(*args, **kwds)
  File "/opt/seagate/cortx/hare/bin/../libexec/hare-status", line 388, in show_text_status
    for p in processes(cns, consul_util, h):
  File "/opt/seagate/cortx/hare/bin/../libexec/hare-status", line 187, in processes
    m0_client_types = kv_item(cns, 'm0_client_types')
  File "/opt/seagate/cortx/hare/bin/../libexec/hare-status", line 87, in kv_item
    val = cns.kv.get(key, recurse=recurse)[1]
  File "/opt/seagate/cortx/hare/lib/python3.6/site-packages/consul/base.py", line 554, in get
    params=params)
  File "/opt/seagate/cortx/hare/lib/python3.6/site-packages/consul/std.py", line 22, in get
    self.session.get(uri, verify=self.verify, cert=self.cert)))
  File "/opt/seagate/cortx/hare/lib/python3.6/site-packages/requests/sessions.py", line 542, in get
    return self.request('GET', url, **kwargs)
  File "/opt/seagate/cortx/hare/lib/python3.6/site-packages/requests/sessions.py", line 520, in request
    prep.url, proxies, stream, verify, cert
  File "/opt/seagate/cortx/hare/lib/python3.6/site-packages/requests/sessions.py", line 701, in merge_environment_settings
    env_proxies = get_environ_proxies(url, no_proxy=no_proxy)
  File "/opt/seagate/cortx/hare/lib/python3.6/site-packages/requests/utils.py", line 808, in get_environ_proxies
    return getproxies()
  File "/usr/lib64/python3.6/urllib/request.py", line 2511, in getproxies_environment
    for name, value in os.environ.items():
  File "/usr/lib64/python3.6/_collections_abc.py", line 743, in __iter__
    for key in self._mapping:
  File "/usr/lib64/python3.6/os.py", line 691, in __iter__
    yield self.decodekey(key)
  File "/usr/lib64/python3.6/os.py", line 747, in decode
    return value.decode(encoding, 'surrogateescape')
KeyboardInterrupt
2022-03-16 11:23:00,669 [INFO] Deleting Hare KV entries ([KeyDelete(name='epoch', recurse=False), KeyDelete(name='eq-epoch', recurse=False), KeyDelete(name='last_fidk', recurse=False), KeyDelete(name='leader', recurse=False), KeyDelete(name='m0conf/', recurse=True), KeyDelete(name='processes/', recurse=True), KeyDelete(name='stats/', recurse=True), KeyDelete(name='mkfs/', recurse=True), KeyDelete(name='bytecount/', recurse=True), KeyDelete(name='config_path', recurse=False), KeyDelete(name='failvec', recurse=False), KeyDelete(name='m0_client_types', recurse=True)])
2022-03-16 11:23:00,802 [INFO] Entering get_log_dir at line 652 in file /opt/seagate/cortx/hare/lib/python3.6/site-packages/hare_mp/main.py
2022-03-16 11:23:00,855 [INFO] Leaving get_log_dir
2022-03-16 11:23:00,855 [INFO] Cleaning up hare log directory(/etc/cortx/log/hare/log/72e67d987a314e70b23562171cf1ce55)
rm: cannot remove '/etc/cortx/log/hare/log/72e67d987a314e70b23562171cf1ce55/hare_deployment': Is a directory
2022-03-16 11:23:00,862 [INFO] Entering get_config_dir at line 660 in file /opt/seagate/cortx/hare/lib/python3.6/site-packages/hare_mp/main.py
2022-03-16 11:23:00,915 [INFO] Leaving get_config_dir
2022-03-16 11:23:00,915 [INFO] Cleaning up hare config directory(/etc/cortx/hare/config//72e67d987a314e70b23562171cf1ce55)
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]#
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]#
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]#
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]# consul kv get -recurse bytecount
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]# consul kv get -recurse failvec
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]# consul kv get -recurse config_path
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]# consul kv get -recurse m0_client_types
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]#
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]# consul kv get -recurse facts
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]#
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]# consul kv get -recurse drives
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]#
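The `KeyDelete` entries in the log above drive the cleanup: plain keys are deleted directly, and names ending in `/` are treated as prefixes and removed recursively. A minimal sketch of that loop, using a toy in-memory dict in place of Consul (the `KeyDelete` tuple and `cleanup_kv` helper here are illustrative, not Hare's actual implementation, which goes through the Consul HTTP API):

```python
from collections import namedtuple

# Illustrative stand-in for the KeyDelete entries seen in the cleanup log.
KeyDelete = namedtuple('KeyDelete', ['name', 'recurse'])

HARE_KEYS = [
    KeyDelete(name='epoch', recurse=False),
    KeyDelete(name='bytecount/', recurse=True),
    KeyDelete(name='failvec', recurse=False),
]

def cleanup_kv(store: dict, keys) -> None:
    """Delete each key; a recurse=True entry removes every key under the prefix."""
    for kd in keys:
        if kd.recurse:
            for k in [k for k in store if k.startswith(kd.name)]:
                del store[k]
        else:
            store.pop(kd.name, None)

# Toy KV store standing in for Consul.
kv = {'epoch': '1', 'bytecount/healthy': '0', 'bytecount/damaged': '0',
      'failvec': '', 'facts': 'keep-me'}
cleanup_kv(kv, HARE_KEYS)
print(sorted(kv))  # only keys outside the delete list survive
```

This matches the verification above: after cleanup, `consul kv get -recurse bytecount` (and `failvec`, `config_path`, `m0_client_types`) returns nothing, while unrelated keys are untouched.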

cc: @SwapnilGaonkar7, @mssawant

vaibhavparatwar commented 2 years ago

@mssawant @SwapnilGaonkar7 please review this PR when you get a chance.

supriyachavan4398 commented 2 years ago

Tried a deployment on a 3N setup with the new changes. Docker images:

cortx-docker.colo.seagate.com/seagate/cortx-rgw:2.0.0-5337-custom-ci 
cortx-docker.colo.seagate.com/seagate/cortx-all:2.0.0-5337-custom-ci

Deployment Job at https://eos-jenkins.colo.seagate.com/job/Cortx-Automation/job/RGW/job/setup-cortx-rgw-cluster/941

[root@ssc-vm-g4-rhev4-0710 ~]# kubectl get pods
NAME                                                 READY   STATUS    RESTARTS   AGE
consul-client-flqzm                                  1/1     Running   0          10m
consul-client-jjg9w                                  1/1     Running   0          10m
consul-client-zmw6j                                  1/1     Running   0          9m55s
consul-server-0                                      1/1     Running   0          7m53s
consul-server-1                                      1/1     Running   0          9m4s
consul-server-2                                      1/1     Running   0          10m
cortx-control-7dfffc445b-qx866                       1/1     Running   0          5m11s
cortx-data-ssc-vm-g4-rhev4-0710-84b4498787-xpj89     4/4     Running   0          4m20s
cortx-data-ssc-vm-g4-rhev4-0711-689bf95d44-jwzxn     4/4     Running   0          4m20s
cortx-data-ssc-vm-g4-rhev4-0712-54f7c68b77-tqmvm     4/4     Running   0          4m19s
cortx-ha-864c96f5d8-f4tnb                            3/3     Running   0          110s
cortx-server-ssc-vm-g4-rhev4-0710-5db78899db-rkdhg   2/2     Running   0          2m52s
cortx-server-ssc-vm-g4-rhev4-0711-85465d8d98-fps6b   2/2     Running   0          2m52s
cortx-server-ssc-vm-g4-rhev4-0712-7f8d4fb96d-pq5n7   2/2     Running   0          2m52s
kafka-0                                              1/1     Running   0          6m53s
kafka-1                                              1/1     Running   0          6m53s
kafka-2                                              1/1     Running   0          6m53s
openldap-0                                           1/1     Running   0          10m
openldap-1                                           1/1     Running   0          9m45s
openldap-2                                           1/1     Running   0          9m1s
zookeeper-0                                          1/1     Running   0          8m15s
zookeeper-1                                          1/1     Running   0          8m15s
zookeeper-2                                          1/1     Running   0          8m15s
[root@ssc-vm-g4-rhev4-0710 ~]#
[root@ssc-vm-g4-rhev4-0710 ~]# kubectl exec -it cortx-data-ssc-vm-g4-rhev4-0710-84b4498787-xpj89 -c cortx-hax -- hctl status
Bytecount:
    critical : 0
    damaged : 0
    degraded : 0
    healthy : 0
Data pool:
    # fid name
    0x6f00000000000001:0x93 'storage-set-1__sns'
Profile:
    # fid name: pool(s)
    0x7000000000000001:0xd2 'Profile_the_pool': 'storage-set-1__sns' 'storage-set-1__dix' None
Services:
    cortx-data-headless-svc-ssc-vm-g4-rhev4-0712  (RC)
    [started]  hax                 0x7200000000000001:0x2b         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@22001
    [started]  ioservice           0x7200000000000001:0x2e         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@21001
    [started]  ioservice           0x7200000000000001:0x3b         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@21002
    [started]  confd               0x7200000000000001:0x48         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0712@21003
    cortx-data-headless-svc-ssc-vm-g4-rhev4-0711
    [started]  hax                 0x7200000000000001:0x7          inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@22001
    [started]  ioservice           0x7200000000000001:0xa          inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@21001
    [started]  ioservice           0x7200000000000001:0x17         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@21002
    [started]  confd               0x7200000000000001:0x24         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0711@21003
    cortx-data-headless-svc-ssc-vm-g4-rhev4-0710
    [started]  hax                 0x7200000000000001:0x4f         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@22001
    [started]  ioservice           0x7200000000000001:0x52         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@21001
    [started]  ioservice           0x7200000000000001:0x5f         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@21002
    [started]  confd               0x7200000000000001:0x6c         inet:tcp:cortx-data-headless-svc-ssc-vm-g4-rhev4-0710@21003
    cortx-server-headless-svc-ssc-vm-g4-rhev4-0710
    [started]  hax                 0x7200000000000001:0x71         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0710@22001
    [started]  rgw                 0x7200000000000001:0x74         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0710@21501
    cortx-server-headless-svc-ssc-vm-g4-rhev4-0711
    [started]  hax                 0x7200000000000001:0x79         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0711@22001
    [started]  rgw                 0x7200000000000001:0x7c         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0711@21501
    cortx-server-headless-svc-ssc-vm-g4-rhev4-0712
    [started]  hax                 0x7200000000000001:0x81         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0712@22001
    [started]  rgw                 0x7200000000000001:0x84         inet:tcp:cortx-server-headless-svc-ssc-vm-g4-rhev4-0712@21501
[root@ssc-vm-g4-rhev4-0710 ~]#
[root@ssc-vm-g4-rhev4-0710 ~]# kubectl exec -it cortx-data-ssc-vm-g4-rhev4-0710-84b4498787-xpj89 -c cortx-hax -- /bin/bash
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]#
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]# consul kv get -recurse bytecount
bytecount/critical:0
bytecount/damaged:0
bytecount/degraded:0
bytecount/healthy:0
[root@cortx-data-headless-svc-ssc-vm-g4-rhev4-0710 /]#

Ran IOs on the setup for 1 hour and everything works fine. cc @mssawant, @SwapnilGaonkar7