Seagate / cortx-hare

CORTX Hare configures Motr object store, starts/stops Motr services, and notifies Motr of service and device faults.
https://github.com/Seagate/cortx
Apache License 2.0

CORTX-33705: confd and ios ports are not aligned with cluster #2149

Closed pavankrishnat closed 2 years ago

pavankrishnat commented 2 years ago

Problem: The confd and ios port information in hctl status does not match cluster.conf on either basic Linux or K8s.

On basic Linux, the ios port in hctl status is 21003 instead of 21001 (as per cluster.conf):

    ssc-vm-rhev4-2906.colo.seagate.com  (RC)
    [started]  hax                 0x7200000000000001:0x0          inet:tcp:10.230.240.238@22001
    [started]  confd               0x7200000000000001:0x1          inet:tcp:10.230.240.238@21002
    [started]  ioservice           0x7200000000000001:0x2          inet:tcp:10.230.240.238@21003
    [unknown]  m0_client_other     0x7200000000000001:0x3          inet:tcp:10.230.240.238@22501
    [unknown]  m0_client_other     0x7200000000000001:0x4          inet:tcp:10.230.240.238@22502

On K8s, neither the confd nor the ios port matches cluster.conf:

    cortx-data-g1-1.cortx-data-headless.default.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x0          inet:tcp:cortx-data-g1-1.cortx-data-headless.default.svc.cluster.local@22001
    [started]  ioservice           0x7200000000000001:0x1          inet:tcp:cortx-data-g1-1.cortx-data-headless.default.svc.cluster.local@21001
    [started]  confd               0x7200000000000001:0x2          inet:tcp:cortx-data-g1-1.cortx-data-headless.default.svc.cluster.local@21002

Solution: Modified the ios and confd port handling to fetch the default ports.

For K8s, changing the order in the list is enough to fix this issue. In provisioning\miniprov\hare_mp\cdf.py: m0serverT = ['confd', 'ios']

For basic Linux, the default ports were updated and the rounding-off was removed, so that the actual ports are fetched from cluster.conf.
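To illustrate why the order of `m0serverT` can matter, here is a minimal sketch. It is not the actual `cdf.py` code. It assumes, hypothetically, that each Motr server type gets a port equal to a base port plus its position in the list. `BASE_PORT` and `assign_ports` are names invented for this illustration.

```python
# Hypothetical illustration only: ports assigned by list position.
# BASE_PORT and assign_ports are assumptions, not Hare identifiers.
from typing import Dict, List

BASE_PORT = 21001  # assumed first port for Motr server processes


def assign_ports(server_types: List[str], base: int = BASE_PORT) -> Dict[str, int]:
    """Map each server type to base + its index in the list."""
    return {name: base + idx for idx, name in enumerate(server_types)}


if __name__ == "__main__":
    # Order proposed in this PR.
    print(assign_ports(['confd', 'ios']))
    # Reversed order, for comparison.
    print(assign_ports(['ios', 'confd']))
```

Under this assumption, `['confd', 'ios']` gives confd 21001 and ios 21002, which is what the basic Linux output shows after the change. Reversing the list swaps the two ports.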

After the changes, on basic Linux:

Services:
    ssc-vm-rhev4-2906.colo.seagate.com  (RC)
    [started]  hax                 0x7200000000000001:0x0          inet:tcp:10.230.240.238@22001
    [started]  confd               0x7200000000000001:0x1          inet:tcp:10.230.240.238@21001
    [started]  ioservice           0x7200000000000001:0x2          inet:tcp:10.230.240.238@21002
    [unknown]  m0_client_other     0x7200000000000001:0x3          inet:tcp:10.230.240.238@22501
    [unknown]  m0_client_other     0x7200000000000001:0x4          inet:tcp:10.230.240.238@22502

On K8s:

Services:
    cortx-data-g0-0.cortx-data-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x0          inet:tcp:cortx-data-g0-0.cortx-data-headless.cortx.svc.cluster.local@22001
    [started]  ioservice           0x7200000000000001:0x1          inet:tcp:cortx-data-g0-0.cortx-data-headless.cortx.svc.cluster.local@21001
    [started]  confd               0x7200000000000001:0x2          inet:tcp:cortx-data-g0-0.cortx-data-headless.cortx.svc.cluster.local@21002
    cortx-data-g0-1.cortx-data-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x3          inet:tcp:cortx-data-g0-1.cortx-data-headless.cortx.svc.cluster.local@22001
    [started]  ioservice           0x7200000000000001:0x4          inet:tcp:cortx-data-g0-1.cortx-data-headless.cortx.svc.cluster.local@21001
    [started]  confd               0x7200000000000001:0x5          inet:tcp:cortx-data-g0-1.cortx-data-headless.cortx.svc.cluster.local@21002
    cortx-data-g0-2.cortx-data-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x6          inet:tcp:cortx-data-g0-2.cortx-data-headless.cortx.svc.cluster.local@22001
    [started]  ioservice           0x7200000000000001:0x7          inet:tcp:cortx-data-g0-2.cortx-data-headless.cortx.svc.cluster.local@21001
    [started]  confd               0x7200000000000001:0x8          inet:tcp:cortx-data-g0-2.cortx-data-headless.cortx.svc.cluster.local@21002
    cortx-data-g1-0.cortx-data-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x9          inet:tcp:cortx-data-g1-0.cortx-data-headless.cortx.svc.cluster.local@22001
    [started]  ioservice           0x7200000000000001:0xa          inet:tcp:cortx-data-g1-0.cortx-data-headless.cortx.svc.cluster.local@21001
    [started]  confd               0x7200000000000001:0xb          inet:tcp:cortx-data-g1-0.cortx-data-headless.cortx.svc.cluster.local@21002
    cortx-data-g1-1.cortx-data-headless.cortx.svc.cluster.local  (RC)
    [started]  hax                 0x7200000000000001:0xc          inet:tcp:cortx-data-g1-1.cortx-data-headless.cortx.svc.cluster.local@22001
    [started]  ioservice           0x7200000000000001:0xd          inet:tcp:cortx-data-g1-1.cortx-data-headless.cortx.svc.cluster.local@21001
    [started]  confd               0x7200000000000001:0xe          inet:tcp:cortx-data-g1-1.cortx-data-headless.cortx.svc.cluster.local@21002
    cortx-data-g1-2.cortx-data-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0xf          inet:tcp:cortx-data-g1-2.cortx-data-headless.cortx.svc.cluster.local@22001
    [started]  ioservice           0x7200000000000001:0x10         inet:tcp:cortx-data-g1-2.cortx-data-headless.cortx.svc.cluster.local@21001
    [started]  confd               0x7200000000000001:0x11         inet:tcp:cortx-data-g1-2.cortx-data-headless.cortx.svc.cluster.local@21002
    cortx-server-0.cortx-server-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x12         inet:tcp:cortx-server-0.cortx-server-headless.cortx.svc.cluster.local@22001
    [started]  rgw_s3              0x7200000000000001:0x13         inet:tcp:cortx-server-0.cortx-server-headless.cortx.svc.cluster.local@22501
    cortx-server-1.cortx-server-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x14         inet:tcp:cortx-server-1.cortx-server-headless.cortx.svc.cluster.local@22001
    [started]  rgw_s3              0x7200000000000001:0x15         inet:tcp:cortx-server-1.cortx-server-headless.cortx.svc.cluster.local@22501
    cortx-server-2.cortx-server-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x16         inet:tcp:cortx-server-2.cortx-server-headless.cortx.svc.cluster.local@22001
    [started]  rgw_s3              0x7200000000000001:0x17         inet:tcp:cortx-server-2.cortx-server-headless.cortx.svc.cluster.local@22501
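The before and after listings can also be compared mechanically. The following is a minimal sketch that extracts service names and ports from `hctl status` style lines and reports any that differ from an expected mapping. The regular expression is based only on the line format shown in this PR. `parse_ports` and `find_mismatches` are hypothetical helpers, and the expected ports are supplied by the caller rather than read from cluster.conf.

```python
import re
from typing import Dict, List, Tuple

# Matches service lines such as:
#   [started]  confd   0x7200000000000001:0x1   inet:tcp:10.230.240.238@21001
# Host-name lines (with or without "(RC)") do not match and are skipped.
STATUS_LINE = re.compile(
    r"^\s*\[(?P<state>\w+)\]\s+(?P<name>\S+)\s+"
    r"(?P<fid>0x[0-9a-f]+:0x[0-9a-f]+)\s+"
    r"(?P<endpoint>\S+)@(?P<port>\d+)\s*$"
)


def parse_ports(status_text: str) -> List[Tuple[str, int]]:
    """Return (service name, port) for every service line in the text."""
    result = []
    for line in status_text.splitlines():
        m = STATUS_LINE.match(line)
        if m:
            result.append((m.group("name"), int(m.group("port"))))
    return result


def find_mismatches(status_text: str,
                    expected: Dict[str, int]) -> List[Tuple[str, int, int]]:
    """Return (name, actual port, expected port) for each mismatch.

    Services that are not listed in `expected` are ignored.
    """
    return [(name, port, expected[name])
            for name, port in parse_ports(status_text)
            if name in expected and port != expected[name]]


if __name__ == "__main__":
    # Lines copied from the basic Linux listing before the change.
    before = """
    ssc-vm-rhev4-2906.colo.seagate.com  (RC)
    [started]  hax        0x7200000000000001:0x0  inet:tcp:10.230.240.238@22001
    [started]  confd      0x7200000000000001:0x1  inet:tcp:10.230.240.238@21002
    [started]  ioservice  0x7200000000000001:0x2  inet:tcp:10.230.240.238@21003
    """
    # Illustrative expectation, taken from the basic Linux listing after the change.
    expected = {"confd": 21001, "ioservice": 21002}
    for name, actual, wanted in find_mismatches(before, expected):
        print(f"{name}: got {actual}, expected {wanted}")
```

Passing the expected mapping in as a parameter keeps the check independent of how cluster.conf is stored, which this thread does not show.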

Signed-off-by: pavankrishnat <pavan.k.thunuguntla@seagate.com>

vaibhavparatwar commented 2 years ago

@pavankrishnat can you add problem statement in the commit message?

mssawant commented 2 years ago

@pavankrishnat, I am not exactly able to understand what was the problem looking at the commit message and changes. Changes include updating the default ports and avoid rounding off. But how this fixes the invalid value read from conf-store is not clear. Please elaborate more about the problem and solution in the commit message.

pavankrishnat commented 2 years ago

> @pavankrishnat, I am not exactly able to understand what was the problem looking at the commit message and changes. Changes include updating the default ports and avoid rounding off. But how this fixes the invalid value read from conf-store is not clear. Please elaborate more about the problem and solution in the commit message.

Updated

pavankrishnat commented 2 years ago

retest this please

vaibhavparatwar commented 2 years ago

@mssawant could you please review this and merge if all looks good?

mssawant commented 2 years ago

@pavankrishnat, the PR description looks good. Please also update the commit message accordingly; I don't see the commit message aligned with the PR description.

pavankrishnat commented 2 years ago

Please update the commit message to align with the PR description. Thank you.

Updated the commit message.