Closed 1468ca0b-2a64-4fb4-8e52-ea5806644b4c closed 5 years ago
Created by: vvv
👍
Created by: vvv
Move this definition to the bottom of the file, where most of the functions are, or to line 851, under the BootLevel definitions.
Created by: vvv
[optional] The assortment of (>>=), let ... in and do looks untidy.
Suggestion:
-- | Pick a Principal RM out of the available RM services.
pickPrincipalRM :: PhaseM RC l (Maybe M0.Service)
pickPrincipalRM = do
    rg <- getGraph
    let rms = [ svc
              | proc <- M0.getM0Processes rg
              , G.isConnected proc Is M0.PSOnline rg
              , let svcTypes = M0.s_type <$> G.connectedTo proc M0.IsParentOf rg
              , CST_CONFD `elem` svcTypes
              , svc :: M0.Service <- G.connectedTo proc M0.IsParentOf rg
              , M0.s_type svc == CST_RMS
              ]
    Log.rcLog' Log.DEBUG $ "available RM services: " ++ show rms
    traverse setPrincipalRMIfUnset (listToMaybe rms)
AFAIU, the elements of rms may have different ServiceState. Shouldn't we improve the implementation so that it tries to find an M0.SSOnline RM service?
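A minimal sketch of the idea, using hypothetical stand-in types rather than the real M0.Service and M0.ServiceState: prefer an online candidate, and fall back to the first candidate otherwise. The fallback is an assumption; whether it is wanted depends on whether a reply must always name some RM.

```haskell
import Control.Applicative ((<|>))
import Data.List (find)
import Data.Maybe (listToMaybe)

-- Hypothetical stand-ins for M0.ServiceState and M0.Service.
data ServiceState = SSOffline | SSStarting | SSOnline
  deriving (Eq, Show)

data Service = Service
  { svcName  :: String
  , svcState :: ServiceState
  } deriving (Eq, Show)

-- | Prefer the first online RM service; if none is online,
-- fall back to the first candidate (assumed behaviour).
preferOnline :: [Service] -> Maybe Service
preferOnline rms =
  find ((== SSOnline) . svcState) rms <|> listToMaybe rms
```

With this shape, the list comprehension stays unchanged and only the final selection step differs from listToMaybe.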
Created by: vvv
most of the other functions in this file are monadic, so it will break the style pattern.
That g suffix still feels like an eyesore.
👇 How about this?
principalRM :: G.Graph -> Maybe M0.Service
principalRM rg = case G.connectedFrom Is M0.PrincipalRM rg of
    Just svc | M0.getState svc rg == M0.SSOnline -> Just svc
    _ -> Nothing
Created by: andriytk
Ok.
Created by: andriytk
Yes, because Entrypoint Reply should always reply, even when the cluster is not fully booted yet, and it should contain some RM in the reply in any case, as far as I understand from how it used to be.
(Thanks for the note about the parentheses; removed them.)
Created by: andriytk
The "good" is meant in this context only. Good to try to get the Online RM service.
Anyway, will change it as you suggest.
Created by: andriytk
I tried the 1st variant before, but without the rg argument (pointfree style). It did not work for some reason, so I gave up on it. :)
Done.
Created by: andriytk
Ok.
Created by: andriytk
Ok.
Created by: andriytk
Ok.
Created by: andriytk
The thing is - most of the other functions in this file are monadic, so it will break the style pattern.
Created by: andriytk
Getting nothing is possible. :) Anyway, will do as you suggest.
Created by: vvv
The name of the function is misleading, because you are not returning BootLevel.
AFAICS, getM0* functions return resource objects and let the user unwrap them.
Consider replacing it with getM0BootLevelValue.
getM0BootLevelValue :: G.Graph -> Maybe Int
getM0BootLevelValue = fmap M0.unBootLevel . G.connectedTo Cluster M0.RunLevel
Created by: vvv
1) You don't need parentheses around Just svc.
2) not (isGoodBootLevel g) || Just svc == getPrincipalRMg g means “BootLevel is not connected to Cluster || BootLevel 0 is connected to Cluster || svc is PrincipalRM”.
Is this what you need? Do you want the resulting list to contain parameters of RM services that haven't started yet?
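To make the three cases concrete, here is a small sketch with hypothetical stand-ins: the cluster's boot level is modelled as Maybe Int (Nothing when no level is connected), services are plain strings, and isGoodBootLevel is assumed to mean "level greater than 0". None of these names are the real Halon definitions.

```haskell
-- Assumed meaning of the predicate: a level is attached and it is above 0.
isGoodBootLevel :: Maybe Int -> Bool
isGoodBootLevel = maybe False (> 0)

-- | The condition under discussion, over stand-in types:
-- cluster boot level, current principal RM, candidate service.
keepService :: Maybe Int -> Maybe String -> String -> Bool
keepService level principal svc =
  not (isGoodBootLevel level) || Just svc == principal

-- The four interesting inputs, paired with the result of the condition.
cases :: [(String, Bool)]
cases =
  [ ("no level connected",        keepService Nothing  Nothing      "rm")
  , ("level 0",                   keepService (Just 0) (Just "x")   "rm")
  , ("level 1, svc is principal", keepService (Just 1) (Just "rm")  "rm")
  , ("level 1, svc is not",       keepService (Just 1) (Just "x")   "rm")
  ]
```

Under these assumptions every RM service passes the filter until the level rises above 0, which is the behaviour the question is about.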
Created by: vvv
There is nothing good or bad about boot level.
-- | Process boot level.
-- This is used both to tag processes (to indicate when they should start/stop)
-- and to tag the cluster (to indicate which processes it's valid to try to
-- start/stop).
-- Given a cluster run level of x, it is valid to start a process with a
-- boot level of <= x. So at level 0 we may start confd processes, at level
-- 1 we may start IOS etc as well as confd processes.
-- Currently:
-- * 0 - confd
-- * 1 - other
newtype BootLevel = BootLevel { unBootLevel :: Int }
  deriving (Eq, Ord, Show, Generic, Hashable, Typeable, FromJSON, ToJSON)
I would suggest renaming this function to confdsHaveStarted, or rmsHaveStarted, or principalRMisElected.
rmsHaveStarted :: G.Graph -> Bool
rmsHaveStarted = maybe False ((> 0) . M0.unBootLevel) . getM0BootLevel
Created by: vvv
getM0BootLevel rg = unBootLevel <$> G.connectedTo R.Cluster RunLevel rg
or
getM0BootLevel = fmap unBootLevel . G.connectedTo R.Cluster RunLevel
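The two spellings are equivalent. A self-contained sketch, assuming a hypothetical Graph record whose runLevel field stands in for G.connectedTo R.Cluster RunLevel, and a BootLevel newtype shaped like the one quoted earlier in the thread:

```haskell
newtype BootLevel = BootLevel { unBootLevel :: Int }

-- Hypothetical stand-in for G.Graph; runLevel plays the role of
-- G.connectedTo R.Cluster RunLevel.
newtype Graph = Graph { runLevel :: Maybe BootLevel }

-- | Pointful form: the graph argument is named explicitly.
getLevelPointful :: Graph -> Maybe Int
getLevelPointful rg = unBootLevel <$> runLevel rg

-- | Point-free form: the same function as a composition.
getLevelPointFree :: Graph -> Maybe Int
getLevelPointFree = fmap unBootLevel . runLevel
```

Both definitions carry an explicit type signature here, so either form type-checks against the stand-in types.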
Created by: vvv
s/RMS/RM services/
Created by: vvv
s/RMS/PrincipalRM/
Created by: vvv
getPrincipalRMM = getPrincipalRM <$> getGraph
Created by: vvv
This g suffix is weird. The convention is to use an M suffix for a monadic function and no suffix for a pure function.
Suggestion: rename getPrincipalRM -> getPrincipalRMM, getPrincipalRMg -> getPrincipalRM.
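A sketch of that naming convention, with hypothetical stand-ins: Graph holds only the principal RM as a string, and PhaseM is reduced to a reader-like wrapper with just a Functor instance, which is all that <$> needs here. The real PhaseM and G.Graph are richer.

```haskell
-- Hypothetical stand-in for G.Graph.
newtype Graph = Graph { principal :: Maybe String }

-- Hypothetical, reader-like stand-in for PhaseM.
newtype PhaseM a = PhaseM { runPhaseM :: Graph -> a }

instance Functor PhaseM where
  fmap f (PhaseM g) = PhaseM (f . g)

-- | Access the current graph from inside the stand-in monad.
getGraph :: PhaseM Graph
getGraph = PhaseM id

-- | Pure function: no suffix.
getPrincipalRM :: Graph -> Maybe String
getPrincipalRM = principal

-- | Monadic function: M suffix, defined in terms of the pure one.
getPrincipalRMM :: PhaseM (Maybe String)
getPrincipalRMM = getPrincipalRM <$> getGraph
```

Keeping the logic in the pure function lets callers that already hold a graph reuse it without going through the monad.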
Created by: andriytk
I mentioned above that the second update of the RM, when the node comes back Online, has a bad effect on the cluster (Mero processes crash, see MERO-2876). But, as it turned out, the crashes happen even when the RM is not updated a second time (after my latest patch).
Created by: vvv
[optional] s/got // (because you may "get" Nothing)
Created by: andriytk
I've noticed that the principal RM is updated again when the crashed m0d becomes alive. This has a strange effect on the cluster: many m0d processes are restarted and clients may fail altogether. (See console log below.) So we should probably do something about this before landing the patch. I don't see why the second update of the principal RM is necessary, so maybe we should implement some logic to avoid it.
Console log:
16:16 vagrant@cmu:halon$
16:16 vagrant@cmu:halon$ hctl mero bootstrap
...
16:19 vagrant@cmu:halon$ hctl mero status | more
Cluster disposition: ONLINE
cluster info:
SNS pool: 0x6f00000000000001:0xd8 "default"
DIX pool: 0x6f00000000000001:0x12d
profile: 0x7000000000000001:0x153
Hosts:
[ online] 0x6e00000000000001:0xe client1
[ online] 0x7200000000000001:0xf 172.28.128.13@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x10 CST_HA
[ online] 0x7300000000000001:0x11 CST_RMS
[ online] 0x7200000000000001:0x12 172.28.128.13@tcp:12345:41:301 m0t1fs
[ online] 0x7300000000000001:0x13 CST_RMS
[ online] 0x6e00000000000001:0x14 cmu
[ online] 0x7200000000000001:0x15 172.28.128.5@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x16 CST_HA
[ online] 0x7300000000000001:0x17 CST_RMS
[ N/A] 0x7200000000000001:0x18 172.28.128.5@tcp:12345:41:302 clovis-app
[ N/A] 0x7300000000000001:0x19 CST_RMS
[ N/A] 0x7200000000000001:0x1a 172.28.128.5@tcp:12345:41:303 clovis-app
[ N/A] 0x7300000000000001:0x1b CST_RMS
[ N/A] 0x7200000000000001:0x1c 172.28.128.5@tcp:12345:41:304 clovis-app
[ N/A] 0x7300000000000001:0x1d CST_RMS
[ N/A] 0x7200000000000001:0x1e 172.28.128.5@tcp:12345:41:305 clovis-app
[ N/A] 0x7300000000000001:0x1f CST_RMS
[ online] 0x6e00000000000001:0x20 ssu1
[ online] 0x7200000000000001:0x2e 172.28.128.3@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x2f CST_HA
[ online] 0x7300000000000001:0x30 CST_RMS
[ online] 0x7200000000000001:0x31 172.28.128.3@tcp:12345:44:101 confd
[ online] 0x7300000000000001:0x32 CST_CONFD
[ online] 0x7300000000000001:0x33 CST_RMS
[ online] 0x7200000000000001:0x34 172.28.128.3@tcp:12345:41:401 ioservice
[ online] 0x7300000000000001:0x35 CST_RMS
[ online] 0x7300000000000001:0x36 CST_IOS
[ online] 0x7300000000000001:0x37 CST_SNS_REP
[ online] 0x7300000000000001:0x38 CST_SNS_REB
[ online] 0x7300000000000001:0x39 CST_ADDB2
[ online] 0x7300000000000001:0x3a CST_CAS
[ online] 0x7300000000000001:0x3b CST_ISCS
[ online] 0x6e00000000000001:0x3c ssu2
[ online] 0x7200000000000001:0x4a 172.28.128.8@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x4b CST_HA
[ online] 0x7300000000000001:0x4c CST_RMS
[ online] 0x7200000000000001:0x4d 172.28.128.8@tcp:12345:44:101 confd
[ online] 0x7300000000000001:0x4e CST_CONFD
[ online] 0x7300000000000001:0x4f CST_RMS
[ online] 0x7200000000000001:0x50 172.28.128.8@tcp:12345:41:401 ioservice
[ online] 0x7300000000000001:0x51 CST_RMS
[ online] 0x7300000000000001:0x52 CST_IOS
[ online] 0x7300000000000001:0x53 CST_SNS_REP
[ online] 0x7300000000000001:0x54 CST_SNS_REB
[ online] 0x7300000000000001:0x55 CST_ADDB2
[ online] 0x7300000000000001:0x56 CST_CAS
[ online] 0x7300000000000001:0x57 CST_ISCS
[ online] 0x6e00000000000001:0x58 ssu3
[ online] 0x7200000000000001:0x66 172.28.128.7@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x67 CST_HA
[ online] 0x7300000000000001:0x68 CST_RMS
[ online] 0x7200000000000001:0x69 172.28.128.7@tcp:12345:44:101 confd
[ online] 0x7300000000000001:0x6a CST_CONFD
[ online] 0x7300000000000001:0x6b CST_RMS
[ online] 0x7200000000000001:0x6c 172.28.128.7@tcp:12345:41:401 ioservice
[ online] 0x7300000000000001:0x6d CST_RMS
[ online] 0x7300000000000001:0x6e CST_IOS
[ online] 0x7300000000000001:0x6f CST_SNS_REP
[ online] 0x7300000000000001:0x70 CST_SNS_REB
[ online] 0x7300000000000001:0x71 CST_ADDB2
[ online] 0x7300000000000001:0x72 CST_CAS
[ online] 0x7300000000000001:0x73 CST_ISCS
[ online] 0x6e00000000000001:0x74 ssu4
[ online] 0x7200000000000001:0x82 172.28.128.10@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x83 CST_HA
[ online] 0x7300000000000001:0x84 CST_RMS
[ online] 0x7200000000000001:0x85 172.28.128.10@tcp:12345:41:401 ioservice
[ online] 0x7300000000000001:0x86 CST_RMS
[ online] 0x7300000000000001:0x87 CST_IOS
[ online] 0x7300000000000001:0x88 CST_SNS_REP
[ online] 0x7300000000000001:0x89 CST_SNS_REB
[ online] 0x7300000000000001:0x8a CST_ADDB2
[ online] 0x7300000000000001:0x8b CST_CAS
[ online] 0x7300000000000001:0x8c CST_ISCS
[ online] 0x6e00000000000001:0x8d ssu5
[ online] 0x7200000000000001:0x9b 172.28.128.11@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x9c CST_HA
[ online] 0x7300000000000001:0x9d CST_RMS
[ online] 0x7200000000000001:0x9e 172.28.128.11@tcp:12345:41:401 ioservice
[ online] 0x7300000000000001:0x9f CST_RMS
[ online] 0x7300000000000001:0xa0 CST_IOS
[ online] 0x7300000000000001:0xa1 CST_SNS_REP
[ online] 0x7300000000000001:0xa2 CST_SNS_REB
[ online] 0x7300000000000001:0xa3 CST_ADDB2
[ online] 0x7300000000000001:0xa4 CST_CAS
[ online] 0x7300000000000001:0xa5 CST_ISCS
[ online] 0x6e00000000000001:0xa6 ssu6
[ online] 0x7200000000000001:0xb4 172.28.128.12@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0xb5 CST_HA
[ online] 0x7300000000000001:0xb6 CST_RMS
[ online] 0x7200000000000001:0xb7 172.28.128.12@tcp:12345:41:401 ioservice
[ online] 0x7300000000000001:0xb8 CST_RMS
[ online] 0x7300000000000001:0xb9 CST_IOS
[ online] 0x7300000000000001:0xba CST_SNS_REP
[ online] 0x7300000000000001:0xbb CST_SNS_REB
[ online] 0x7300000000000001:0xbc CST_ADDB2
[ online] 0x7300000000000001:0xbd CST_CAS
[ online] 0x7300000000000001:0xbe CST_ISCS
[ online] 0x6e00000000000001:0xbf ssu7
[ online] 0x7200000000000001:0xcd 172.28.128.9@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0xce CST_HA
[ online] 0x7300000000000001:0xcf CST_RMS
[ online] 0x7200000000000001:0xd0 172.28.128.9@tcp:12345:41:401 ioservice
[ online] 0x7300000000000001:0xd1 CST_RMS
[ online] 0x7300000000000001:0xd2 CST_IOS
[ online] 0x7300000000000001:0xd3 CST_SNS_REP
[ online] 0x7300000000000001:0xd4 CST_SNS_REB
[ online] 0x7300000000000001:0xd5 CST_ADDB2
[ online] 0x7300000000000001:0xd6 CST_CAS
[ online] 0x7300000000000001:0xd7 CST_ISCS
16:20 vagrant@cmu:halon$ ssh ssu2.local 'sudo pkill -9 halond; sudo systemctl stop halond; sudo pkill -9 m0d'
16:20 vagrant@cmu:halon$ pdsh -w cmu.local,ssu[1-7].local,client1.local sudo journalctl -u halond --since today | grep entry | sort -k 4 | tail -1
client1: Jan 31 16:18:53 client1 halond[7984]: Thu Jan 31 16:18:53 UTC 2019 pid://172.28.128.13:9070:0:131: ha_entrypoint: succeeded: SpielAddress {sa_confds_fid = [0x7300000000000001:0x4e,0x7300000000000001:0x6a,0x7300000000000001:0x32], sa_confds_ep = ["172.28.128.8@tcp:12345:44:101","172.28.128.7@tcp:12345:44:101","172.28.128.3@tcp:12345:44:101"], sa_rm_fid = 0x7300000000000001:0x4f, sa_rm_ep = "172.28.128.8@tcp:12345:44:101", sa_quorum = 2}
16:20 vagrant@cmu:halon$
16:20 vagrant@cmu:halon$
16:20 vagrant@cmu:halon$ pdsh -w cmu.local,ssu[1-7].local,client1.local sudo journalctl -u halond --since today | grep entry | sort -k 4 | tail -1
client1: Jan 31 16:18:53 client1 halond[7984]: Thu Jan 31 16:18:53 UTC 2019 pid://172.28.128.13:9070:0:131: ha_entrypoint: succeeded: SpielAddress {sa_confds_fid = [0x7300000000000001:0x4e,0x7300000000000001:0x6a,0x7300000000000001:0x32], sa_confds_ep = ["172.28.128.8@tcp:12345:44:101","172.28.128.7@tcp:12345:44:101","172.28.128.3@tcp:12345:44:101"], sa_rm_fid = 0x7300000000000001:0x4f, sa_rm_ep = "172.28.128.8@tcp:12345:44:101", sa_quorum = 2}
16:20 vagrant@cmu:halon$
16:20 vagrant@cmu:halon$
16:20 vagrant@cmu:halon$ hctl mero status | more
Cluster disposition: ONLINE
cluster info:
SNS pool: 0x6f00000000000001:0xd8 "default"
DIX pool: 0x6f00000000000001:0x12d
profile: 0x7000000000000001:0x153
Hosts:
[ online] 0x6e00000000000001:0xe client1
[ online] 0x7200000000000001:0xf 172.28.128.13@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x10 CST_HA
[ online] 0x7300000000000001:0x11 CST_RMS
[ online] 0x7200000000000001:0x12 172.28.128.13@tcp:12345:41:301 m0t1fs
[ online] 0x7300000000000001:0x13 CST_RMS
[ online] 0x6e00000000000001:0x14 cmu
[ online] 0x7200000000000001:0x15 172.28.128.5@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x16 CST_HA
[ online] 0x7300000000000001:0x17 CST_RMS
[ N/A] 0x7200000000000001:0x18 172.28.128.5@tcp:12345:41:302 clovis-app
[ N/A] 0x7300000000000001:0x19 CST_RMS
[ N/A] 0x7200000000000001:0x1a 172.28.128.5@tcp:12345:41:303 clovis-app
[ N/A] 0x7300000000000001:0x1b CST_RMS
[ N/A] 0x7200000000000001:0x1c 172.28.128.5@tcp:12345:41:304 clovis-app
[ N/A] 0x7300000000000001:0x1d CST_RMS
[ N/A] 0x7200000000000001:0x1e 172.28.128.5@tcp:12345:41:305 clovis-app
[ N/A] 0x7300000000000001:0x1f CST_RMS
[ online] 0x6e00000000000001:0x20 ssu1
[ online] 0x7200000000000001:0x2e 172.28.128.3@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x2f CST_HA
[ online] 0x7300000000000001:0x30 CST_RMS
[ online] 0x7200000000000001:0x31 172.28.128.3@tcp:12345:44:101 confd
[ online] 0x7300000000000001:0x32 CST_CONFD
[ online] 0x7300000000000001:0x33 CST_RMS
[ online] 0x7200000000000001:0x34 172.28.128.3@tcp:12345:41:401 ioservice
[ online] 0x7300000000000001:0x35 CST_RMS
[ online] 0x7300000000000001:0x36 CST_IOS
[ online] 0x7300000000000001:0x37 CST_SNS_REP
[ online] 0x7300000000000001:0x38 CST_SNS_REB
[ online] 0x7300000000000001:0x39 CST_ADDB2
[ online] 0x7300000000000001:0x3a CST_CAS
[ online] 0x7300000000000001:0x3b CST_ISCS
[ failed] 0x6e00000000000001:0x3c ssu2
Extended state: failed(recoverable)
[inhibited] 0x7200000000000001:0x4a 172.28.128.8@tcp:12345:34:101 halon
Extended state: inhibited (online)
[inhibited] 0x7300000000000001:0x4b CST_HA
Extended state: inhibited (online)
[inhibited] 0x7300000000000001:0x4c CST_RMS
Extended state: inhibited (online)
[inhibited] 0x7200000000000001:0x4d 172.28.128.8@tcp:12345:44:101 confd
Extended state: inhibited (online)
[inhibited] 0x7300000000000001:0x4e CST_CONFD
Extended state: inhibited (online)
[inhibited] 0x7300000000000001:0x4f CST_RMS
Extended state: inhibited (online)
[inhibited] 0x7200000000000001:0x50 172.28.128.8@tcp:12345:41:401 ioservice
Extended state: inhibited (online)
[inhibited] 0x7300000000000001:0x51 CST_RMS
Extended state: inhibited (online)
[inhibited] 0x7300000000000001:0x52 CST_IOS
Extended state: inhibited (online)
[inhibited] 0x7300000000000001:0x53 CST_SNS_REP
Extended state: inhibited (online)
[inhibited] 0x7300000000000001:0x54 CST_SNS_REB
Extended state: inhibited (online)
[inhibited] 0x7300000000000001:0x55 CST_ADDB2
Extended state: inhibited (online)
[inhibited] 0x7300000000000001:0x56 CST_CAS
Extended state: inhibited (online)
[inhibited] 0x7300000000000001:0x57 CST_ISCS
Extended state: inhibited (online)
[ online] 0x6e00000000000001:0x58 ssu3
[ online] 0x7200000000000001:0x66 172.28.128.7@tcp:12345:34:101 halon
16:21 vagrant@cmu:halon$ pdsh -w cmu.local,ssu[1-7].local,client1.local sudo journalctl -u halond --since today | grep entry | sort -k 4 | tail -1
ssu4: Jan 31 16:21:16 ssu4 halond[7991]: Thu Jan 31 16:21:16 UTC 2019 pid://172.28.128.10:9070:0:381: ha_entrypoint: succeeded: SpielAddress {sa_confds_fid = [0x7300000000000001:0x4e,0x7300000000000001:0x6a,0x7300000000000001:0x32], sa_confds_ep = ["172.28.128.8@tcp:12345:44:101","172.28.128.7@tcp:12345:44:101","172.28.128.3@tcp:12345:44:101"], sa_rm_fid = 0x7300000000000001:0x6b, sa_rm_ep = "172.28.128.7@tcp:12345:44:101", sa_quorum = 2}
16:21 vagrant@cmu:halon$ ssh ssu2.local sudo systemctl restart halond
16:22 vagrant@cmu:halon$
16:22 vagrant@cmu:halon$
16:22 vagrant@cmu:halon$ hctl mero status | more
Cluster disposition: ONLINE
cluster info:
SNS pool: 0x6f00000000000001:0xd8 "default"
DIX pool: 0x6f00000000000001:0x12d
profile: 0x7000000000000001:0x153
Hosts:
[ failed] 0x6e00000000000001:0xe client1
Extended state: failed(recoverable)
[inhibited] 0x7200000000000001:0xf 172.28.128.13@tcp:12345:34:101 halon
Extended state: inhibited (online)
[inhibited] 0x7300000000000001:0x10 CST_HA
Extended state: inhibited (online)
[inhibited] 0x7300000000000001:0x11 CST_RMS
Extended state: inhibited (online)
[inhibited] 0x7200000000000001:0x12 172.28.128.13@tcp:12345:41:301 m0t1fs
Extended state: inhibited (online)
[inhibited] 0x7300000000000001:0x13 CST_RMS
Extended state: inhibited (online)
[ online] 0x6e00000000000001:0x14 cmu
[ online] 0x7200000000000001:0x15 172.28.128.5@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x16 CST_HA
[ online] 0x7300000000000001:0x17 CST_RMS
[ N/A] 0x7200000000000001:0x18 172.28.128.5@tcp:12345:41:302 clovis-app
[ N/A] 0x7300000000000001:0x19 CST_RMS
[ N/A] 0x7200000000000001:0x1a 172.28.128.5@tcp:12345:41:303 clovis-app
[ N/A] 0x7300000000000001:0x1b CST_RMS
[ N/A] 0x7200000000000001:0x1c 172.28.128.5@tcp:12345:41:304 clovis-app
[ N/A] 0x7300000000000001:0x1d CST_RMS
[ N/A] 0x7200000000000001:0x1e 172.28.128.5@tcp:12345:41:305 clovis-app
[ N/A] 0x7300000000000001:0x1f CST_RMS
[ online] 0x6e00000000000001:0x20 ssu1
[ online] 0x7200000000000001:0x2e 172.28.128.3@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x2f CST_HA
[ online] 0x7300000000000001:0x30 CST_RMS
[ online] 0x7200000000000001:0x31 172.28.128.3@tcp:12345:44:101 confd
[ online] 0x7300000000000001:0x32 CST_CONFD
[ online] 0x7300000000000001:0x33 CST_RMS
[quiescing] 0x7200000000000001:0x34 172.28.128.3@tcp:12345:41:401 ioservice
[inhibited] 0x7300000000000001:0x35 CST_RMS
Extended state: inhibited (starting)
[inhibited] 0x7300000000000001:0x36 CST_IOS
Extended state: inhibited (starting)
[inhibited] 0x7300000000000001:0x37 CST_SNS_REP
Extended state: inhibited (starting)
[inhibited] 0x7300000000000001:0x38 CST_SNS_REB
Extended state: inhibited (starting)
[inhibited] 0x7300000000000001:0x39 CST_ADDB2
Extended state: inhibited (starting)
[inhibited] 0x7300000000000001:0x3a CST_CAS
Extended state: inhibited (starting)
[inhibited] 0x7300000000000001:0x3b CST_ISCS
Extended state: inhibited (starting)
[ online] 0x6e00000000000001:0x3c ssu2
[ online] 0x7200000000000001:0x4a 172.28.128.8@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x4b CST_HA
[ online] 0x7300000000000001:0x4c CST_RMS
[ online] 0x7200000000000001:0x4d 172.28.128.8@tcp:12345:44:101 confd
[ online] 0x7300000000000001:0x4e CST_CONFD
[ online] 0x7300000000000001:0x4f CST_RMS
[ starting] 0x7200000000000001:0x50 172.28.128.8@tcp:12345:41:401 ioservice
[ starting] 0x7300000000000001:0x51 CST_RMS
[ starting] 0x7300000000000001:0x52 CST_IOS
[ starting] 0x7300000000000001:0x53 CST_SNS_REP
[ starting] 0x7300000000000001:0x54 CST_SNS_REB
[ starting] 0x7300000000000001:0x55 CST_ADDB2
[ starting] 0x7300000000000001:0x56 CST_CAS
[ starting] 0x7300000000000001:0x57 CST_ISCS
[ online] 0x6e00000000000001:0x58 ssu3
[ online] 0x7200000000000001:0x66 172.28.128.7@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x67 CST_HA
[ online] 0x7300000000000001:0x68 CST_RMS
[ online] 0x7200000000000001:0x69 172.28.128.7@tcp:12345:44:101 confd
[ online] 0x7300000000000001:0x6a CST_CONFD
[ online] 0x7300000000000001:0x6b CST_RMS
[ starting] 0x7200000000000001:0x6c 172.28.128.7@tcp:12345:41:401 ioservice
[ starting] 0x7300000000000001:0x6d CST_RMS
[ starting] 0x7300000000000001:0x6e CST_IOS
[ starting] 0x7300000000000001:0x6f CST_SNS_REP
[ starting] 0x7300000000000001:0x70 CST_SNS_REB
[ starting] 0x7300000000000001:0x71 CST_ADDB2
[ starting] 0x7300000000000001:0x72 CST_CAS
[ starting] 0x7300000000000001:0x73 CST_ISCS
[ online] 0x6e00000000000001:0x74 ssu4
[ online] 0x7200000000000001:0x82 172.28.128.10@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x83 CST_HA
[ online] 0x7300000000000001:0x84 CST_RMS
[quiescing] 0x7200000000000001:0x85 172.28.128.10@tcp:12345:41:401 ioservice
[inhibited] 0x7300000000000001:0x86 CST_RMS
Extended state: inhibited (starting)
[inhibited] 0x7300000000000001:0x87 CST_IOS
Extended state: inhibited (starting)
[inhibited] 0x7300000000000001:0x88 CST_SNS_REP
Extended state: inhibited (starting)
[inhibited] 0x7300000000000001:0x89 CST_SNS_REB
Extended state: inhibited (starting)
[inhibited] 0x7300000000000001:0x8a CST_ADDB2
Extended state: inhibited (starting)
[inhibited] 0x7300000000000001:0x8b CST_CAS
Extended state: inhibited (starting)
[inhibited] 0x7300000000000001:0x8c CST_ISCS
Extended state: inhibited (starting)
[ online] 0x6e00000000000001:0x8d ssu5
[ online] 0x7200000000000001:0x9b 172.28.128.11@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x9c CST_HA
[ online] 0x7300000000000001:0x9d CST_RMS
[ starting] 0x7200000000000001:0x9e 172.28.128.11@tcp:12345:41:401 ioservice
[ starting] 0x7300000000000001:0x9f CST_RMS
[ starting] 0x7300000000000001:0xa0 CST_IOS
[ starting] 0x7300000000000001:0xa1 CST_SNS_REP
[ starting] 0x7300000000000001:0xa2 CST_SNS_REB
[ starting] 0x7300000000000001:0xa3 CST_ADDB2
[ starting] 0x7300000000000001:0xa4 CST_CAS
[ starting] 0x7300000000000001:0xa5 CST_ISCS
[ online] 0x6e00000000000001:0xa6 ssu6
[ online] 0x7200000000000001:0xb4 172.28.128.12@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0xb5 CST_HA
[ online] 0x7300000000000001:0xb6 CST_RMS
[quiescing] 0x7200000000000001:0xb7 172.28.128.12@tcp:12345:41:401 ioservice
[inhibited] 0x7300000000000001:0xb8 CST_RMS
Extended state: inhibited (starting)
[inhibited] 0x7300000000000001:0xb9 CST_IOS
Extended state: inhibited (starting)
[inhibited] 0x7300000000000001:0xba CST_SNS_REP
Extended state: inhibited (starting)
[inhibited] 0x7300000000000001:0xbb CST_SNS_REB
Extended state: inhibited (starting)
[inhibited] 0x7300000000000001:0xbc CST_ADDB2
Extended state: inhibited (starting)
[inhibited] 0x7300000000000001:0xbd CST_CAS
Extended state: inhibited (starting)
[inhibited] 0x7300000000000001:0xbe CST_ISCS
Extended state: inhibited (starting)
[ online] 0x6e00000000000001:0xbf ssu7
[ online] 0x7200000000000001:0xcd 172.28.128.9@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0xce CST_HA
[ online] 0x7300000000000001:0xcf CST_RMS
[ starting] 0x7200000000000001:0xd0 172.28.128.9@tcp:12345:41:401 ioservice
[ starting] 0x7300000000000001:0xd1 CST_RMS
[ starting] 0x7300000000000001:0xd2 CST_IOS
[ starting] 0x7300000000000001:0xd3 CST_SNS_REP
[ starting] 0x7300000000000001:0xd4 CST_SNS_REB
[ starting] 0x7300000000000001:0xd5 CST_ADDB2
[ starting] 0x7300000000000001:0xd6 CST_CAS
[ starting] 0x7300000000000001:0xd7 CST_ISCS
16:28 vagrant@cmu:halon$ pdsh -w cmu.local,ssu[1-7].local,client1.local sudo journalctl -u halond --since today | grep entry | sort -k 4 | tail -1
ssu5: Jan 31 16:28:03 ssu5 halond[7996]: Thu Jan 31 16:28:03 UTC 2019 pid://172.28.128.11:9070:0:528: ha_entrypoint: succeeded: SpielAddress {sa_confds_fid = [0x7300000000000001:0x4e,0x7300000000000001:0x6a,0x7300000000000001:0x32], sa_confds_ep = ["172.28.128.8@tcp:12345:44:101","172.28.128.7@tcp:12345:44:101","172.28.128.3@tcp:12345:44:101"], sa_rm_fid = 0x7300000000000001:0x4f, sa_rm_ep = "172.28.128.8@tcp:12345:44:101", sa_quorum = 2}
16:28 vagrant@cmu:halon$
16:39 vagrant@cmu:halon$ pdsh -w cmu.local,ssu[1-7].local,client1.local sudo journalctl -u halond --since today | grep entry | sort -k 4 | tail -1
ssu6: Jan 31 16:34:01 ssu6 halond[8024]: Thu Jan 31 16:34:01 UTC 2019 pid://172.28.128.12:9070:0:540: ha_entrypoint: succeeded: SpielAddress {sa_confds_fid = [0x7300000000000001:0x4e,0x7300000000000001:0x6a,0x7300000000000001:0x32], sa_confds_ep = ["172.28.128.8@tcp:12345:44:101","172.28.128.7@tcp:12345:44:101","172.28.128.3@tcp:12345:44:101"], sa_rm_fid = 0x7300000000000001:0x4f, sa_rm_ep = "172.28.128.8@tcp:12345:44:101", sa_quorum = 2}
16:39 vagrant@cmu:halon$
16:39 vagrant@cmu:halon$
16:39 vagrant@cmu:halon$ hctl mero status | more
Cluster disposition: ONLINE
cluster info:
SNS pool: 0x6f00000000000001:0xd8 "default"
DIX pool: 0x6f00000000000001:0x12d
profile: 0x7000000000000001:0x153
Filesystem stats:
Total space: 90,194,313,216
Free space: 90,194,313,216
Total segments: 3,765,216,448
Free segments: 3,763,334,328
Hosts:
[ failed] 0x6e00000000000001:0xe client1
Extended state: failed(recoverable)
[inhibited] 0x7200000000000001:0xf 172.28.128.13@tcp:12345:34:101 halon
Extended state: inhibited (online)
[inhibited] 0x7300000000000001:0x10 CST_HA
Extended state: inhibited (online)
[inhibited] 0x7300000000000001:0x11 CST_RMS
Extended state: inhibited (online)
[inhibited] 0x7200000000000001:0x12 172.28.128.13@tcp:12345:41:301 m0t1fs
Extended state: inhibited (online)
[inhibited] 0x7300000000000001:0x13 CST_RMS
Extended state: inhibited (online)
[ online] 0x6e00000000000001:0x14 cmu
[ online] 0x7200000000000001:0x15 172.28.128.5@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x16 CST_HA
[ online] 0x7300000000000001:0x17 CST_RMS
[ N/A] 0x7200000000000001:0x18 172.28.128.5@tcp:12345:41:302 clovis-app
[ N/A] 0x7300000000000001:0x19 CST_RMS
[ N/A] 0x7200000000000001:0x1a 172.28.128.5@tcp:12345:41:303 clovis-app
[ N/A] 0x7300000000000001:0x1b CST_RMS
[ N/A] 0x7200000000000001:0x1c 172.28.128.5@tcp:12345:41:304 clovis-app
[ N/A] 0x7300000000000001:0x1d CST_RMS
[ N/A] 0x7200000000000001:0x1e 172.28.128.5@tcp:12345:41:305 clovis-app
[ N/A] 0x7300000000000001:0x1f CST_RMS
[ online] 0x6e00000000000001:0x20 ssu1
[ online] 0x7200000000000001:0x2e 172.28.128.3@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x2f CST_HA
[ online] 0x7300000000000001:0x30 CST_RMS
[ online] 0x7200000000000001:0x31 172.28.128.3@tcp:12345:44:101 confd
[ online] 0x7300000000000001:0x32 CST_CONFD
[ online] 0x7300000000000001:0x33 CST_RMS
[ online] 0x7200000000000001:0x34 172.28.128.3@tcp:12345:41:401 ioservice
[ online] 0x7300000000000001:0x35 CST_RMS
[ online] 0x7300000000000001:0x36 CST_IOS
[ online] 0x7300000000000001:0x37 CST_SNS_REP
[ online] 0x7300000000000001:0x38 CST_SNS_REB
[ online] 0x7300000000000001:0x39 CST_ADDB2
[ online] 0x7300000000000001:0x3a CST_CAS
[ online] 0x7300000000000001:0x3b CST_ISCS
[ online] 0x6e00000000000001:0x3c ssu2
[ online] 0x7200000000000001:0x4a 172.28.128.8@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x4b CST_HA
[ online] 0x7300000000000001:0x4c CST_RMS
[ online] 0x7200000000000001:0x4d 172.28.128.8@tcp:12345:44:101 confd
[ online] 0x7300000000000001:0x4e CST_CONFD
[ online] 0x7300000000000001:0x4f CST_RMS
[ online] 0x7200000000000001:0x50 172.28.128.8@tcp:12345:41:401 ioservice
[ online] 0x7300000000000001:0x51 CST_RMS
[ online] 0x7300000000000001:0x52 CST_IOS
[ online] 0x7300000000000001:0x53 CST_SNS_REP
[ online] 0x7300000000000001:0x54 CST_SNS_REB
[ online] 0x7300000000000001:0x55 CST_ADDB2
[ online] 0x7300000000000001:0x56 CST_CAS
[ online] 0x7300000000000001:0x57 CST_ISCS
[ online] 0x6e00000000000001:0x58 ssu3
[ online] 0x7200000000000001:0x66 172.28.128.7@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x67 CST_HA
[ online] 0x7300000000000001:0x68 CST_RMS
[ online] 0x7200000000000001:0x69 172.28.128.7@tcp:12345:44:101 confd
[ online] 0x7300000000000001:0x6a CST_CONFD
[ online] 0x7300000000000001:0x6b CST_RMS
[ online] 0x7200000000000001:0x6c 172.28.128.7@tcp:12345:41:401 ioservice
[ online] 0x7300000000000001:0x6d CST_RMS
[ online] 0x7300000000000001:0x6e CST_IOS
[ online] 0x7300000000000001:0x6f CST_SNS_REP
[ online] 0x7300000000000001:0x70 CST_SNS_REB
[ online] 0x7300000000000001:0x71 CST_ADDB2
[ online] 0x7300000000000001:0x72 CST_CAS
[ online] 0x7300000000000001:0x73 CST_ISCS
[ online] 0x6e00000000000001:0x74 ssu4
[ online] 0x7200000000000001:0x82 172.28.128.10@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x83 CST_HA
[ online] 0x7300000000000001:0x84 CST_RMS
[ online] 0x7200000000000001:0x85 172.28.128.10@tcp:12345:41:401 ioservice
[ online] 0x7300000000000001:0x86 CST_RMS
[ online] 0x7300000000000001:0x87 CST_IOS
[ online] 0x7300000000000001:0x88 CST_SNS_REP
[ online] 0x7300000000000001:0x89 CST_SNS_REB
[ online] 0x7300000000000001:0x8a CST_ADDB2
[ online] 0x7300000000000001:0x8b CST_CAS
[ online] 0x7300000000000001:0x8c CST_ISCS
[ online] 0x6e00000000000001:0x8d ssu5
[ online] 0x7200000000000001:0x9b 172.28.128.11@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0x9c CST_HA
[ online] 0x7300000000000001:0x9d CST_RMS
[ online] 0x7200000000000001:0x9e 172.28.128.11@tcp:12345:41:401 ioservice
[ online] 0x7300000000000001:0x9f CST_RMS
[ online] 0x7300000000000001:0xa0 CST_IOS
[ online] 0x7300000000000001:0xa1 CST_SNS_REP
[ online] 0x7300000000000001:0xa2 CST_SNS_REB
[ online] 0x7300000000000001:0xa3 CST_ADDB2
[ online] 0x7300000000000001:0xa4 CST_CAS
[ online] 0x7300000000000001:0xa5 CST_ISCS
[ online] 0x6e00000000000001:0xa6 ssu6
[ online] 0x7200000000000001:0xb4 172.28.128.12@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0xb5 CST_HA
[ online] 0x7300000000000001:0xb6 CST_RMS
[ online] 0x7200000000000001:0xb7 172.28.128.12@tcp:12345:41:401 ioservice
[ online] 0x7300000000000001:0xb8 CST_RMS
[ online] 0x7300000000000001:0xb9 CST_IOS
[ online] 0x7300000000000001:0xba CST_SNS_REP
[ online] 0x7300000000000001:0xbb CST_SNS_REB
[ online] 0x7300000000000001:0xbc CST_ADDB2
[ online] 0x7300000000000001:0xbd CST_CAS
[ online] 0x7300000000000001:0xbe CST_ISCS
[ online] 0x6e00000000000001:0xbf ssu7
[ online] 0x7200000000000001:0xcd 172.28.128.9@tcp:12345:34:101 halon
[ online] 0x7300000000000001:0xce CST_HA
[ online] 0x7300000000000001:0xcf CST_RMS
[ online] 0x7200000000000001:0xd0 172.28.128.9@tcp:12345:41:401 ioservice
[ online] 0x7300000000000001:0xd1 CST_RMS
[ online] 0x7300000000000001:0xd2 CST_IOS
[ online] 0x7300000000000001:0xd3 CST_SNS_REP
[ online] 0x7300000000000001:0xd4 CST_SNS_REB
[ online] 0x7300000000000001:0xd5 CST_ADDB2
[ online] 0x7300000000000001:0xd6 CST_CAS
[ online] 0x7300000000000001:0xd7 CST_ISCS
16:39 vagrant@cmu:halon$
Created by: andriytk
Yes.
Created by: andriytk
The idea was to use the same code for getting the principal RM.
Created by: vvv
You mean that it's not possible for an online process to host an offline service? Yeah, this makes sense...
Created by: vvv
Yes, this name will do. 👌
Notes:
Created by: andriytk
Or just getPrincipalRM' (Haskell way)?
Created by: andriytk
Done.
Created by: andriytk
I like it with the case, will do. How about getPrincipalRMfrom rg variant?
Created by: andriytk