Open mocchira opened 6 years ago
We've come to the conclusion that we will implement it as one of the leofs-adm commands, `recover-manager-ring`.
When I considered this implementation, I recognized that LeoFS already has `recover ring <storage-node>`. I propose that the `recover ring` command be applied to LeoGateway, LeoStorage, and LeoManager nodes, because the purpose of recovering the RING is the same whatever the node type.
> When I considered this implementation, I recognized LeoFS already has "recover ring \<storage-node>". I propose recover ring command is applied to nodes of LeoGateway, LeoStorage, and LeoManager because the purpose of recovering RING is same whatever the node type.
So you mean we will implement "leofs-adm recover-ring \<manager-node>|\<gateway-node>"? Then I agree with you.
The one thing we have to consider is, when executing "leofs-adm recover-ring \<gateway-node>", which node (manager or storage) the gateway node should retrieve the ring information from. Since that depends on the situation (which node has the correct ring information), we could either add a new param "from" to specify the node from which the gateway node retrieves the ring information, or keep it as is (no additional params) and make users take care of the order of commands. For example, if the ring info of both the manager node and the gateway node is broken, then a user has to execute the procedure below:
    ### the execution order is important
    ### (if the order is reversed then it won't work)
    $ leofs-adm recover-ring <manager-node>
    $ leofs-adm recover-ring <gateway-node>
This may confuse users a little, so we have to document the procedure in detail if we go this way, or come up with another solution that confuses users less.
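One way to reduce the risk of running the commands in the wrong order is to wrap them in a small helper. The following is only a sketch: it assumes the proposed "leofs-adm recover-ring" usage for manager and gateway nodes discussed in this thread, and the function name `recover_rings_in_order` and the `LEOFS_ADM` variable are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Sketch only: enforce the manager-first order described in this thread.
# LEOFS_ADM and recover_rings_in_order are hypothetical names; the
# "recover-ring <manager-node>|<gateway-node>" form is the proposal
# under discussion, not confirmed existing behavior.
LEOFS_ADM="${LEOFS_ADM:-leofs-adm}"

recover_rings_in_order() {
    manager_node="$1"
    gateway_node="$2"

    if [ -z "$manager_node" ] || [ -z "$gateway_node" ]; then
        echo "usage: recover_rings_in_order <manager-node> <gateway-node>" >&2
        return 2
    fi

    # Recover the manager's RING first and stop if that fails,
    # because the reversed order is stated not to work.
    "$LEOFS_ADM" recover-ring "$manager_node" || return 1

    # Only then recover the gateway's RING.
    "$LEOFS_ADM" recover-ring "$gateway_node"
}
```

After sourcing the file, it would be called as `recover_rings_in_order <manager-node> <gateway-node>`. A wrapper like this only hides the ordering from the user; it does not address the question of which node the gateway should retrieve the ring information from.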
In order to deal with cases like https://github.com/leo-project/leofs/issues/1078
Through the investigation for #1078, it turns out that we can recover the manager's RING just by issuing the command "leo_redundant_manager_api:create()." on remote_console, provided you are sure that the cluster member list is correct.