openvstorage / alba

Open vStorage ALBA (alternate backend) creates a replicated or flexible network raid’ed object storage backend out of Seagate Kinetic drives and local disk supporting compression, encryption.

list-all-osds lists OSDs previously purged #582

Open kvanhijf opened 7 years ago

kvanhijf commented 7 years ago

This issue occurs when there are multiple ALBA Backends.

In [41]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:from ovs.extensions.plugins.albacli import AlbaCLI
:
:config1 = 'arakoon://config/ovs/arakoon/bend1-abm/config?ini=%2Fopt%2FOpenvStorage%2Fconfig%2Farakoon_cacc.ini'
:config2 = 'arakoon://config/ovs/arakoon/bend2-abm/config?ini=%2Fopt%2FOpenvStorage%2Fconfig%2Farakoon_cacc.ini'
:asds1 = sorted(AlbaCLI.run(command='list-all-osds', config=config1, named_params={'node-id': 'CSRhmZt549qcKXwdW3PvfEcNlFsvr8aU'}), key=lambda k: k['long_id'])
:asds2 = sorted(AlbaCLI.run(command='list-all-osds', config=config2, named_params={'node-id': 'CSRhmZt549qcKXwdW3PvfEcNlFsvr8aU'}), key=lambda k: k['long_id'])
:
:for asd in asds1:
:    print asd['long_id'], asd['decommissioned'], asd['alba_id']
:print ''
:for asd in asds2:
:    print asd['long_id'], asd['decommissioned'], asd['alba_id']
:--
GZPEeAG8pf8PaFNCiXFetYJGBCEDcboq False 47b887a4-770c-47c4-8027-44090c3c9098
Gr59Ibls3QClaCcHh6YgPAt85y3Ovt0O False None
KoL16B186t2U6Est7IZ8zQHhgon84Gii False None
L5olADCC2ZQLxXDxJCB9LYSlWSZb1Ygf False 47b887a4-770c-47c4-8027-44090c3c9098
kIGZw1aOe8Az9BP7BIOOa0sbp0ki2Kpj False eef292ff-83e5-47ae-bc5a-5f017e290731
ynQAKppbeDCM9puOwtIxOIFXgT1kNLU1 False None

GZPEeAG8pf8PaFNCiXFetYJGBCEDcboq False 47b887a4-770c-47c4-8027-44090c3c9098
Gr59Ibls3QClaCcHh6YgPAt85y3Ovt0O False None
KoL16B186t2U6Est7IZ8zQHhgon84Gii False None
L5olADCC2ZQLxXDxJCB9LYSlWSZb1Ygf False 47b887a4-770c-47c4-8027-44090c3c9098
kIGZw1aOe8Az9BP7BIOOa0sbp0ki2Kpj False eef292ff-83e5-47ae-bc5a-5f017e290731
ynQAKppbeDCM9puOwtIxOIFXgT1kNLU1 False None

The output above shows the ASDs listed by backends bend1 and bend2. The ASDs listed by both backends are identical: 2 ASDs have been claimed by bend1 and 1 ASD has been claimed by bend2.
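The claim counts can be read off a listing by grouping on `alba_id`. A minimal sketch, assuming each entry is a dict with the `long_id` and `alba_id` keys printed above; the helper name `claims_per_backend` is hypothetical and not part of AlbaCLI, and the rows are copied from the first listing rather than fetched live.

```python
from collections import Counter

# Rows copied from the first listing above (only the fields needed here).
listing = [
    {"long_id": "GZPEeAG8pf8PaFNCiXFetYJGBCEDcboq", "alba_id": "47b887a4-770c-47c4-8027-44090c3c9098"},
    {"long_id": "Gr59Ibls3QClaCcHh6YgPAt85y3Ovt0O", "alba_id": None},
    {"long_id": "KoL16B186t2U6Est7IZ8zQHhgon84Gii", "alba_id": None},
    {"long_id": "L5olADCC2ZQLxXDxJCB9LYSlWSZb1Ygf", "alba_id": "47b887a4-770c-47c4-8027-44090c3c9098"},
    {"long_id": "kIGZw1aOe8Az9BP7BIOOa0sbp0ki2Kpj", "alba_id": "eef292ff-83e5-47ae-bc5a-5f017e290731"},
    {"long_id": "ynQAKppbeDCM9puOwtIxOIFXgT1kNLU1", "alba_id": None},
]


def claims_per_backend(osds):
    """Count claimed OSDs per alba_id; unclaimed OSDs (alba_id is None) are skipped."""
    return Counter(osd["alba_id"] for osd in osds if osd["alba_id"] is not None)


counts = claims_per_backend(listing)
for alba_id, count in sorted(counts.items()):
    print(alba_id, count)
```

Three of the six ASDs carry no `alba_id`, so they do not appear in the counts at all.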

Now when removing an ASD from bend1 (GZPEeAG8pf8PaFNCiXFetYJGBCEDcboq), we see it being reported as decommissioned by bend1, but obviously this is not the case for bend2, as seen below:

GZPEeAG8pf8PaFNCiXFetYJGBCEDcboq True 47b887a4-770c-47c4-8027-44090c3c9098
Gr59Ibls3QClaCcHh6YgPAt85y3Ovt0O False None
KoL16B186t2U6Est7IZ8zQHhgon84Gii False None
L5olADCC2ZQLxXDxJCB9LYSlWSZb1Ygf False 47b887a4-770c-47c4-8027-44090c3c9098
kIGZw1aOe8Az9BP7BIOOa0sbp0ki2Kpj False eef292ff-83e5-47ae-bc5a-5f017e290731
mWTgD0LuxadmUpYdi2Ues4zj3HZvd0Xl False None
ynQAKppbeDCM9puOwtIxOIFXgT1kNLU1 False None

GZPEeAG8pf8PaFNCiXFetYJGBCEDcboq False 47b887a4-770c-47c4-8027-44090c3c9098
Gr59Ibls3QClaCcHh6YgPAt85y3Ovt0O False None
KoL16B186t2U6Est7IZ8zQHhgon84Gii False None
L5olADCC2ZQLxXDxJCB9LYSlWSZb1Ygf False 47b887a4-770c-47c4-8027-44090c3c9098
kIGZw1aOe8Az9BP7BIOOa0sbp0ki2Kpj False eef292ff-83e5-47ae-bc5a-5f017e290731
mWTgD0LuxadmUpYdi2Ues4zj3HZvd0Xl False None
ynQAKppbeDCM9puOwtIxOIFXgT1kNLU1 False None
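The disagreement between the two listings can also be detected programmatically. A minimal sketch, assuming each listing is a list of dicts with the `long_id` and `decommissioned` keys used above; `decommission_mismatches` is a hypothetical helper, not an AlbaCLI function, and the sample rows are an abbreviated copy of the output rather than a live query.

```python
def decommission_mismatches(listing_a, listing_b):
    """Return long_ids present in both listings whose 'decommissioned' flags differ."""
    flags_b = {osd["long_id"]: osd["decommissioned"] for osd in listing_b}
    return sorted(
        osd["long_id"]
        for osd in listing_a
        if osd["long_id"] in flags_b
        and osd["decommissioned"] != flags_b[osd["long_id"]]
    )


# Abbreviated rows from the bend1 and bend2 listings above.
bend1 = [
    {"long_id": "GZPEeAG8pf8PaFNCiXFetYJGBCEDcboq", "decommissioned": True},
    {"long_id": "Gr59Ibls3QClaCcHh6YgPAt85y3Ovt0O", "decommissioned": False},
    {"long_id": "L5olADCC2ZQLxXDxJCB9LYSlWSZb1Ygf", "decommissioned": False},
]
bend2 = [
    {"long_id": "GZPEeAG8pf8PaFNCiXFetYJGBCEDcboq", "decommissioned": False},
    {"long_id": "Gr59Ibls3QClaCcHh6YgPAt85y3Ovt0O", "decommissioned": False},
    {"long_id": "L5olADCC2ZQLxXDxJCB9LYSlWSZb1Ygf", "decommissioned": False},
]

mismatches = decommission_mismatches(bend1, bend2)
print(mismatches)
```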

After maintenance has completely purged the ASD, I now get this output:

In [53]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:from ovs.extensions.plugins.albacli import AlbaCLI
:# AlbaCLI.run(command='claim-osd', config='arakoon://config/ovs/arakoon/bend2-abm/config?ini=%2Fopt%2FOpenvStorage%2Fconfig%2Farakoon_cacc.ini', named_params={'long-id': 'Vw3Ku4TKIq6A9ohGIgDIMDjURecUTQdR'})
:
:
:config1 = 'arakoon://config/ovs/arakoon/bend1-abm/config?ini=%2Fopt%2FOpenvStorage%2Fconfig%2Farakoon_cacc.ini'
:config2 = 'arakoon://config/ovs/arakoon/bend2-abm/config?ini=%2Fopt%2FOpenvStorage%2Fconfig%2Farakoon_cacc.ini'
:asds1 = sorted(AlbaCLI.run(command='list-all-osds', config=config1, named_params={'node-id': 'CSRhmZt549qcKXwdW3PvfEcNlFsvr8aU'}), key=lambda k: k['long_id'])
:asds2 = sorted(AlbaCLI.run(command='list-all-osds', config=config2, named_params={'node-id': 'CSRhmZt549qcKXwdW3PvfEcNlFsvr8aU'}), key=lambda k: k['long_id'])
:
:for asd in asds1:
:    print asd['long_id'], asd['decommissioned'], asd['alba_id']
:print ''
:for asd in asds2:
:    print asd['long_id'], asd['decommissioned'], asd['alba_id']
:--
Gr59Ibls3QClaCcHh6YgPAt85y3Ovt0O False None
KoL16B186t2U6Est7IZ8zQHhgon84Gii False None
L5olADCC2ZQLxXDxJCB9LYSlWSZb1Ygf False 47b887a4-770c-47c4-8027-44090c3c9098
kIGZw1aOe8Az9BP7BIOOa0sbp0ki2Kpj False eef292ff-83e5-47ae-bc5a-5f017e290731
mWTgD0LuxadmUpYdi2Ues4zj3HZvd0Xl False None
ynQAKppbeDCM9puOwtIxOIFXgT1kNLU1 False None

GZPEeAG8pf8PaFNCiXFetYJGBCEDcboq False 47b887a4-770c-47c4-8027-44090c3c9098
Gr59Ibls3QClaCcHh6YgPAt85y3Ovt0O False None
KoL16B186t2U6Est7IZ8zQHhgon84Gii False None
L5olADCC2ZQLxXDxJCB9LYSlWSZb1Ygf False 47b887a4-770c-47c4-8027-44090c3c9098
kIGZw1aOe8Az9BP7BIOOa0sbp0ki2Kpj False eef292ff-83e5-47ae-bc5a-5f017e290731
mWTgD0LuxadmUpYdi2Ues4zj3HZvd0Xl False None
ynQAKppbeDCM9puOwtIxOIFXgT1kNLU1 False None

The ASD with ID GZPEeAG8pf8PaFNCiXFetYJGBCEDcboq is still being reported by bend2, but no longer by bend1. The more ASDs get claimed and removed between different Backends, the more the list-all-osds outputs start to differ.
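For OPS, the drift between two backends can be summarised as a set difference on `long_id`. A minimal sketch, assuming each listing is a list of dicts keyed as in the output above; `listing_diff` is a hypothetical helper, and the sample rows are copied from the final output instead of being fetched through AlbaCLI.

```python
def listing_diff(listing_a, listing_b):
    """Return (only_in_a, only_in_b): long_ids reported by just one of the listings."""
    ids_a = {osd["long_id"] for osd in listing_a}
    ids_b = {osd["long_id"] for osd in listing_b}
    return sorted(ids_a - ids_b), sorted(ids_b - ids_a)


# long_ids from the final bend1 listing (the purged ASD is gone).
bend1 = [{"long_id": long_id} for long_id in (
    "Gr59Ibls3QClaCcHh6YgPAt85y3Ovt0O",
    "KoL16B186t2U6Est7IZ8zQHhgon84Gii",
    "L5olADCC2ZQLxXDxJCB9LYSlWSZb1Ygf",
    "kIGZw1aOe8Az9BP7BIOOa0sbp0ki2Kpj",
    "mWTgD0LuxadmUpYdi2Ues4zj3HZvd0Xl",
    "ynQAKppbeDCM9puOwtIxOIFXgT1kNLU1",
)]
# The final bend2 listing still contains the purged ASD.
bend2 = bend1 + [{"long_id": "GZPEeAG8pf8PaFNCiXFetYJGBCEDcboq"}]

only_bend1, only_bend2 = listing_diff(bend1, bend2)
print("only reported by bend1:", only_bend1)
print("only reported by bend2:", only_bend2)
```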

domsj commented 7 years ago

This is (to me at least) expected. If we don't want this behaviour we should either

Probably the most workable/desirable solution is the first one (no more asd discovery).

domsj commented 7 years ago

Regarding my preference for no more asd discovery: that would also prevent issues such as #269 (maintenance from env1 trying to connect to asds from env2 - which it possibly can't even reach due to how the network is configured)

wimpers commented 7 years ago

@domsj I believe there is a command to purge OSDs from monitoring, I assume the same can be used to remove it from list-all-osds?

domsj commented 7 years ago

Yes, it can be used. The question is: when should it be used? (And the answer determines whether you fall into option 2 or 3 that I mentioned.)

wimpers commented 7 years ago

@kvanhijf why would you need a 'purified' list-all-osds? What is the context of this ticket? Can the extra command to purge from the monitoring be of any help?

kvanhijf commented 7 years ago

@wimpers: The reason I reported this was just that it did not seem logical to me to see these differences between ALBA backends. They're not causing any issues for us, but it might be good for OPS to know in which cases they can end up with such behavior.

wimpers commented 7 years ago

BAM needs to lay its egg. Will we still do auto discovery now that project Golden Gate is coming our way?