openvstorage / framework-alba-plugin

The Framework ALBA plugin extends the OpenvStorage GUI with functionality to manage ASDs (Alternate Storage Daemon) and Seagate Kinetic drives.

'dict' object has no attribute 'cluster_id' #205

Closed openvstorage-ci closed 7 years ago

openvstorage-ci commented 7 years ago

From @JeffreyDevloo on September 15, 2016 8:5

Problem description

Executed the following scripts:

Setup arakoons for local backend Roubaix:

=========================================

name: hdd-roub
location: mkdir -p /mnt/hdd1/ovh/

=========================================

Setup ABM arakoons (1)

========================================= perf-roub-01

from ovs.extensions.db.arakoon.ArakoonInstaller import ArakoonInstaller
from subprocess import check_output

cluster_name = "hdd_roub_abm"
ip = "172.20.20.31"
master_ip = ip
base_dir = "/mnt/hdd1/ovh/"

info = ArakoonInstaller.create_cluster(cluster_name, 'ABM', ip, base_dir, ['albamgr_plugin'], locked=False, internal=False)
check_output('ln -s /usr/lib/alba/albamgr_plugin.cmxs {0}/arakoon/{1}/db'.format(base_dir, cluster_name), shell=True)
ArakoonInstaller.start_cluster(cluster_name=cluster_name, master_ip=master_ip, filesystem=False)
ArakoonInstaller.unclaim_cluster(cluster_name=cluster_name, master_ip=master_ip, filesystem=False, metadata=info['metadata'])

======================================== perf-roub-02

from ovs.extensions.db.arakoon.ArakoonInstaller import ArakoonInstaller
from subprocess import check_output

cluster_name = "hdd_roub_abm"
master_ip = "172.20.20.31"
new_ip = "172.20.20.32"
base_dir = "/mnt/hdd1/ovh/"
current_ips = ["172.20.20.31", "172.20.20.32"]

ArakoonInstaller.extend_cluster(master_ip, new_ip, cluster_name, base_dir, locked=False)
check_output('ln -s /usr/lib/alba/albamgr_plugin.cmxs {0}/arakoon/{1}/db'.format(base_dir, cluster_name), shell=True)
ArakoonInstaller.restart_cluster_add(cluster_name, current_ips, new_ip, False)

======================================== perf-roub-03

from ovs.extensions.db.arakoon.ArakoonInstaller import ArakoonInstaller
from subprocess import check_output

cluster_name = "hdd_roub_abm"
master_ip = "172.20.20.31"
new_ip = "172.20.20.33"
base_dir = "/mnt/hdd1/ovh/"
current_ips = ["172.20.20.31", "172.20.20.32", "172.20.20.33"]

ArakoonInstaller.extend_cluster(master_ip, new_ip, cluster_name, base_dir, locked=False)
check_output('ln -s /usr/lib/alba/albamgr_plugin.cmxs {0}/arakoon/{1}/db'.format(base_dir, cluster_name), shell=True)
ArakoonInstaller.restart_cluster_add(cluster_name, current_ips, new_ip, False)

=========================================

Setup NSM arakoons (3): perf-roub-01

=========================================

from ovs.extensions.db.arakoon.ArakoonInstaller import ArakoonInstaller
from subprocess import check_output

ip = "172.20.20.31"
master_ip = ip
base_dir = "/mnt/hdd1/ovh/"
cluster_names = ["hdd_roub_nsm_0", "hdd_roub_nsm_1", "hdd_roub_nsm_2"]
for cluster_name in cluster_names:
    info = ArakoonInstaller.create_cluster(cluster_name, 'NSM', ip, base_dir, ['nsm_host_plugin'], locked=False, internal=False)
    check_output('ln -s /usr/lib/alba/nsm_host_plugin.cmxs {0}/arakoon/{1}/db'.format(base_dir, cluster_name), shell=True)
    ArakoonInstaller.start_cluster(cluster_name=cluster_name, master_ip=master_ip, filesystem=False)
    ArakoonInstaller.unclaim_cluster(cluster_name=cluster_name, master_ip=master_ip, filesystem=False, metadata=info['metadata'])

========================================

Extend NSM clusters to `perf-roub-02`

========================================

from ovs.extensions.db.arakoon.ArakoonInstaller import ArakoonInstaller
from subprocess import check_output

master_ip = "172.20.20.31"
new_ip = "172.20.20.32"
base_dir = "/mnt/hdd1/ovh/"
current_ips = ["172.20.20.31", "172.20.20.32"]
cluster_names = ["hdd_roub_nsm_0", "hdd_roub_nsm_1", "hdd_roub_nsm_2"]
for cluster_name in cluster_names:
    ArakoonInstaller.extend_cluster(master_ip, new_ip, cluster_name, base_dir, locked=False)
    check_output('ln -s /usr/lib/alba/nsm_host_plugin.cmxs {0}/arakoon/{1}/db'.format(base_dir, cluster_name), shell=True)
    ArakoonInstaller.restart_cluster_add(cluster_name, current_ips, new_ip, False)

========================================

Extend NSM clusters to `perf-roub-03`

========================================

from ovs.extensions.db.arakoon.ArakoonInstaller import ArakoonInstaller
from subprocess import check_output

master_ip = "172.20.20.31"
new_ip = "172.20.20.33"
base_dir = "/mnt/hdd1/ovh/"
current_ips = ["172.20.20.31", "172.20.20.32", "172.20.20.33"]
cluster_names = ["hdd_roub_nsm_0", "hdd_roub_nsm_1", "hdd_roub_nsm_2"]
for cluster_name in cluster_names:
    ArakoonInstaller.extend_cluster(master_ip, new_ip, cluster_name, base_dir, locked=False)
    check_output('ln -s /usr/lib/alba/nsm_host_plugin.cmxs {0}/arakoon/{1}/db'.format(base_dir, cluster_name), shell=True)
    ArakoonInstaller.restart_cluster_add(cluster_name, current_ips, new_ip, False)
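The extension scripts above repeat one pattern per node: each node after the first is added in order, and `current_ips` grows to include the node being added. A minimal sketch of that bookkeeping follows. The helper `extension_plan` is hypothetical and not part of `ArakoonInstaller`; it only computes the arguments and does not touch any cluster.

```python
def extension_plan(node_ips):
    """Return (new_ip, current_ips) pairs for extending a cluster node by node.

    The first address is the master that created the cluster; every later
    address is added in order. As in the scripts above, current_ips
    includes the node being added.
    """
    return [(new_ip, node_ips[:index + 1])
            for index, new_ip in enumerate(node_ips) if index > 0]


plan = extension_plan(["172.20.20.31", "172.20.20.32", "172.20.20.33"])
for new_ip, current_ips in plan:
    # Each pair corresponds to one extend_cluster / restart_cluster_add round.
    print("{0} -> {1}".format(new_ip, current_ips))
```

Each pair could then feed the `extend_cluster` and `restart_cluster_add` calls shown in the scripts, once per cluster name.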

========================================

Create a backend and wait until done, then use more NSM arakoons

========================================

from ovs.lib.albacontroller import AlbaController
from ovs.dal.lists.albabackendlist import AlbaBackendList
for i in AlbaBackendList.get_albabackends():
    if i.name == "hdd-roub":
        break
guid = i.guid
AlbaController.nsm_checkup(backend_guid=guid, min_nsms=3)

Got the following error during AlbaController.nsm_checkup(backend_guid=guid, min_nsms=3): 'dict' object has no attribute 'cluster_id'
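A side note on the lookup loop in the last script: if no backend name matches, `i` stays bound to whichever backend was visited last, so `nsm_checkup` may run against an unintended backend. A minimal sketch of a stricter lookup follows. `Backend` is a stand-in for the objects returned by `AlbaBackendList.get_albabackends()`, and `find_backend_guid` and the guid values are hypothetical, used only for illustration.

```python
from collections import namedtuple

# Stand-in for the objects returned by AlbaBackendList.get_albabackends();
# only the two attributes the script uses are modelled here.
Backend = namedtuple("Backend", ["name", "guid"])


def find_backend_guid(backends, name):
    """Return the guid of the backend called `name`, or raise LookupError."""
    for backend in backends:
        if backend.name == name:
            return backend.guid
    raise LookupError("No ALBA backend named {0!r}".format(name))


# Placeholder data; real guids come from the OVS model.
backends = [Backend("flash-roub", "guid-flash"), Backend("hdd-roub", "guid-hdd")]
guid = find_backend_guid(backends, "hdd-roub")
```

Failing loudly on an unknown name makes a misspelled backend name visible before any checkup is started.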

Logfile:


2016-09-13 18:10:04 87100 +0200 - perf-roub-03 - 7702/140027230775104 - lib/scheduled tasks - 0 - INFO - Ensure single CHAINED mode - ID 1473783004_kKRPIwpdM8 - New task alba.nsm_checkup with params {'backend_guid': 'dd5fdbee-cadc-4b79-bb19-5fc2e2d02c73', 'min_nsms': 3} scheduled for execution
2016-09-13 18:10:05 02900 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 1 - DEBUG - Ensuring NSM safety for backend hdd-roub-abm
2016-09-13 18:10:06 97800 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 2 - DEBUG - Processing NSM 0
2016-09-13 18:10:06 97800 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 3 - DEBUG - NSM load OK
2016-09-13 18:10:06 97800 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 4 - DEBUG - Adding new NSM
2016-09-13 18:10:12 60800 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 15 - INFO - Model service: hdd-roub-nsm_1
2016-09-13 18:10:19 86500 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 28 - INFO - Model service: hdd-roub-nsm_1
2016-09-13 18:10:29 28700 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 43 - INFO - Model service: hdd-roub-nsm_1
2016-09-13 18:10:37 34500 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 49 - DEBUG - New NSM (1) added
2016-09-13 18:10:37 34500 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 50 - DEBUG - Adding new NSM
2016-09-13 18:10:43 05900 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 61 - INFO - Model service: hdd-roub-nsm_2
2016-09-13 18:10:50 56100 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 74 - INFO - Model service: hdd-roub-nsm_2
2016-09-13 18:10:59 97200 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 89 - INFO - Model service: hdd-roub-nsm_2
2016-09-13 18:11:08 09300 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 95 - DEBUG - New NSM (2) added
2016-09-13 18:11:08 25800 +0200 - perf-roub-03 - 7702/140027230775104 - lib/scheduled tasks - 96 - INFO - Ensure single CHAINED mode - ID 1473783004_kKRPIwpdM8 - Task alba.nsm_checkup finished successfully
2016-09-13 18:39:48 64800 +0200 - perf-roub-03 - 7702/140027230775104 - lib/scheduled tasks - 262 - INFO - Ensure single CHAINED mode - ID 1473784788_ZY7m4I04Kf - New task alba.nsm_checkup with params {'backend_guid': 'e7cfb1de-5b60-47d3-85ea-5ccfaf55ebbb', 'min_nsms': 3} scheduled for execution
2016-09-13 18:39:48 81600 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 263 - DEBUG - Ensuring NSM safety for backend flash-roub-abm
2016-09-13 18:39:50 75800 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 264 - DEBUG - Processing NSM 0
2016-09-13 18:39:50 75800 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 265 - DEBUG - NSM load OK
2016-09-13 18:39:50 75800 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 266 - DEBUG - Adding new NSM
2016-09-13 18:39:56 69000 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 277 - INFO - Model service: flash-roub-nsm_1
2016-09-13 18:40:04 22600 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 290 - INFO - Model service: flash-roub-nsm_1
2016-09-13 18:40:14 05200 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 305 - INFO - Model service: flash-roub-nsm_1
2016-09-13 18:40:22 16800 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 311 - DEBUG - New NSM (1) added
2016-09-13 18:40:22 16900 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 312 - DEBUG - Adding new NSM
2016-09-13 18:40:28 12200 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 323 - INFO - Model service: flash-roub-nsm_2
2016-09-13 18:40:35 94800 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 336 - INFO - Model service: flash-roub-nsm_2
2016-09-13 18:40:45 91100 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 351 - INFO - Model service: flash-roub-nsm_2
2016-09-13 18:40:54 02300 +0200 - perf-roub-03 - 7702/140027230775104 - lib/alba - 357 - DEBUG - New NSM (2) added
2016-09-13 18:40:54 20500 +0200 - perf-roub-03 - 7702/140027230775104 - lib/scheduled tasks - 358 - INFO - Ensure single CHAINED mode - ID 1473784788_ZY7m4I04Kf - Task alba.nsm_checkup finished successfully
2016-09-14 17:36:19 93700 +0200 - perf-roub-03 - 38443/139652408604480 - lib/scheduled tasks - 228 - INFO - Ensure single CHAINED mode - ID 1473867379_X1XgugYOZl - New task alba.nsm_checkup with params {'backend_guid': 'fb1de670-65df-4b89-b7ba-7eb8c33dc758', 'min_nsms': 3} scheduled for execution
2016-09-14 17:36:20 07800 +0200 - perf-roub-03 - 38443/139652408604480 - lib/alba - 229 - DEBUG - Ensuring NSM safety for backend hdd_roub_abm
2016-09-14 17:36:21 02300 +0200 - perf-roub-03 - 38443/139652408604480 - lib/alba - 230 - DEBUG - NSM load OK
2016-09-14 17:36:21 02300 +0200 - perf-roub-03 - 38443/139652408604480 - lib/alba - 231 - DEBUG - Externally managed NSM arakoon cluster needs to be expanded
2016-09-14 17:36:21 70500 +0200 - perf-roub-03 - 38443/139652408604480 - lib/alba - 233 - ERROR - NSM Checkup failed for backend hdd-roub. 'dict' object has no attribute 'cluster_id'

Stacktrace:


:AlbaController.nsm_checkup(backend_guid=guid, min_nsms=3)
:--
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-15-7ef4ba20c30e> in <module>()
      5   break
      6 guid = i.guid
----> 7 AlbaController.nsm_checkup(backend_guid=guid, min_nsms=3)

/usr/lib/python2.7/dist-packages/celery/local.pyc in <lambda>(x, *a, **kw)
    165     __ge__ = lambda x, o: x._get_current_object() >= o
    166     __hash__ = lambda x: hash(x._get_current_object())
--> 167     __call__ = lambda x, *a, **kw: x._get_current_object()(*a, **kw)
    168     __len__ = lambda x: len(x._get_current_object())
    169     __getitem__ = lambda x, i: x._get_current_object()[i]

/usr/lib/python2.7/dist-packages/celery/app/task.pyc in __call__(self, *args, **kwargs)
    418             if self.__self__ is not None:
    419                 return self.run(self.__self__, *args, **kwargs)
--> 420             return self.run(*args, **kwargs)
    421         finally:
    422             self.pop_request()

/opt/OpenvStorage/ovs/lib/helpers/decorators.pyc in new_function(*args, **kwargs)
    299                                                                                                                      params_info,
    300                                                                                                                      current_time - starting_time))
--> 301                             output = function(*args, **kwargs)
    302                             log_message('Task {0} finished successfully'.format(task_name))
    303                         finally:

/opt/OpenvStorage/ovs/lib/albacontroller.pyc in nsm_checkup(allow_offline, backend_guid, min_nsms)
    955                 failed_backends.append(alba_backend.name)
    956         if len(failed_backends) > 0:
--> 957             raise RuntimeError('Checking NSM failed for ALBA backends: {0}'.format(', '.join(failed_backends)))
    958 
    959     @staticmethod

RuntimeError: Checking NSM failed for ALBA backends: hdd-roub

Possible root of the problem

With Arakoon as config manager, a new step was required: ArakoonInstaller.unclaim_cluster(cluster_name=cluster_name, master_ip=master_ip, filesystem=False, metadata=info['metadata']). Perhaps the issue lies there.

Setup

Geoclustered setup: OVH

Package information

Copied from original issue: openvstorage/framework#906

JeffreyDevloo commented 7 years ago

TL;DR: the key cluster_id, which is required in AlbaController.nsm_checkup (line 888), is missing. The metadata dict looks like this: {u'cluster_name': u'hdd_roub_nsm_2', u'cluster_type': u'NSM', u'internal': False, u'in_use': True}
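For reference, this is how Python produces that exact message. The sketch does not reproduce the code at line 888, which is not shown in this issue; it only assumes that something accessed `cluster_id` as an attribute on a plain dict like the one quoted above.

```python
# Metadata as quoted in the comment above; note that it has no 'cluster_id' key.
metadata = {u'cluster_name': u'hdd_roub_nsm_2', u'cluster_type': u'NSM',
            u'internal': False, u'in_use': True}

try:
    metadata.cluster_id  # attribute access on a plain dict always fails
except AttributeError as error:
    message = str(error)

# Key access with a default avoids both AttributeError and KeyError;
# here it yields None because the key is absent.
cluster_id = metadata.get('cluster_id')
```

Attribute access fails on any dict regardless of its contents, while `metadata['cluster_id']` on this dict would raise a `KeyError` instead.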

khenderick commented 7 years ago

Fixed in #206, packaged in openvstorage-backend-1.7.3-rev.699.85f8422

JeffreyDevloo commented 7 years ago

Steps

During the installation of OVH, I executed the same scripts as above, this time with no error.

Test results

Test passed.