openvstorage / framework

The Framework is a set of components and tools which brings the user an interface (GUI / API) to setup, extend and manage an Open vStorage platform.

cannot clone from template when the owner node is down #857

Closed jeroenmaelbrancke closed 8 years ago

jeroenmaelbrancke commented 8 years ago

We are unable to clone a template when the owner node of that template is down.

Error: vdisk

ovs version:

root@ovs-02:~# apt-cache policy openvstorage
openvstorage:
  Installed: 2.7.1-fargo.4-1
  Candidate: 2.7.1-fargo.4-1
  Version table:
 *** 2.7.1-fargo.4-1 0
        500 http://apt.openvstorage.org/ fargo/main amd64 Packages

OVS ticket: OVSSUP-78

JeffreyDevloo commented 8 years ago

Steps

2016-08-29 15:50:19 54500 +0200 - ovs-node1 - 2787/140555754481472 - celery/celery.worker.job - 128 - DEBUG - Task accepted: ovs.vdisk.create_from_template[f89d2f35-f3a8-4b76-ba8d-27f5112c3505] pid:5786

2016-08-29 15:50:20 62500 +0200 - ovs-node1 - 2787/140555754481472 - celery/celery.worker.job - 135 - INFO - Task ovs.vdisk.create_from_template[f89d2f35-f3a8-4b76-ba8d-27f5112c3505] succeeded in 1.080851792s: {'backingdevice': '/test.raw', 'vdisk_guid': 'a36e4331-24e0-4efd-906c-1fb19ad51ef5', 'name': u'test'}

Additional information

Recreation via the API

In [13]: VDiskController.create_from_template(vdisk.guid, 'testDisk', node.guid)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-13-8d9230dde24e> in <module>()
----> 1 VDiskController.create_from_template(vdisk.guid, 'testDisk', node.guid)

/usr/lib/python2.7/dist-packages/celery/local.pyc in <lambda>(x, *a, **kw)
    165     __ge__ = lambda x, o: x._get_current_object() >= o
    166     __hash__ = lambda x: hash(x._get_current_object())
--> 167     __call__ = lambda x, *a, **kw: x._get_current_object()(*a, **kw)
    168     __len__ = lambda x: len(x._get_current_object())
    169     __getitem__ = lambda x, i: x._get_current_object()[i]

/usr/lib/python2.7/dist-packages/celery/app/task.pyc in __call__(self, *args, **kwargs)
    418             if self.__self__ is not None:
    419                 return self.run(self.__self__, *args, **kwargs)
--> 420             return self.run(*args, **kwargs)
    421         finally:
    422             self.pop_request()

/opt/OpenvStorage/ovs/lib/vdisk.pyc in create_from_template(vdisk_guid, name, storagerouter_guid)
    416         # Validations
    417         if not vdisk.is_vtemplate:
--> 418             raise RuntimeError('The given vDisk is not a vTemplate')
    419         devicename = VDiskController.clean_devicename(name)
    420         if VDiskList.get_by_devicename_and_vpool(devicename, vdisk.vpool) is not None:

RuntimeError: The given vDisk is not a vTemplate

After removing the check on line 417, I could successfully create a vDisk from the given template.

Setup

Hyperconverged setup

Package information

khenderick commented 8 years ago

Makes a lot of sense; is_vtemplate is a dynamic property that checks whether the vDisk's info_volume('object_type') equals TEMPLATE. When a volume is down, this information cannot be fetched and we have no idea whether the vDisk is a template or not.
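A minimal sketch of that failure mode, not the framework's actual code: `FakeVolumeClient`, `FakeVDisk`, and `VolumeUnreachable` are hypothetical stand-ins, and it assumes a failed lookup is swallowed and reported as "not a template". Only the property name `is_vtemplate` and the error message come from the traceback above.

```python
class VolumeUnreachable(Exception):
    """Raised by the stand-in client when the owner node is down."""


class FakeVolumeClient(object):
    """Hypothetical stand-in for whatever serves info_volume."""

    def __init__(self, object_types, reachable=True):
        self._object_types = object_types  # volume_id -> object type string
        self.reachable = reachable

    def info_volume(self, volume_id):
        if not self.reachable:
            raise VolumeUnreachable(volume_id)
        return {'object_type': self._object_types[volume_id]}


class FakeVDisk(object):
    """Hypothetical vDisk whose template flag is computed on every access."""

    def __init__(self, volume_id, client):
        self.volume_id = volume_id
        self._client = client

    @property
    def is_vtemplate(self):
        # Assumption: an unreachable volume is treated as "not a template".
        try:
            info = self._client.info_volume(self.volume_id)
        except VolumeUnreachable:
            return False
        return info['object_type'] == 'TEMPLATE'


def create_from_template(vdisk, name):
    # Same validation as in the traceback; the rest of the task is omitted.
    if not vdisk.is_vtemplate:
        raise RuntimeError('The given vDisk is not a vTemplate')
    return {'name': name}
```

With the owner reachable the validation passes; flipping `client.reachable` to `False` makes the same template fail the check, which matches the behaviour reported here.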

khenderick commented 8 years ago

@redlicha, while I can remove this check - seems the volumedriver can protect itself in this scenario - how can I figure out whether it's a vTemplate or not? It would be nice if info_volume would return (some) data in case a volumedriver is down.

redlicha commented 8 years ago

@khenderick, you could use the ObjectRegistryClient of the volumedriver's Python API - ObjectRegistration.object_type of template volumes is ObjectType.TEMPLATE. Cf. also the documentation in https://github.com/openvstorage/volumedriver/pull/56
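A rough sketch of that idea, under stated assumptions: `FakeObjectRegistry`, `FakeRegistration`, and the `find` method are hypothetical stand-ins rather than the volumedriver's real Python API, and the object type is modelled as a plain string instead of the `ObjectType` enum. The point is only that the lookup goes to a registry and does not need the owner node to answer.

```python
class FakeRegistration(object):
    """Hypothetical stand-in for an ObjectRegistration entry."""

    def __init__(self, object_id, node_id, object_type):
        self.object_id = object_id
        self.node_id = node_id
        self.object_type = object_type  # assumption: plain string, not an enum


class FakeObjectRegistry(object):
    """Hypothetical registry that stays available when an owner node is down."""

    def __init__(self, registrations):
        self._by_id = dict((reg.object_id, reg) for reg in registrations)

    def find(self, object_id):
        # Returns None for unknown objects instead of raising.
        return self._by_id.get(object_id)


def is_vtemplate(registry, volume_id):
    """Decide template status from the registry entry alone."""
    registration = registry.find(volume_id)
    if registration is None:
        return False
    return registration.object_type == 'TEMPLATE'


registry = FakeObjectRegistry([
    FakeRegistration('vol-1', 'ovs-node1', 'TEMPLATE'),
    FakeRegistration('vol-2', 'ovs-node1', 'BASE'),
])
print(is_vtemplate(registry, 'vol-1'))
print(is_vtemplate(registry, 'vol-2'))
```

Whether an unknown volume should map to `False` or raise is a design choice; the sketch picks `False` so the existing validation still rejects it.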

khenderick commented 8 years ago

@redlicha, can I use the `ObjectRegistration` APIs as a replacement for the info_volume call?

redlicha commented 8 years ago

@khenderick: an ObjectRegistration offers the following:

ObjectRegistration.dtl_config_mode
ObjectRegistration.node_id
ObjectRegistration.object_type      
ObjectRegistration.object_id
ObjectRegistration.owner_tag        

wimpers commented 8 years ago

@redlicha, @khenderick, is there a manual workaround where OPS sets the new owner?

khenderick commented 8 years ago

Fixed by #879, packaged in openvstorage-2.7.3-rev.3945.833fb96

JeffreyDevloo commented 8 years ago

Steps

In [18]: VDiskController.create_from_template(disk.guid, 'test22',sr.guid)
2016-09-09 10:20:36 73600 +0200 - ovs-node2 - 973/139711040272192 - lib/vdisk - 5 - INFO - Create vDisk from vTemplate Ubuntu to new vDisk test22 to location /test22.raw
2016-09-09 10:20:39 06700 +0200 - ovs-node2 - 973/139711040272192 - lib/mds - 6 - DEBUG - MDS safety: vDisk 45b1bae8-f8c1-4001-a4e9-54215c04ca75: Start checkup for virtual disk test22
2016-09-09 10:20:39 10100 +0200 - ovs-node2 - 973/139711040272192 - lib/mds - 7 - DEBUG - MDS safety: vDisk 45b1bae8-f8c1-4001-a4e9-54215c04ca75: Reconfiguration required. Reasons:
2016-09-09 10:20:39 10100 +0200 - ovs-node2 - 973/139711040272192 - lib/mds - 8 - DEBUG - MDS safety: vDisk 45b1bae8-f8c1-4001-a4e9-54215c04ca75:    * Not enough safety
2016-09-09 10:20:39 10100 +0200 - ovs-node2 - 973/139711040272192 - lib/mds - 9 - DEBUG - MDS safety: vDisk 45b1bae8-f8c1-4001-a4e9-54215c04ca75:    * Not enough services in use in primary domain
2016-09-09 10:20:39 11100 +0200 - ovs-node2 - 973/139711040272192 - extensions/sshclient - 10 - ERROR - StorageRouter 10.100.199.151 process heartbeat > 300s
2016-09-09 10:20:39 11100 +0200 - ovs-node2 - 973/139711040272192 - lib/mds - 11 - DEBUG - MDS safety: vDisk 45b1bae8-f8c1-4001-a4e9-54215c04ca75: Skipping storagerouter with IP 10.100.199.152 as it is unreachable
2016-09-09 10:20:40 92300 +0200 - ovs-node2 - 973/139711040272192 - lib/mds - 12 - DEBUG - MDS safety: vDisk 45b1bae8-f8c1-4001-a4e9-54215c04ca75: Completed
2016-09-09 10:20:47 02400 +0200 - ovs-node2 - 973/139711040272192 - lib/vdisk - 13 - INFO - Setting metadata pagecache size for vdisk test22 to 5120
Out[18]: 
{'backingdevice': '/test22.raw',
 'name': 'test22',
 'vdisk_guid': '45b1bae8-f8c1-4001-a4e9-54215c04ca75'}

Summary

Cloning works in the GUI, but the GUI can sometimes hang (a front-end issue unrelated to the cloning itself). Cloning also works through the API.

Test result

Test passed.

Setup

Hyperconverged setup with 3 nodes

Package

hofkensj commented 7 years ago

In which release will this be fixed?

wimpers commented 7 years ago

openvstorage 2.7.3 (it is already fixed on unstable)