olopez32 / ganeti

Automatically exported from code.google.com/p/ganeti

Failed replace-disks when disks are removed and recreated with different parameters #474

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I ran into a peculiar problem (and given the peculiar steps to reproduce it, I 
wouldn't assign a high priority to it). These are the steps to reproduce it:

# From the master node:
gnt-cluster modify --disk-parameters drbd:data-stripes=3,meta-stripes=3
gnt-instance add -o busybox -t drbd -s 512m -n NODE1:NODE2 test-instance
# A data volume of 516 MB and a meta volume of 132 MB get created on both nodes
gnt-cluster modify --disk-parameters drbd:data-stripes=2,meta-stripes=2
# With the new striping, the disks above would get exactly the requested size
# (512 MB and 128 MB respectively) if they were recreated

# From NODE2:
drbdsetup 0 down
lvremove /dev/xenvg/DISK_UUID.disk0_*

# Back from the master:
gnt-instance activate-disks test-instance
gnt-instance replace-disks -s test-instance

The last command fails with this output:

Thu May 23 19:09:02 2013 Replacing disk(s) 0 for instance 'test-instance'
Thu May 23 19:09:02 2013 Current primary node: NODE1
Thu May 23 19:09:02 2013 Current seconary node: NODE2
Thu May 23 19:09:02 2013 STEP 1/6 Check device existence
Thu May 23 19:09:02 2013  - INFO: Checking disk/0 on NODE1
Thu May 23 19:09:02 2013  - INFO: Checking disk/0 on NODE2
Thu May 23 19:09:02 2013  - INFO: Checking volume groups
Thu May 23 19:09:02 2013 STEP 2/6 Check peer consistency
Thu May 23 19:09:02 2013  - INFO: Checking disk/0 consistency on node NODE1
Thu May 23 19:09:02 2013 STEP 3/6 Allocate new storage
Thu May 23 19:09:02 2013  - INFO: Adding storage on NODE2 for disk/0
Thu May 23 19:09:03 2013 STEP 4/6 Changing drbd configuration
Thu May 23 19:09:03 2013  - INFO: Detaching disk/0 drbd from local storage
Thu May 23 19:09:03 2013  - INFO: Renaming the old LVs on the target node
Thu May 23 19:09:04 2013  - INFO: Renaming the new LVs on the target node
Thu May 23 19:09:04 2013  - INFO: Adding new mirror component on NODE2
Failure: command execution error:  
Can't add local storage to drbd: Error while executing backend function: drbd0: 
can't attach local disk: /dev/drbd0: Failure: (111) Low.dev. smaller than 
requested DRBD-dev. size.

From noded log:
2013-05-23 19:09:04,922: ganeti-noded pid=13807 process:118 DEBUG Command 
'drbdsetup /dev/drbd0 disk 
/dev/xenvg/e2bffb1c-8f05-4fd2-b976-5fdc01330bed.disk0_data 
/dev/xenvg/e2bffb1c-8f05-4fd2-b976-5fdc01330bed.disk0_meta 0 -e detach 
--create-device -d 512m --no-md-flushes --no-disk-flushes --no-disk-barrier' 
failed (exited with exit code 10); output: /dev/drbd0: Failure: (111) Low.dev. 
smaller than requested DRBD-dev. size.

2013-05-23 19:09:04,922: ganeti-noded pid=13807 base:371 ERROR drbd0: can't 
attach local disk: /dev/drbd0: Failure: (111) Low.dev. smaller than requested 
DRBD-dev. size.

2013-05-23 19:09:04,923: ganeti-noded pid=13807 noded:197 ERROR Error in RPC 
call
Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/ganeti/server/noded.py", line 181, in HandleRequest
    result = (True, method(serializer.LoadJson(req.request_body)))
  File "/usr/lib/python2.6/dist-packages/ganeti/server/noded.py", line 286, in perspective_blockdev_addchildren
    return backend.BlockdevAddchildren(bdev, ndevs)
  File "/usr/lib/python2.6/dist-packages/ganeti/backend.py", line 2008, in BlockdevAddchildren
    parent_bdev.AddChildren(new_bdevs)
  File "/usr/lib/python2.6/dist-packages/ganeti/storage/drbd.py", line 482, in AddChildren
    self._AssembleLocal(self.minor, backend.dev_path, meta.dev_path, self.size)
  File "/usr/lib/python2.6/dist-packages/ganeti/storage/drbd.py", line 378, in _AssembleLocal
    minor, result.output)
  File "/usr/lib/python2.6/dist-packages/ganeti/storage/base.py", line 372, in ThrowError
    raise errors.BlockDeviceError(msg)
BlockDeviceError: drbd0: can't attach local disk: /dev/drbd0: Failure: (111) 
Low.dev. smaller than requested DRBD-dev. size.

Note that the error does not occur if the disks are not removed manually. 
This happened on the master branch (345d395d).
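
The volume sizes quoted in the reproduction steps (516 MB and 132 MB with three stripes, 512 MB and 128 MB with two) are consistent with LVM rounding a striped volume up to a whole number of extents that is a multiple of the stripe count. The sketch below only illustrates that arithmetic. The function name is hypothetical, and the 4 MiB extent size is an assumption (it is the LVM default, but the report does not state the value used on this cluster). It is not Ganeti code and does not claim to identify the root cause of the failure.

```python
def lvm_rounded_size_mib(requested_mib, stripes, extent_mib=4):
    """Return the size an LV would get after rounding up to the stripe boundary.

    Hypothetical helper for illustration only. extent_mib=4 is assumed
    (the LVM default); the actual extent size of 'xenvg' is not given
    in the report.
    """
    # Round the request up to whole extents (ceiling division).
    extents = -(-requested_mib // extent_mib)
    # Round the extent count up to a multiple of the stripe count.
    extents = -(-extents // stripes) * stripes
    return extents * extent_mib


if __name__ == "__main__":
    for stripes in (3, 2):
        data = lvm_rounded_size_mib(512, stripes)
        meta = lvm_rounded_size_mib(128, stripes)
        print("stripes=%d data=%d MiB meta=%d MiB" % (stripes, data, meta))
```

Under these assumptions, the volumes created on NODE1 with three stripes are 4 MiB larger than the ones recreated on NODE2 with two stripes, which matches the size mismatch described in the steps.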

Original issue reported on code.google.com by bdals...@google.com on 23 May 2013 at 7:32

GoogleCodeExporter commented 9 years ago

Original comment by ultrot...@google.com on 24 May 2013 at 8:44

GoogleCodeExporter commented 9 years ago

Original comment by ultrot...@google.com on 21 Jun 2013 at 11:43

GoogleCodeExporter commented 9 years ago

Original comment by ultrot...@google.com on 21 Jun 2013 at 11:45

GoogleCodeExporter commented 9 years ago
Move non-critical bugs scheduled for 2.8 or 2.9 to 2.11, as in those versions 
only critical bug fixes will be integrated.

Original comment by thoma...@google.com on 30 Oct 2013 at 9:48