olopez32 / ganeti

Automatically exported from code.google.com/p/ganeti

hroller (& hbal) treat ext/rbd disk_templates as "non-redundant" #950

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
> What software version are you running? Please provide the output of
> "gnt-cluster --version", "gnt-cluster version", and "hspace --version".

$ gnt-cluster --version
gnt-cluster (ganeti v2.10.3-27-gfcf69d1) 2.10.7
$ gnt-cluster version
Software version: 2.10.7
Internode protocol: 2100000
Configuration format: 2100000
OS api version: 20
Export interface: 0
VCS version: (ganeti) version v2.10.3-27-gfcf69d1

$ hspace --version
compiled with ghc 7.4
running on linux x86_64

(We also have Synnefo installed and have incorporated a few backported commits 
from a later Ganeti version, but our devs assure me their code makes no 
changes to Ganeti's version of hroller and hbal, which are the problem tools 
described below.)

> What distribution are you using?

Debian 7.6

> What steps will reproduce the problem?

1. hroller --skip-non-redundant -G default -L (skips nodes with "ext" or "rbd" 
instances on them, regardless of whether they are "redundant" or "non-redundant")

or

1. hbal --no-disk-moves -G default -L (doesn't output migrations for "ext" or 
"rbd" templates)

> What is the expected output? What do you see instead?

This is probably as much a design decision as a bug fix. If I'm not 
mistaken, two orthogonal concepts are being conflated. Strictly speaking, both 
ext and rbd can be "redundant" or "non-redundant", but in either case they are 
"purely external storage" (or at least "storage not managed by Ganeti") and 
therefore only need "migration", not "moving". Yet hroller and hbal treat them 
as "non-redundant". There seem to be three problems:

 1) hbal thinks ext/rbd instances are "movable" (so --no-disk-moves excludes them) rather than "migratable", which is wrong

 2) it is impossible for the tools to know whether ext/rbd are "redundant" or not anyway, because Ganeti doesn't manage them, so the redundant/non-redundant semantics get very confusing/misleading when dealing with them

 3) it would be (more?) useful for hroller to have a flag which distinguishes which of the following two circumstances an instance is in:
   - "having primary-local data"
   - "having only external, secondary-local, or no data"
so the flag would mean "skip/ignore any instance that will freeze/crash if I 
reboot the node" (or "skip/ignore whatever needs moving instead of migrating"). 
This would sanely categorise drbd/ext/rbd/blockdev/sharedfile/diskless in the 
second group, and file/plain in the first group. Perhaps 
"--ignore-local-primary" and "--skip-local-primary"? Whether such a flag 
would be in addition to --{ignore,skip}-non-redundant or would replace it is 
obviously a design decision.
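
To make the grouping proposed in (3) concrete, here is a minimal Python sketch. It is not Ganeti or htools code and does not reflect how hroller is implemented. The names DataLocality, classify and nodes_with_primary_local_data are hypothetical, and the template-to-group mapping is simply the one suggested above.

```python
from enum import Enum


class DataLocality(Enum):
    """Hypothetical two-way split proposed in point (3)."""
    PRIMARY_LOCAL = "primary-local data"
    NOT_PRIMARY_LOCAL = "only external, secondary-local, or no data"


# Assumption: grouping copied from the proposal, not from Ganeti's source.
PRIMARY_LOCAL_TEMPLATES = frozenset({"file", "plain"})
NOT_PRIMARY_LOCAL_TEMPLATES = frozenset(
    {"drbd", "ext", "rbd", "blockdev", "sharedfile", "diskless"}
)


def classify(disk_template):
    """Map a disk template name to one of the two proposed groups."""
    if disk_template in PRIMARY_LOCAL_TEMPLATES:
        return DataLocality.PRIMARY_LOCAL
    if disk_template in NOT_PRIMARY_LOCAL_TEMPLATES:
        return DataLocality.NOT_PRIMARY_LOCAL
    raise ValueError("unknown disk template: %r" % (disk_template,))


def nodes_with_primary_local_data(instances):
    """Return the primary nodes a "--skip-local-primary"-style flag would skip.

    `instances` is an iterable of (primary_node, disk_template) pairs.
    A node is included if at least one instance on it would freeze or
    crash on reboot because its data lives only on that node.
    """
    return {
        node
        for node, template in instances
        if classify(template) is DataLocality.PRIMARY_LOCAL
    }
```

With this split, a node hosting only ext, rbd or drbd instances would stay eligible for a rolling reboot, while a node hosting a plain or file instance would be skipped (or the instance ignored, for the "--ignore-local-primary" variant).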

Original issue reported on code.google.com by ro...@rowanthorpe.com on 19 Sep 2014 at 5:04

GoogleCodeExporter commented 9 years ago
(3) would be feasible now: add an additional flag, if it turns out to be 
urgent.

Parts (1) and (2) need to be discussed together with the upcoming design 
changes for disks as separate entities.

Original comment by pud...@google.com on 27 May 2015 at 1:46