integralstor / integralstor_gridcell

Gluster/ZFS based open source scale-out storage system

Incorrect number of bricks selected on volume creation #175

Closed naveenmh closed 6 years ago

naveenmh commented 7 years ago

System information:

| Type | Version/Name |
| --- | --- |
| Distribution Name | CentOS Linux |
| Distribution Version | 6.6 (Core) |
| Linux Kernel | 2.6.32-504.el6.x86_64 |
| Architecture | x86_64 |
| Integralstor Version | master, commit https://github.com/integralstor/integralstor_gridcell/commit/5dfc440df6f16a08bf6426c1b02a73bcf8d60f30 |

Describe the problem you're observing

  1. On volume creation, the number of bricks selected is incorrect.
  2. With 4 bricks available, creating a distributed volume selects only 3 bricks, and creating a replicated volume selects only 2 bricks (see the CLI sketch below).
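
For comparison, a sketch of how the same volumes could be created directly from the gluster CLI with all four bricks listed explicitly. The hostnames and brick paths are taken from the outputs below; these are illustrative commands, not necessarily what the UI issues:

    # Distributed volume across all 4 bricks (Distribute is gluster's default type)
    gluster volume create vol1_dist transport tcp \
        grid1.integralstor.lan:/frzpool/normal/vol1_dist \
        grid2.integralstor.lan:/frzpool/normal/vol1_dist \
        grid3.integralstor.lan:/frzpool/normal/vol1_dist \
        grid4.integralstor.lan:/frzpool/normal/vol1_dist

    # Replica-2 volume using all 4 bricks (a 2 x 2 distributed-replicate layout)
    gluster volume create vol2_repl replica 2 transport tcp \
        grid1.integralstor.lan:/frzpool/normal/vol2_repl \
        grid2.integralstor.lan:/frzpool/normal/vol2_repl \
        grid3.integralstor.lan:/frzpool/normal/vol2_repl \
        grid4.integralstor.lan:/frzpool/normal/vol2_repl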

Describe the expected behaviour

Describe how to reproduce the problem

Include any warning/errors/backtraces from the system logs:

Error page/screens

CLI:

  1. Distributed volume:

    [root@grid1 ~]# gluster volume info

    Volume Name: integralstor_admin_vol
    Type: Replicate
    Volume ID: 74a560d0-9c79-42d7-b0d7-21ee8f5baf74
    Status: Started
    Number of Bricks: 1 x (2 + 1) = 3
    Transport-type: tcp
    Bricks:
    Brick1: grid2.integralstor.lan:/frzpool/normal/integralstor_admin_vol
    Brick2: grid3.integralstor.lan:/frzpool/normal/integralstor_admin_vol
    Brick3: grid1.integralstor.lan:/frzpool/normal/integralstor_admin_vol (arbiter)
    Options Reconfigured:
    performance.readdir-ahead: on

    Volume Name: vol1_dist
    Type: Distribute
    Volume ID: 557b13c9-4c50-4c2e-ace0-336858336f78
    Status: Started
    Number of Bricks: 3
    Transport-type: tcp
    Bricks:
    Brick1: grid2.integralstor.lan:/frzpool/normal/vol1_dist
    Brick2: grid3.integralstor.lan:/frzpool/normal/vol1_dist
    Brick3: grid1.integralstor.lan:/frzpool/normal/vol1_dist
    Options Reconfigured:
    storage.owner-gid: 1000
    performance.readdir-ahead: on

  2. Replicated volume:

    Volume Name: vol2_repl
    Type: Replicate
    Volume ID: 16a1dd08-3757-4652-8083-bfac4c71d1b7
    Status: Started
    Number of Bricks: 1 x 2 = 2
    Transport-type: tcp
    Bricks:
    Brick1: grid2.integralstor.lan:/frzpool/normal/vol2_repl
    Brick2: grid3.integralstor.lan:/frzpool/normal/vol2_repl
    Options Reconfigured:
    storage.owner-gid: 1000
    performance.readdir-ahead: on
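
For reference, if all four bricks had been picked up, the volume headers would be expected to read along these lines (assumed output, based on gluster's usual `gluster volume info` reporting format):

    Volume Name: vol1_dist
    Type: Distribute
    Number of Bricks: 4

    Volume Name: vol2_repl
    Type: Distributed-Replicate
    Number of Bricks: 2 x 2 = 4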

System/Gridcell info:

  1. gluster peer status:

    [root@grid1 ~]# gluster peer status
    Number of Peers: 3

    Hostname: grid2.integralstor.lan
    Uuid: 3b0f9417-58b4-4ed9-9fdc-4ef19a5c0859
    State: Peer in Cluster (Connected)

    Hostname: grid3.integralstor.lan
    Uuid: 961ec0d3-4962-4b7d-b822-2d83ccdd5dee
    State: Peer in Cluster (Connected)

    Hostname: grid4.integralstor.lan
    Uuid: 5eebded7-78f7-4a7b-b87f-289c48cb690d
    State: Peer in Cluster (Connected)

  2. ctdb status:

    [root@grid1 ~]# ctdb status
    Number of nodes:4
    pnn:0 192.168.1.202 OK
    pnn:1 192.168.1.203 OK
    pnn:2 192.168.1.201 OK (THIS NODE)
    pnn:3 192.168.1.204 OK
    Generation:124192121
    Size:4
    hash:0 lmaster:0
    hash:1 lmaster:1
    hash:2 lmaster:2
    hash:3 lmaster:3
    Recovery mode:NORMAL (0)
    Recovery master:2

Screenshots (attached): volume being created on only 2 gridcells (1 image) and on only 3 gridcells (2 images).

fractalram commented 7 years ago

This is expected behaviour: the system tries to do a replica 2 over the existing bricks, and therefore takes the largest multiple of 2 that does not exceed the number of available nodes.
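
A minimal sketch of the selection rule described above, assuming N available nodes and a replica count R (the variable names are illustrative, not taken from the codebase):

    # Bricks used = the largest multiple of R that does not exceed N
    N=3; R=2; echo $(( N / R * R ))   # -> 2: replica-2 with 3 healthy nodes uses 2 bricks
    N=4; R=2; echo $(( N / R * R ))   # -> 4: with all 4 nodes healthy, all 4 bricks fit
    N=4; R=1; echo $(( N / R * R ))   # -> 4: a distributed volume should take all 4

Under this rule, the 2-brick replicated volume and the 3-brick distributed volume reported above are both consistent with only 3 of the 4 nodes having been available at creation time.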

naveenmh commented 7 years ago

For the replicated volume that is correct. But when we have 4 bricks, shouldn't it consider all 4 available bricks when creating a distributed volume?

fractalram commented 7 years ago

According to the output of the commands given above, when you tried it with 3 nodes, it did create a volume with 3 bricks. Did you try it again with 4 healthy nodes? What did you get?

naveenmh commented 7 years ago

Will try with all 4 healthy nodes and attempt to reproduce this. It may be that the 4th node was not responding to Salt calls during volume creation.
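
Before retrying, one way to confirm that all 4 nodes are responsive at both the Gluster and the Salt layer (standard gluster and salt CLI calls; the '*' minion target is an assumption about how the minions are named):

    # All 3 peers should report "State: Peer in Cluster (Connected)"
    gluster peer status

    # Every minion should answer True; a missing or False entry would point
    # at the unresponsive 4th node suspected above
    salt '*' test.ping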