gluster / gluster-prometheus

Gluster monitoring using Prometheus
GNU Lesser General Public License v2.1

Inconsistent data for bricks metrics #175

Open VLZZZ opened 4 years ago

VLZZZ commented 4 years ago

Problem

A GlusterFS volume with 6 bricks (replicated across 3 nodes). After resizing the volume with heketi-cli, an additional brick of a different size was added on each node. However, when exporting metrics there are duplicated brick_pids and no metrics at all for several bricks of the volume.

I thought it was the same problem described in #168, but we already use the version with that hotfix.
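For reference, the mismatch can be reproduced by scraping the exporter and comparing the number of gluster_volume_brick_pid series with the value of gluster_volume_status_brick_count. The sketch below is only an illustration; the endpoint URL/port is an assumption and should be adjusted to your deployment:

```go
package main

import (
	"fmt"
	"net/http"

	"github.com/prometheus/common/expfmt"
)

func main() {
	// NOTE: the URL/port is an assumption; point this at your gluster-exporter endpoint.
	resp, err := http.Get("http://localhost:24231/metrics")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Parse the Prometheus text exposition format into metric families.
	var parser expfmt.TextParser
	families, err := parser.TextToMetricFamilies(resp.Body)
	if err != nil {
		panic(err)
	}

	// Number of exported brick PID series (one per unique label set).
	pidSeries := 0
	if mf, ok := families["gluster_volume_brick_pid"]; ok {
		pidSeries = len(mf.GetMetric())
	}

	// Brick count reported by the exporter itself.
	// (First sample only for brevity; filter by volume_name in a real check.)
	brickCount := 0.0
	if mf, ok := families["gluster_volume_status_brick_count"]; ok && len(mf.GetMetric()) > 0 {
		brickCount = mf.GetMetric()[0].GetGauge().GetValue()
	}

	// On this volume the two numbers disagree: 3 series vs. a count of 6.
	fmt.Printf("brick_pid series: %d, brick_count: %.0f\n", pidSeries, brickCount)
}
```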

Information

gluster volume info vol_1f6bbd7c2d17834cabc5641414de13d1:

Volume Name: vol_1f6bbd7c2d17834cabc5641414de13d1
Type: Distributed-Replicate
Volume ID: 3f1f97e1-b3d4-4170-a4ac-772c3dbc62a1
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: 10.70.134.250:/var/lib/heketi/mounts/vg_0671e452b8dcd045a69f029827ab1fa2/brick_4f4a02fa038f0457c60b44b7460fe8fa/brick
Brick2: 10.70.134.195:/var/lib/heketi/mounts/vg_ce7255e7dd1922ab0ceff493c4451e4f/brick_a7ae30e683d839bf9f5f4e990f97c62c/brick
Brick3: 10.70.134.254:/var/lib/heketi/mounts/vg_f1a0d56f912d9557548602568aa0323e/brick_f42dbabd011649c8db2af17c34ba1309/brick
Brick4: 10.70.134.195:/var/lib/heketi/mounts/vg_ce7255e7dd1922ab0ceff493c4451e4f/brick_ac6f7d0f3de2d7c80ffaa9023c55375f/brick
Brick5: 10.70.134.254:/var/lib/heketi/mounts/vg_f1a0d56f912d9557548602568aa0323e/brick_36c97c15e0476981d1553345a962a16b/brick
Brick6: 10.70.134.250:/var/lib/heketi/mounts/vg_0671e452b8dcd045a69f029827ab1fa2/brick_f1c14b147b8c26dceef726a87a2d7c11/brick
Options Reconfigured:
transport.address-family: inet
nfs.disable: on

Note the 6 different brick PIDs in the status output:

gluster volume status vol_1f6bbd7c2d17834cabc5641414de13d1:

Status of volume: vol_1f6bbd7c2d17834cabc5641414de13d1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.134.250:/var/lib/heketi/mounts/
vg_0671e452b8dcd045a69f029827ab1fa2/brick_4
f4a02fa038f0457c60b44b7460fe8fa/brick       49156     0          Y       266
Brick 10.70.134.195:/var/lib/heketi/mounts/
vg_ce7255e7dd1922ab0ceff493c4451e4f/brick_a
7ae30e683d839bf9f5f4e990f97c62c/brick       49172     0          Y       264
Brick 10.70.134.254:/var/lib/heketi/mounts/
vg_f1a0d56f912d9557548602568aa0323e/brick_f
42dbabd011649c8db2af17c34ba1309/brick       49156     0          Y       267
Brick 10.70.134.195:/var/lib/heketi/mounts/
vg_ce7255e7dd1922ab0ceff493c4451e4f/brick_a
c6f7d0f3de2d7c80ffaa9023c55375f/brick       49173     0          Y       273
Brick 10.70.134.254:/var/lib/heketi/mounts/
vg_f1a0d56f912d9557548602568aa0323e/brick_3
6c97c15e0476981d1553345a962a16b/brick       49157     0          Y       276
Brick 10.70.134.250:/var/lib/heketi/mounts/
vg_0671e452b8dcd045a69f029827ab1fa2/brick_f
1c14b147b8c26dceef726a87a2d7c11/brick       49157     0          Y       273
Self-heal Daemon on localhost               N/A       N/A        Y       196
Self-heal Daemon on 10.70.134.195           N/A       N/A        Y       193
Self-heal Daemon on 10.70.134.250           N/A       N/A        Y       193

Task Status of Volume vol_1f6bbd7c2d17834cabc5641414de13d1
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : 398d867f-dcbd-4bc6-8c91-bf1eb46b84dd
Status               : completed

gluster volume status vol_1f6bbd7c2d17834cabc5641414de13d1 detail:

Status of volume: vol_1f6bbd7c2d17834cabc5641414de13d1
------------------------------------------------------------------------------
Brick                : Brick 10.70.134.250:/var/lib/heketi/mounts/vg_0671e452b8dcd045a69f029827ab1fa2/brick_4f4a02fa038f0457c60b44b7460fe8fa/brick
TCP Port             : 49156
RDMA Port            : 0
Online               : Y
Pid                  : 266
File System          : xfs
Device               : /dev/mapper/vg_0671e452b8dcd045a69f029827ab1fa2-brick_4f4a02fa038f0457c60b44b7460fe8fa
Mount Options        : rw,noatime,nouuid,attr2,inode64,sunit=64,swidth=512,noquota
Inode Size           : 512
Disk Space Free      : 8.5GB
Total Disk Space     : 22.0GB
Inode Count          : 11534080
Free Inodes          : 11493473
------------------------------------------------------------------------------
Brick                : Brick 10.70.134.195:/var/lib/heketi/mounts/vg_ce7255e7dd1922ab0ceff493c4451e4f/brick_a7ae30e683d839bf9f5f4e990f97c62c/brick
TCP Port             : 49172
RDMA Port            : 0
Online               : Y
Pid                  : 264
File System          : xfs
Device               : /dev/mapper/vg_ce7255e7dd1922ab0ceff493c4451e4f-brick_a7ae30e683d839bf9f5f4e990f97c62c
Mount Options        : rw,noatime,nouuid,attr2,inode64,sunit=64,swidth=512,noquota
Inode Size           : 512
Disk Space Free      : 9.0GB
Total Disk Space     : 22.0GB
Inode Count          : 11534080
Free Inodes          : 11493398
------------------------------------------------------------------------------
Brick                : Brick 10.70.134.254:/var/lib/heketi/mounts/vg_f1a0d56f912d9557548602568aa0323e/brick_f42dbabd011649c8db2af17c34ba1309/brick
TCP Port             : 49156
RDMA Port            : 0
Online               : Y
Pid                  : 267
File System          : xfs
Device               : /dev/mapper/vg_f1a0d56f912d9557548602568aa0323e-brick_f42dbabd011649c8db2af17c34ba1309
Mount Options        : rw,noatime,nouuid,attr2,inode64,sunit=64,swidth=512,noquota
Inode Size           : 512
Disk Space Free      : 8.5GB
Total Disk Space     : 22.0GB
Inode Count          : 11534080
Free Inodes          : 11493397
------------------------------------------------------------------------------
Brick                : Brick 10.70.134.195:/var/lib/heketi/mounts/vg_ce7255e7dd1922ab0ceff493c4451e4f/brick_ac6f7d0f3de2d7c80ffaa9023c55375f/brick
TCP Port             : 49173
RDMA Port            : 0
Online               : Y
Pid                  : 273
File System          : xfs
Device               : /dev/mapper/vg_ce7255e7dd1922ab0ceff493c4451e4f-brick_ac6f7d0f3de2d7c80ffaa9023c55375f
Mount Options        : rw,noatime,nouuid,attr2,inode64,sunit=64,swidth=512,noquota
Inode Size           : 512
Disk Space Free      : 470.1MB
Total Disk Space     : 5.0GB
Inode Count          : 970368
Free Inodes          : 962997
------------------------------------------------------------------------------
Brick                : Brick 10.70.134.254:/var/lib/heketi/mounts/vg_f1a0d56f912d9557548602568aa0323e/brick_36c97c15e0476981d1553345a962a16b/brick
TCP Port             : 49157
RDMA Port            : 0
Online               : Y
Pid                  : 276
File System          : xfs
Device               : /dev/mapper/vg_f1a0d56f912d9557548602568aa0323e-brick_36c97c15e0476981d1553345a962a16b
Mount Options        : rw,noatime,nouuid,attr2,inode64,sunit=64,swidth=512,noquota
Inode Size           : 512
Disk Space Free      : 470.1MB
Total Disk Space     : 5.0GB
Inode Count          : 970376
Free Inodes          : 963005
------------------------------------------------------------------------------
Brick                : Brick 10.70.134.250:/var/lib/heketi/mounts/vg_0671e452b8dcd045a69f029827ab1fa2/brick_f1c14b147b8c26dceef726a87a2d7c11/brick
TCP Port             : 49157
RDMA Port            : 0
Online               : Y
Pid                  : 273
File System          : xfs
Device               : /dev/mapper/vg_0671e452b8dcd045a69f029827ab1fa2-brick_f1c14b147b8c26dceef726a87a2d7c11
Mount Options        : rw,noatime,nouuid,attr2,inode64,sunit=64,swidth=512,noquota
Inode Size           : 512
Disk Space Free      : 470.1MB
Total Disk Space     : 5.0GB
Inode Count          : 970360
Free Inodes          : 962989

Then, when I try to retrieve the metrics, the only metric with consistent data is gluster_volume_status_brick_count, which shows 6 bricks:

gluster_volume_brick_pid{hostname="10.70.134.195",instance="localhost",peerid="3f6b8a82-8988-42d7-a37c-5cf567a89eab",volume_name="vol_1f6bbd7c2d17834cabc5641414de13d1"} 273
gluster_volume_brick_pid{hostname="10.70.134.250",instance="localhost",peerid="d05b7c42-fd49-4a33-9a06-4a66e4079452",volume_name="vol_1f6bbd7c2d17834cabc5641414de13d1"} 273
gluster_volume_brick_pid{hostname="10.70.134.254",instance="localhost",peerid="07e29ad9-1247-42b2-9db8-512f7baaccce",volume_name="vol_1f6bbd7c2d17834cabc5641414de13d1"} 276
gluster_volume_brick_total_bytes{hostname="10.70.134.195",instance="localhost",peerid="3f6b8a82-8988-42d7-a37c-5cf567a89eab",volume_name="vol_1f6bbd7c2d17834cabc5641414de13d1"} 5.357961216e+09
gluster_volume_brick_total_bytes{hostname="10.70.134.250",instance="localhost",peerid="d05b7c42-fd49-4a33-9a06-4a66e4079452",volume_name="vol_1f6bbd7c2d17834cabc5641414de13d1"} 5.357961216e+09
gluster_volume_brick_total_bytes{hostname="10.70.134.254",instance="localhost",peerid="07e29ad9-1247-42b2-9db8-512f7baaccce",volume_name="vol_1f6bbd7c2d17834cabc5641414de13d1"} 5.357961216e+09
gluster_volume_brick_total_inodes{hostname="10.70.134.195",instance="localhost",peerid="3f6b8a82-8988-42d7-a37c-5cf567a89eab",volume_name="vol_1f6bbd7c2d17834cabc5641414de13d1"} 970368
gluster_volume_brick_total_inodes{hostname="10.70.134.250",instance="localhost",peerid="d05b7c42-fd49-4a33-9a06-4a66e4079452",volume_name="vol_1f6bbd7c2d17834cabc5641414de13d1"} 970360
gluster_volume_brick_total_inodes{hostname="10.70.134.254",instance="localhost",peerid="07e29ad9-1247-42b2-9db8-512f7baaccce",volume_name="vol_1f6bbd7c2d17834cabc5641414de13d1"} 970376
gluster_volume_status_brick_count{instance="localhost",volume_name="vol_1f6bbd7c2d17834cabc5641414de13d1"} 6
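Looking at the label sets above, there is exactly one gluster_volume_brick_pid series per hostname, and the surviving PID/size/inode values belong to the bricks that were added last, as if the second brick on each host overwrites the first. A minimal client_golang sketch of that behaviour, assuming the exporter keys brick metrics only on hostname/peerid/volume_name (I have not verified the exact label set in the code):

```go
package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/testutil"
)

func main() {
	// Gauge labelled like the exported brick metrics above:
	// hostname, peerid and volume_name, but nothing identifying the brick itself.
	pid := prometheus.NewGaugeVec(prometheus.GaugeOpts{
		Name: "gluster_volume_brick_pid",
		Help: "Brick process PID",
	}, []string{"hostname", "peerid", "volume_name"})

	// Two different bricks on host 10.70.134.250 produce the same label set,
	// so the second Set() overwrites the first instead of creating a new series.
	pid.WithLabelValues("10.70.134.250", "d05b7c42-fd49-4a33-9a06-4a66e4079452",
		"vol_1f6bbd7c2d17834cabc5641414de13d1").Set(266)
	pid.WithLabelValues("10.70.134.250", "d05b7c42-fd49-4a33-9a06-4a66e4079452",
		"vol_1f6bbd7c2d17834cabc5641414de13d1").Set(273)

	// Only one series remains for that host, mirroring the output above.
	fmt.Println(testutil.CollectAndCount(pid)) // 1
}
```

If that is indeed what happens, adding a label that uniquely identifies the brick (for example its path) to the brick-level metrics would keep the series distinct.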