Tendrl / monitoring-integration

Component that enables integration with external monitoring services.
GNU Lesser General Public License v2.1

RFE: volume/brick/host status values are confusing #199

Open nthomas-redhat opened 7 years ago

nthomas-redhat commented 7 years ago

Currently the statuses displayed on the dashboards (cluster, volume, host, brick, etc.) are numbers, which are meaningless to the user. Map them to appropriate strings instead.
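A minimal sketch of such a mapping, for illustration only. The codes are an assumption taken from the "0 means Down, 1 means Up" reading discussed in this issue; the names `STATUS_LABELS` and `status_label` are hypothetical and are not part of monitoring-integration.

```python
# Hypothetical mapping: the numeric codes are an assumption based on the
# "0 means Down, 1 means Up" reading in this issue, not on the real code.
STATUS_LABELS = {0: "Down", 1: "Up"}


def status_label(value):
    """Return a readable label for a numeric status value.

    Values that are not integers, or that have no known label, are
    reported as "Unknown (<value>)" instead of raising an error.
    """
    try:
        code = int(value)
    except (TypeError, ValueError):
        return "Unknown ({})".format(value)
    return STATUS_LABELS.get(code, "Unknown ({})".format(code))
```

For example, `status_label(1)` gives `"Up"`, while an unmapped code such as `2` is shown as `"Unknown (2)"` rather than being hidden.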

fbalak commented 7 years ago

It also seems incorrect. If 0 means Down and 1 means Up, then all hosts and volumes should show 1, but I see 0 for everything. [screenshot: 1016_status]

# gluster volume info

Volume Name: volume_alpha_distrep_6x2
Type: Distributed-Replicate
Volume ID: d292bc1b-e826-4e63-8d98-72df15cbb6c5
Status: Started
...
# gluster peer status
Number of Peers: 5

Hostname: fbalak-usm1-gl1
Uuid: 272c72ed-c621-4175-bb50-ff6e66563c56
State: Peer in Cluster (Connected)

Hostname: fbalak-usm1-gl2
Uuid: bd38197f-9117-4814-8fe7-a447d31d86d6
State: Peer in Cluster (Connected)

Hostname: fbalak-usm1-gl3
Uuid: e422eba8-ae51-44b6-a409-619847c358a9
State: Peer in Cluster (Connected)

Hostname: fbalak-usm1-gl4
Uuid: 81d88f7a-f141-4ab4-b279-d954086f625f
State: Peer in Cluster (Connected)

Hostname: fbalak-usm1-gl5
Uuid: 3b23daee-b902-4ae6-8380-e4443373eba0
State: Peer in Cluster (Connected)
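The expected values can be derived from the peer output above. This is only a sketch: `peer_states` is a hypothetical helper, and it assumes the 0 = Down / 1 = Up convention and the `Hostname:` / `State:` line layout shown here.

```python
def peer_states(output):
    """Map each hostname in `gluster peer status` output to 1 or 0.

    A peer counts as 1 (Up) only when its State line contains
    "(Connected)"; any other state is reported as 0 (Down).
    """
    states = {}
    host = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Hostname:"):
            # Remember the host until its State line is seen.
            host = line.split(":", 1)[1].strip()
        elif line.startswith("State:") and host is not None:
            states[host] = 1 if "(Connected)" in line else 0
            host = None
    return states
```

Run against the output above, every peer would map to 1, which is what I would expect the dashboard to show.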

Tested with:

tendrl-grafana-selinux-1.5.3-20171013T090621.ffb1b7f.noarch
tendrl-selinux-1.5.3-20171013T090621.ffb1b7f.noarch
tendrl-node-agent-1.5.3-20171013T081912.cdf0d6e.noarch
tendrl-notifier-1.5.3-20171011T200310.3c01717.noarch
tendrl-grafana-plugins-1.5.3-20171016T063749.da05bfa.noarch
tendrl-monitoring-integration-1.5.3-20171016T063749.da05bfa.noarch
tendrl-api-httpd-1.5.3-20171013T082716.a2f3b3f.noarch
tendrl-ansible-1.5.3-20171016T090020.d9e5914.noarch
tendrl-commons-1.5.3-20171013T081843.c73101a.noarch
tendrl-api-1.5.3-20171013T082716.a2f3b3f.noarch
tendrl-ui-1.5.3-20171013T082611.6e08356.noarch
glusterfs-4.0dev-0.209.git067f380.el7.centos.x86_64

julienlim commented 6 years ago

Currently, the Impacted Status Table panels that display this behavior include the following panels:

The descriptions are helpful (click the "i" in the panel's title to view them), but they are not easy to see and users may overlook them. We cannot fix this to improve usability yet because it depends on a Grafana feature that has not been implemented.

This status information is also available in the Tendrl UI, but that requires the user to jump back and forth to see it, and/or involves a learning curve.

In the cluster dashboard example below, the volume is degraded and 1 brick is down (as shown in the top panels). To figure out which volume and which brick are impacted, the user would scroll down to see this information (or look in the Tendrl UI):

[screenshot: screen shot 2018-03-22 at 12 49 55 pm]

We see similar behavior in the volume dashboard:

[screenshot: screen shot 2018-03-22 at 12 49 55 pm]

It's a bit harder to make the same connection in the host dashboard unless you see something wrong (red or orange):

[screenshot: screen shot 2018-03-22 at 12 54 31 pm]

The status tables still have value, but to be more usable they should be positioned closer to the related higher-level panel, as in the Host Dashboard, where the Bricks panel is right next to the Brick Status panel.

So my suggestion for now is to keep the status tables and to explain them better in the Documentation / Release Notes.

@nthomas-redhat @cloudbehl @Tendrl/qe @jjkabrown1 @r0h4n @mcarrano