nicolargo / glances

Glances an Eye on your system. A top/htop alternative for GNU/Linux, BSD, Mac OS and Windows operating systems.
http://nicolargo.github.io/glances/

ZFS Monitoring #873

Open hndrewaall opened 8 years ago

hndrewaall commented 8 years ago

It would be great to have monitoring/alerting for ZFS pools. Set alerts for degraded state, watch scrub/repair status, etc.

nicolargo commented 8 years ago

Testbed for Linux users (or for developers without a ZFS pool):

$ dd if=/dev/zero of=/tmp/file1 count=100000 bs=1024
$ sudo zpool create zsfpool /tmp/file1

$ zpool list
NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zsfpool    80M   108K  79.9M        -         -     3%     0%  1.00x    ONLINE  -

$ zpool list -v
NAME                          SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zsfpool                        80M   108K  79.9M        -         -     3%     0%  1.00x    ONLINE  -
  /home/nicolargo/tmp/file1  97.5M   108K  79.9M        -         -     3%  0.13%      -    ONLINE

$ zpool iostat 
              capacity     operations     bandwidth 
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zsfpool      108K  79.9M      0      1    151  10.7K

$ zpool iostat -v
                               capacity     operations     bandwidth 
pool                         alloc   free   read  write   read  write
---------------------------  -----  -----  -----  -----  -----  -----
zsfpool                       108K  79.9M      0      1    140  9.99K
  /home/nicolargo/tmp/file1   108K  79.9M      0      1    140  9.99K
---------------------------  -----  -----  -----  -----  -----  -----

$ zpool status
  pool: zsfpool
 state: ONLINE
config:

    NAME                         STATE     READ WRITE CKSUM
    zsfpool                      ONLINE       0     0     0
      /home/nicolargo/tmp/file1  ONLINE       0     0     0

errors: No known data errors

Also be sure to uncomment the following line in the Glances conf file:

[fs]
allow=zfs
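For a quick programmatic check against the loopback pool above, the pool name, health and capacity can be read by parsing zpool list in scripted mode. A minimal Python sketch (Glances itself is Python, but this parser is illustrative, not Glances code; the function names are mine):

```python
import subprocess

def parse_zpool_list(output: str) -> dict:
    """Parse `zpool list -H -o name,health,cap` output.

    -H suppresses the header and separates columns with tabs,
    so each line is simply: name<TAB>health<TAB>cap.
    """
    pools = {}
    for line in output.strip().splitlines():
        name, health, cap = line.split("\t")
        pools[name] = {"health": health, "cap": cap}
    return pools

def zpool_health() -> dict:
    """Run zpool and return the parsed pools (requires OpenZFS installed)."""
    out = subprocess.run(
        ["zpool", "list", "-H", "-o", "name,health,cap"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_zpool_list(out)
```

Alerting on a degraded pool then reduces to checking whether any health value differs from ONLINE.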
nicolargo commented 8 years ago

First issue: only root can read ZFS pool status:

$ zpool status zsfpool
connect: Permission denied
internal error: failed to initialize ZFS library
$ sudo zpool status zsfpool
  pool: zsfpool
 state: ONLINE
 scrub: none requested
config:

    NAME        STATE     READ WRITE CKSUM
    zsfpool     ONLINE       0     0     0
      /file1    ONLINE       0     0     0

errors: No known data errors
hndrewaall commented 7 years ago

Aren't there other examples of data that require root (sensor info), no?

nicolargo commented 7 years ago

Nope, sensors do not need root rights...

One workaround is to configure the sudoers file not to ask for a password when the sudo zpool status zsfpool command is run. Not a big fan of that...

bjornstromberg commented 6 years ago

A decent workaround for this one is a sudoers.d conf file with no password required for just that one command, zpool status.

Anyway, I just found out about this project, and from a quick look it seems like a keeper.
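The sudoers.d file suggested above could look like the following. The file name, user name, and zpool path are assumptions; check the real path with `which zpool` on your system, and always edit via visudo:

```
# /etc/sudoers.d/zpool -- hypothetical file name
# Install with: visudo -f /etc/sudoers.d/zpool
# Allow the "glances" user (substitute your own) to run only
# "zpool status" as root, with no password prompt.
glances ALL=(root) NOPASSWD: /sbin/zpool status
```

Restricting the rule to the exact command and arguments keeps the privilege grant as narrow as possible.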

travnewmatic commented 6 years ago

Salutations!

I've managed to edit my /etc/sudoers.d/zfs to allow sudo zpool status without requiring a password, but I'm still not sure how to verify that Glances is able to do what it needs.

Should I see something saying my pool is OK if there's nothing wrong, or will I see something only if something is wrong? I have set allow=zfs in my /etc/glances/glances.conf.

Thanks for such an awesome project!

-Travis

kr4z33 commented 4 years ago

Hello, I am running a recent Linux Mint and setting up ZFS. I notice I am able to run zpool status and other ZFS enumeration commands as a non-root user. Does this mean Glances can now monitor ZFS status more fully?

senorsmile commented 4 years ago

I confirm that on Ubuntu 20.04 (out of the box) the zpool and zfs commands can be run as a regular user.

fusionstream commented 4 years ago

Is there a follow up to this?

nicolargo commented 4 years ago

Ok, tested on Ubuntu 20.04: the zpool status zsfpool command line can be executed as a regular user.

I need to understand what kind of additional information (specific to ZFS) you want to display in Glances.

For the moment the pool is displayed as a standard mount point:

Screenshot from 2020-08-22 12-14-22

fusionstream commented 4 years ago

Perhaps the most critical for me would be, on a per-pool basis:

Then perhaps cool informational stuff would be:

fusionstream commented 4 years ago

zpool list also has some cool informational stuff in a single line per pool, which covers points 2 and 5 of the informational list above, but critically it doesn't tell you whether dedup is on or off (I have a 1.00x dedup ratio because dedup is off). It also has info for FRAG and general health status.
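On the dedup point: dedup is a dataset-level property, so zfs get -H -o name,value dedup reports on/off directly even when the pool ratio shows 1.00x. A small illustrative parser (function name is mine, not Glances code):

```python
def parse_dedup(output: str) -> dict:
    """Parse `zfs get -H -o name,value dedup` output into {dataset: enabled?}.

    -H makes the output tab-separated with no header; any value other
    than "off" (on, verify, sha256, ...) means dedup is enabled.
    """
    enabled = {}
    for line in output.strip().splitlines():
        name, value = line.split("\t")
        enabled[name] = value != "off"
    return enabled
```

Feeding it the output of `zfs get -H -o name,value dedup` (run without root on recent OpenZFS) distinguishes a 1.00x ratio with dedup off from one with dedup on.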

nicolargo commented 4 years ago

Thanks @fusionstream. It is a lot of information for the space available in the sidebar... We need a mockup with all the states.

fusionstream commented 4 years ago

No problem. I'm using it solely in Home Assistant at this time, so full disclosure: the space issue is something I have not yet experienced fully. What's a mockup here, and how can I help?

nicolargo commented 4 years ago

@fusionstream can you make a mockup using a basic text editor ?

kr4z33 commented 4 years ago

I'll take a stab at it, the following will be incomplete, but perhaps a useful start for discussion:

FILE SYS      Used  Total
_ocker/aufs   182G   227G
-BEGIN ZFS MOCK-UP STUFF-
Zpool   _truncatezpoolnam

That last line would be the zpool name (truncated as required to fit); the following would be repeated for each zpool on the system.

ONE of the following lines, containing the status state for the pool, would follow; the xxxx's would be replaced with the actual numbers:

ONLINE       xxxxG  xxxxG
DEGRADED     xxxxG  xxxxG
SUSPENDED    xxxxG  xxxxG
FAULTED
UNAVAILABLE
OFFLINE

The zpool may be in various states of scrubbing or resilvering; one of the following groups would follow:

SCRUB RUNNING      __.__%
REPAIRED            xxxxB

SCRUB COMPLETE--no errors
yy-mm-ddThh:mm:ss

SCRUB COMPLETE --- errors
yy-mm-ddThh:mm:ss
REPAIRED            xxxxB

RESILVER COMPLETE--No Err
yy-mm-ddThh:mm:ss

RESILVER COMPLETE--errors
yy-mm-ddThh:mm:ss
REPAIRED            xxxxB

RESILVER RUNNING   __.__%
REPAIRED            xxxxB

Following this would be the configuration of the zpool. This can be presented in several ways, and there should be some mechanism to toggle among the options here, or others. The following samples assume a pool of two mirrored vdevs; some thought would be required to accommodate other configurations and vdev types (logs, RAIDZ, etc.). I am less familiar with those, so I am going to keep the scope limited in this mock-up. With large pools this may consume vertical real estate; how to handle that is a problem for later.

Another command available to non-root users is zpool iostat -v. Some of this would come from there, and it could be refreshed for a live view of data movement, which would be of interest.

In these samples, "_xxx..." is a truncated drive name. These may be in the form of sda, sdb, etc., or longer disk ID strings that will need truncation to fit.

"nnn" is a number.

With no errors, capacity may be of primary concern for someone mucking about configuring a zpool:

config_cap.  alloc   free
  Mirror-0   xxxxG  xxxxG
    _xxxxxx      -      -
    _xxxxxx      -      -
  Mirror-1   xxxxG  xxxxG
    _xxxxxx      -      -
    _xxxxxx      -      - 

Someone keeping track of a pool in production might be interested in IO performance, presented like the following, displaying operations:

config_ops.   read  write
  Mirror-0     nnn    nnn
    _xxxxxxxx  nnn    nnn
    _xxxxxxxx  nnn    nnn
  Mirror-1     nnn    nnn
    _xxxxxxxx  nnn    nnn
    _xxxxxxxx  nnn    nnn

or the following as bandwidth:

config_BW.    read  write
  Mirror-0     nnn    nnn
    _xxxxxxxx  nnn    nnn
    _xxxxxxxx  nnn    nnn
  Mirror-1     nnn    nnn
    _xxxxxxxx  nnn    nnn
    _xxxxxxxx  nnn    nnn 

If the pool status is not "ONLINE", the config with per-device states would likely be of interest:

Config              State        
mirror-0
  _xxxxxxxx   UNAVAILABLE
  _xxxxxxxx   UNAVAILABLE
mirror-n
  _xxxxxxxx
  replacing      DEGRADED
    _xxxxxx       OFFLINE
    _xxxxxx        ONLINE

Or the following, showing error counts for Read/Write/Checksum:

Config_Err.   R   W   CHK    
  Mirror-0   nn  nn   nnn  
    _xxxxxx  nn  nn   nnn
    _xxxxxx  nn  nn   nnn
  Mirror-1   nn  nn   nnn
    _xxxxxx  nn  nn   nnn
    _xxxxxx  nn  nn   nnn

Following the display of one of the above configuration presentations, ZFS file systems would be displayed. Each one may or may not be mounted, which might be indicated by a color change or something. This information comes from the zfs list command; the AVAIL part of that output is not of interest here because it would be displayed at the pool level. The filesystem names will need to be truncated to fit:

FILE SYS      Used  Refer
_zfsFileSys  xxxxG  xxxxG
_zfsFileSys  xxxxG  xxxxG
_zfsFileSys  xxxxG  xxxxG
...
...
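The truncation rule used throughout the mockup (keep the tail of the name and mark the cut with a leading "_") can be sketched as a small formatter. The 25-column width matches the mockup; everything else, including the function name, is illustrative:

```python
def fmt_row(name: str, used: str, refer: str, width: int = 25) -> str:
    """Format one sidebar row: left-truncate the name (marking it
    with a leading '_', as in the mockup) so the two value columns
    stay right-aligned within the given total width."""
    values = f"{used:>6} {refer:>6}"          # two right-aligned columns
    max_name = width - len(values) - 1        # one space before the values
    if len(name) > max_name:
        name = "_" + name[-(max_name - 1):]   # keep the tail of the name
    return f"{name:<{max_name}} {values}"
```

Keeping the tail of the name matters for ZFS, since nested dataset names share long common prefixes and only the final components distinguish them.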
nicolargo commented 2 years ago

Postponed because the information needed cannot be integrated into the current FS plugin. The proposal is to stay with the current feature set in Glances v3.x.

The result is the following in the current develop branch:

(screenshot)

In Glances version 4 a dedicated plugin should be created (see the branch https://github.com/nicolargo/glances/tree/glancesv4).

Contributors are welcome.

nicolargo commented 1 week ago

For contributors: have a look at https://pypi.org/project/zpool-status/