HewlettPackard / lustre_exporter

Prometheus exporter for use with the Lustre parallel filesystem
Apache License 2.0

/proc/fs/lustre/sptlrpc/encrypt_page_pools stats #68

knweiss closed this issue 7 years ago

knweiss commented 7 years ago

Another stats file that the exporter does not yet cover is the one used by Lustre's GSS/Kerberos support:

# cat /proc/fs/lustre/sptlrpc/encrypt_page_pools
physical pages:          16465072
pages per pool:          512
max pages:               2058134
max pools:               4020
total pages:             0
total free:              0
idle index:              100/100
last shrink:             1015775s
last access:             1015775s
max pages reached:       0
grows:                   0
grows failure:           0
shrinks:                 0
cache access:            0
cache missing:           0
low free mark:           0
max waitqueue depth:     0
max wait time:           0/1000
out of mem:             0

The implementation is in lustre-release.git/lustre/ptlrpc/sec_bulk.c:

int sptlrpc_proc_enc_pool_seq_show(struct seq_file *m, void *v)
{
[...]
        seq_printf(m, "physical pages:          %lu\n"
                   "pages per pool:          %lu\n"
                   "max pages:               %lu\n"
                   "max pools:               %u\n"
                   "total pages:             %lu\n"
                   "total free:              %lu\n"
                   "idle index:              %lu/100\n"
                   "last shrink:             %lds\n"
                   "last access:             %lds\n"
                   "max pages reached:       %lu\n"
                   "grows:                   %u\n"
                   "grows failure:           %u\n"
                   "shrinks:                 %u\n"
                   "cache access:            %lu\n"
                   "cache missing:           %lu\n"
                   "low free mark:           %lu\n"
                   "max waitqueue depth:     %u\n"
                   "max wait time:           "CFS_TIME_T"/%lu\n"
                   "out of mem:             %lu\n",
                   totalram_pages, PAGES_PER_POOL,
                   page_pools.epp_max_pages,
                   page_pools.epp_max_pools,
                   page_pools.epp_total_pages,
                   page_pools.epp_free_pages,
                   page_pools.epp_idle_idx,
                   (long)(ktime_get_seconds() - page_pools.epp_last_shrink),
                   (long)(ktime_get_seconds() - page_pools.epp_last_access),
                   page_pools.epp_st_max_pages,
                   page_pools.epp_st_grows,
                   page_pools.epp_st_grow_fails,
                   page_pools.epp_st_shrinks,
                   page_pools.epp_st_access,
                   page_pools.epp_st_missings,
                   page_pools.epp_st_lowfree,
                   page_pools.epp_st_max_wqlen,
                   page_pools.epp_st_max_wait,
                   msecs_to_jiffies(MSEC_PER_SEC),
                   page_pools.epp_st_outofmem);
[...]
}
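
For anyone picking this up, here is a minimal sketch of how the output above could be parsed in Go. It is not taken from the exporter; the function names parseEncryptPagePools and parseValue are hypothetical. It assumes every line has the "label: value" shape shown in the format string, that a trailing "s" marks a value in seconds, and that "N/D" values (idle index, max wait time) can be reported as the ratio N/D. Judging by the msecs_to_jiffies(MSEC_PER_SEC) denominator, the max wait time ratio looks like it comes out in seconds, but that reading is an assumption.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseEncryptPagePools parses the "label: value" lines printed by
// /proc/fs/lustre/sptlrpc/encrypt_page_pools into a map keyed by label.
// Hypothetical helper, not part of lustre_exporter.
func parseEncryptPagePools(text string) (map[string]float64, error) {
	stats := make(map[string]float64)
	scanner := bufio.NewScanner(strings.NewReader(text))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" {
			continue // tolerate blank lines
		}
		label, raw, found := strings.Cut(line, ":")
		if !found {
			return nil, fmt.Errorf("unexpected line %q", line)
		}
		value, err := parseValue(strings.TrimSpace(raw))
		if err != nil {
			return nil, fmt.Errorf("%s: %w", label, err)
		}
		stats[strings.TrimSpace(label)] = value
	}
	return stats, scanner.Err()
}

// parseValue handles the three value shapes seen in the file:
// a plain number, a number with a trailing "s", and a ratio "N/D".
// Ratios are returned as N divided by D (an assumption about how
// "idle index" and "max wait time" should be exposed).
func parseValue(raw string) (float64, error) {
	raw = strings.TrimSuffix(raw, "s")
	num, den, isRatio := strings.Cut(raw, "/")
	n, err := strconv.ParseFloat(num, 64)
	if err != nil {
		return 0, err
	}
	if !isRatio {
		return n, nil
	}
	d, err := strconv.ParseFloat(den, 64)
	if err != nil {
		return 0, err
	}
	if d == 0 {
		return 0, fmt.Errorf("zero denominator in %q", raw)
	}
	return n / d, nil
}
```

Fed the sample output from this issue, the resulting map would hold 16465072 under "physical pages", 1 under "idle index", and 1015775 under "last shrink". Keeping the raw label as the key leaves the choice of metric names to the exporter.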

joehandzik commented 7 years ago

@knweiss Do you find these statistics useful in your use of Lustre? We're currently prioritizing a few other targeted sources (client data and lnet data being the big two we had in mind). If this is a high-priority target for you, we'll prioritize it accordingly.

knweiss commented 7 years ago

Right now, it's low priority.

However, a colleague wants to enable the GSS/Kerberos support in his latest Lustre installation and was interested in these metrics when I asked him today.

roclark commented 7 years ago

Once I finish all of my current refactors, I will get started on this. If you have any information on these metrics (specifically, something I can use for help text), that will speed up the process for me.
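
As a possible starting point for the help text, the sketch below records only what can be read directly from the seq_printf argument list quoted earlier in this issue. It is not based on Lustre documentation, and the remaining labels are deliberately left out rather than guessed. The metricName helper and the lustre_encrypt_page_pools_ prefix are hypothetical and do not reflect the exporter's actual naming.

```go
package main

import "strings"

// metricName turns a label from encrypt_page_pools into a
// Prometheus-style metric name. The prefix is a hypothetical choice.
func metricName(label string) string {
	return "lustre_encrypt_page_pools_" + strings.Join(strings.Fields(label), "_")
}

// helpText maps a few labels to descriptions inferred solely from the
// arguments passed to seq_printf in sptlrpc_proc_enc_pool_seq_show.
// These are assumptions to be checked against the Lustre source.
var helpText = map[string]string{
	"physical pages": "Value of totalram_pages at the time of the read",
	"pages per pool": "Value of the PAGES_PER_POOL constant",
	"max pages":      "Value of page_pools.epp_max_pages",
	"max pools":      "Value of page_pools.epp_max_pools",
	"last shrink":    "Seconds elapsed since page_pools.epp_last_shrink",
	"last access":    "Seconds elapsed since page_pools.epp_last_access",
	"grows failure":  "Value of page_pools.epp_st_grow_fails",
}
```

With this, metricName("max waitqueue depth") yields lustre_encrypt_page_pools_max_waitqueue_depth, and a label without an entry in helpText signals that its description still needs to be written.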