zfsonfreebsd / ZoF

ZFS on FreeBSD - the official out of tree OpenZFS implementation for FreeBSD
https://freebsd.org
Other
100 stars 8 forks source link

Slow zpool and lots of time spent in arc_recl #75

Closed kmoore134 closed 5 years ago

kmoore134 commented 5 years ago

After running ZoF overnight, the various periodic scripts have kicked off, one of which runs "find" across the entire FS. This morning, while the system is still usable, the local pool is very slow, and I see the find process spending a lot of time in arc_recl state. Last time this occurred, bumping up arc_max immediately sped up the system and find command again.

Here's as much relevant detail as I could find:

kstat.txt

# ps -auwwx | grep find
root 3425    0.0  0.0  14676  6564  -  D    03:01       1:33.49 / /var/tmp /var/log /tmp /usr/home /root /usr/obj /usr/ports /usr/local/poudriere/data /usr/src /usr/jails
/var/audit /var/mail /usr/local/poudriere/ports/freenas /usr/local/poudriere/ports/freenas-zof /usr/local/poudriere/jails/bports /usr/local/poudriere/jails/freenas-zof /usr/local/poudriere/jails/trueos-mk-base /usr/local/poudriere/jails/testjail /usr/local/poudriere/ports/trueos-mk-ports /usr/local/poudriere/jails/testpkg /usr/local/poudriere/data/images /usr/local/poudriere/data/wrkdirs /usr/local/poudriere/data/packages /usr/local/poudriere/data/.m /usr/local/poudriere/data/cache /usr/local/poudriere/data/logs /dev/null (find)
# procstat basic 3425                                                                                                                                      
  PID  PPID  PGID   SID  TSID THR LOGIN    WCHAN     EMUL          COMM
 3425  3421  3308  3308     0   1 root     arc_recl  FreeBSD ELF64 find
# procstat -k 3425                                                                                                                                         
  PID    TID COMM                TDNAME              KSTACK
 3425 100822 find                -                   mi_switch sleepq_wait _cv_wait arc_get_data_impl arc_hdr_alloc_abd arc_hdr_alloc arc_read dbuf_prefetch dmu_zfetch dbuf_read dnode_hold_impl dmu_bonus_hold_impl zfs_zget zfs_dirent_lookup zfs_dirlook zfs_lookup zfs_freebsd_lookup VOP_CACHEDLOOKUP_APV                                      
root@avenger:~ # procstat -k 3425
  PID    TID COMM                TDNAME              KSTACK
 3425 100822 find                -                   mi_switch sleepq_wait _cv_wait arc_get_data_impl arc_hdr_alloc_abd arc_hdr_alloc arc_read dbuf_prefetch dmu_zfetch dbuf_read dnode_hold_impl dmu_bonus_hold_impl zfs_zget zfs_dirent_lookup zfs_dirlook zfs_lookup zfs_freebsd_lookup VOP_CACHEDLOOKUP_APV                                      
root@avenger:~ # procstat -k 3425
  PID    TID COMM                TDNAME              KSTACK
 3425 100822 find                -                   mi_switch sleepq_wait _cv_wait arc_get_data_impl arc_hdr_alloc_abd arc_hdr_alloc arc_read dbuf_read dmu_buf_hold zap_cursor_retrieve zfs_freebsd_readdir VOP_READDIR_APV kern_getdirentries sys_getdirentries amd64_syscall fast_syscall_common                                                 
root@avenger:~ # procstat -k 3425
  PID    TID COMM                TDNAME              KSTACK
 3425 100822 find                -                   mi_switch sleepq_wait _cv_wait arc_get_data_impl arc_buf_alloc_impl arc_read dbuf_read dnode_hold_impl dmu_bonus_hold_impl zfs_zget zfs_dirent_lookup zfs_dirlook zfs_lookup zfs_freebsd_lookup VOP_CACHEDLOOKUP_APV vfs_cache_lookup VOP_LOOKUP_APV lookup                                     
root@avenger:~ # procstat -k 3425
  PID    TID COMM                TDNAME              KSTACK
 3425 100822 find                -                   mi_switch sleepq_wait _cv_wait arc_get_data_impl arc_hdr_alloc_abd arc_hdr_alloc arc_read dbuf_prefetch dmu_prefetch zfs_freebsd_readdir VOP_READDIR_APV kern_getdirentries sys_getdirentries amd64_syscall fast_syscall_common                                                                 
root@avenger:~ # procstat -k 3425
  PID    TID COMM                TDNAME              KSTACK
 3425 100822 find                -                   mi_switch sleepq_wait _cv_wait arc_get_data_impl arc_hdr_alloc_abd arc_hdr_alloc arc_read dbuf_prefetch dmu_prefetch zfs_freebsd_readdir VOP_READDIR_APV kern_getdirentries sys_getdirentries amd64_syscall fast_syscall_common                                                                 
root@avenger:~ # procstat -k 3425
  PID    TID COMM                TDNAME              KSTACK
 3425 100822 find                -                   mi_switch sleepq_wait _cv_wait arc_get_data_impl arc_hdr_alloc_abd arc_hdr_alloc arc_read dbuf_prefetch dmu_zfetch dbuf_read dnode_hold_impl dmu_bonus_hold_impl zfs_zget zfs_dirent_lookup zfs_dirlook zfs_lookup zfs_freebsd_lookup VOP_CACHEDLOOKUP_APV                                      
root@avenger:~ # procstat -k 3425
  PID    TID COMM                TDNAME              KSTACK
 3425 100822 find                -                   mi_switch sleepq_wait _cv_wait arc_get_data_impl arc_hdr_alloc_abd arc_hdr_alloc arc_read dbuf_prefetch dmu_zfetch dbuf_read dnode_hold_impl dmu_bonus_hold_impl zfs_zget zfs_dirent_lookup zfs_dirlook zfs_lookup zfs_freebsd_lookup VOP_CACHEDLOOKUP_APV
# procstat -t 3425                                                                                                                                         
  PID    TID COMM                TDNAME              CPU  PRI STATE   WCHAN
 3425 100822 find                -                    -1  120 sleep   arc_recl

Mav pointed out that the "2722M Other" is concerning. I have max_arc capped at 3000M in loader.conf.


last pid:  4597;  load averages:  1.08,  1.03,  1.00                                                                                               up 0+17:47:26  09:57:02
963 threads:   26 running, 895 sleeping, 42 waiting
CPU:  0.0% user,  0.0% nice,  4.1% system,  0.0% interrupt, 95.9% idle
Mem: 35M Active, 783M Inact, 10G Wired, 52G Free
ARC: 4229M Total, 1154M MFU, 327M MRU, 1961K Anon, 24M Header, 2722M Other
Swap: 2000M Total, 2000M Free

  PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
   11 root        155 ki31      0   672K CPU4     4  17.7H 100.00% idle{idle: cpu4}
   11 root        155 ki31      0   672K CPU3     3  17.5H 100.00% idle{idle: cpu3}
   11 root        155 ki31      0   672K CPU6     6  17.5H 100.00% idle{idle: cpu6}
   11 root        155 ki31      0   672K CPU18   18  17.4H 100.00% idle{idle: cpu18}
   11 root        155 ki31      0   672K CPU5     5  17.4H 100.00% idle{idle: cpu5}
   11 root        155 ki31      0   672K CPU10   10  17.4H 100.00% idle{idle: cpu10}
   11 root        155 ki31      0   672K CPU11   11  17.3H 100.00% idle{idle: cpu11}
   11 root        155 ki31      0   672K CPU14   14  17.6H  99.85% idle{idle: cpu14}
   11 root        155 ki31      0   672K CPU12   12  17.7H  99.80% idle{idle: cpu12}
   11 root        155 ki31      0   672K CPU16   16  17.4H  99.78% idle{idle: cpu16}
   11 root        155 ki31      0   672K RUN     21  17.2H  99.67% idle{idle: cpu21}
   11 root        155 ki31      0   672K CPU1     1  17.6H  99.11% idle{idle: cpu1}
   11 root        155 ki31      0   672K CPU22   22  17.6H  98.73% idle{idle: cpu22}
   11 root        155 ki31      0   672K CPU20   20  17.5H  98.58% idle{idle: cpu20}
   11 root        155 ki31      0   672K CPU0     0  17.7H  98.47% idle{idle: cpu0}
   11 root        155 ki31      0   672K CPU8     8  17.4H  97.19% idle{idle: cpu8}
   11 root        155 ki31      0   672K CPU15   15  17.7H  95.18% idle{idle: cpu15}
   40 root         -8    -      0   224K CPU19   19 406:24  95.01% zfskern{arc_reclaim_thread}
mattmacy commented 5 years ago

arc metadata max isn't scaled back appropriately with low arc_max, so we revert to pruning only metadata if arc_max is less than 8G in 3ee97ec648ee61de4fe23f2bec842a90cd65ec69