
ZFS 2.1.2 + NFSv4 ARC stuck pruning/evicting, flatlines throughput #13079

Closed: remingtonc closed this issue 1 year ago

remingtonc commented 2 years ago

System information

Type Version/Name
Distribution Name Ubuntu
Distribution Version 20.04.3 LTS
Kernel Version 5.4.0-96-generic
Architecture x86_64
OpenZFS Version zfs-2.1.2-1

Describe the problem you're observing

ZFS throughput has flatlined, with the arc_evict and arc_prune kernel threads spinning at 100% CPU. The workload is the kernel NFS server (all NFSv4 clients) on ZFS 2.1.2 built from source. The condition is characterized by high CPU iowait and flatlined throughput.
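
A minimal way to watch this condition develop (a sketch, assuming the stock 2.1.x kstat layout under /proc/spl/kstat/zfs):

# key ARC counters, once per second (arc_prune and evict_skip climb rapidly when stuck)
watch -n1 "grep -wE 'size|arc_meta_used|arc_meta_limit|arc_prune|evict_skip' /proc/spl/kstat/zfs/arcstats"
# CPU usage of the spinning kernel threads
top -b -n1 -p "$(pgrep -d, 'arc_evict|arc_prune')"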

RAM Graph

[screenshot: RAM usage graph, captured 2022-02-09 9:06 AM]

It's holding on to RAM pretty hard. This is where I begin to lose debugging expertise, having discovered slabs yesterday. :-)

top

top - 16:55:29 up 21 days, 14:56,  1 user,  load average: 195.23, 124.60, 134.24
Tasks: 993 total,   2 running, 991 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  7.3 sy,  0.0 ni,  5.0 id, 87.3 wa,  0.0 hi,  0.4 si,  0.0 st
MiB Mem : 128622.1 total,   2501.0 free, 121637.5 used,   4483.5 buff/cache
MiB Swap:   1907.0 total,   1042.7 free,    864.2 used.   6078.4 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 245944 root      20   0       0      0      0 R 100.0   0.0 549:36.86 arc_evict
 245943 root      20   0       0      0      0 S  93.1   0.0 459:13.30 arc_prune
 246555 root      20   0  953064  20876   2732 S   3.6   0.0  29:32.85 zed
3174406 root      20   0   12784   4872   3248 R   1.3   0.0   0:00.17 top
  14154 root      rt   0  356604  31088   8300 S   0.7   0.0 133:03.60 multipathd
3159749 root      20   0       0      0      0 D   0.7   0.0   0:00.11 nfsd

slabtop

# slabtop
Active / Total Objects (% used)    : 215263329 / 222780310 (96.6%)
 Active / Total Slabs (% used)      : 6020764 / 6020764 (100.0%)
 Active / Total Caches (% used)     : 141 / 199 (70.9%)
 Active / Total Size (% used)       : 98093294.62K / 99549181.36K (98.5%)
 Minimum / Average / Maximum Object : 0.01K / 0.45K / 16.75K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
27437256 27437256 100%    0.97K 831432       33  26605824K dnode_t
25866414 25866282  99%    0.38K 615867       42   9853872K dmu_buf_impl_t
24953184 24952001  99%    0.50K 779787       32  12476592K kmalloc-512
20654336 18708128  90%    0.06K 322724       64   1290896K kmalloc-64
14428134 14148450  98%    0.09K 343527       42   1374108K arc_buf_hdr_t_l2only
10290924 10290229  99%    0.09K 245022       42    980088K kmalloc-96
10017152 9881010  98%    0.03K  78259      128    313036K kmalloc-32
7602798 7575100  99%    0.19K 181019       42   1448152K dentry
7255278 7139757  98%    1.09K 250182       29   8005824K zfs_znode_cache
7225482 7140416  98%    0.24K 218954       33   1751632K sa_cache
5172144 5172144 100%    0.16K 107753       48    862024K nfsd4_stateids
5079424 5077737  99%    0.25K 158732       32   1269856K filp
4988088 4985288  99%    0.19K 118764       42    950112K cred_jar
4952610 4952610 100%    0.02K  29133      170    116532K lsm_file_cache
4951520 4951520 100%    0.28K 176840       28   1414720K nfsd4_files
4949376 4949376 100%    0.03K  38667      128    154668K fsnotify_mark_connector
4947816 4947816 100%    0.08K  97016       51    388064K Acpi-State
4947372 4947372 100%    0.11K 137427       36    549708K khugepaged_mm_slot
4862528 4858723  99%    0.06K  75977       64    303908K kmalloc-rcl-64
3813933 1054784  27%    0.31K  74783       51   1196528K arc_buf_hdr_t_full
3601017 3601017 100%    0.05K  49329       73    197316K nsproxy
3022149 1059649  35%    0.10K  77491       39    309964K abd_t
2598288 2591411  99%    0.57K  92796       28   1484736K radix_tree_node
1523068 1523068 100%    0.42K  41164       37    658624K nfsd4_openowners
1248128 1247029  99%    1.00K  39004       32   1248128K kmalloc-1k
1128681 1052509  93%    0.08K  22131       51     88524K arc_buf_t
960386 960375  99%   16.00K 480193        2  15366176K zio_buf_comb_16384
945488 942004  99%    8.00K 236372        4   7563904K kmalloc-8k
290745 268913  92%    0.10K   7455       39     29820K buffer_head
288000 287664  99%    0.13K   9600       30     38400K kernfs_node_cache
228160 226423  99%    0.06K   3565       64     14260K anon_vma_chain
211894 204295  96%    0.59K   3998       53    127936K inode_cache
209664 209664 100%    0.02K    819      256      3276K kmalloc-16
187432 187432 100%    0.07K   3347       56     13388K Acpi-Operand
169611 168672  99%    0.20K   4349       39     34792K vm_area_struct
121856 121856 100%    0.01K    238      512       952K kmalloc-8
116978 116702  99%    0.09K   2543       46     10172K anon_vma
107856  99165  91%    0.09K   2568       42     10272K kmalloc-rcl-96

zpool

# zpool status pod-10
  pool: pod-10
 state: ONLINE
config:

    NAME                              STATE     READ WRITE CKSUM
    pod-10                            ONLINE       0     0     0
      raidz3-0                        ONLINE       0     0     0
        35000c500ae29a4bb             ONLINE       0     0     0
        35000c500ae95def3             ONLINE       0     0     0
        35000c500ae96d1d7             ONLINE       0     0     0
        35000c500ae9729af             ONLINE       0     0     0
        35000c500ae97296f             ONLINE       0     0     0
        35000c500ae96daeb             ONLINE       0     0     0
        35000c500ae968243             ONLINE       0     0     0
        35000c500ae97269b             ONLINE       0     0     0
        35000c500ae970cd7             ONLINE       0     0     0
        35000c500ae975a7f             ONLINE       0     0     0
        35000c500ae957c3b             ONLINE       0     0     0
      raidz3-1                        ONLINE       0     0     0
        35000c500ae96870f             ONLINE       0     0     0
        35000c500ae2bc057             ONLINE       0     0     0
        35000c500ae9733b7             ONLINE       0     0     0
        35000c500ae2bc4db             ONLINE       0     0     0
        35000c500ae96b4bf             ONLINE       0     0     0
        35000c500ae970e3b             ONLINE       0     0     0
        35000c500ae957bab             ONLINE       0     0     0
        35000c500ae96aa6f             ONLINE       0     0     0
        35000c500ae96833f             ONLINE       0     0     0
        35000c500ae96a4a3             ONLINE       0     0     0
        35000c500ae2a6e9b             ONLINE       0     0     0
      raidz3-2                        ONLINE       0     0     0
        35000c500ae96c3f7             ONLINE       0     0     0
        35000c500ae972ddf             ONLINE       0     0     0
        35000c500ae96bb4f             ONLINE       0     0     0
        35000c500ae95d66f             ONLINE       0     0     0
        35000c500ae96777f             ONLINE       0     0     0
        35000c500ae60fec3             ONLINE       0     0     0
        35000c500ae96fcef             ONLINE       0     0     0
        35000c500ae966b0b             ONLINE       0     0     0
        35000c500ae96c823             ONLINE       0     0     0
        35000c500ae95e363             ONLINE       0     0     0
        35000c500ae96fab3             ONLINE       0     0     0
      raidz3-3                        ONLINE       0     0     0
        35000c500ae34ceeb             ONLINE       0     0     0
        35000c500ae34d4c7             ONLINE       0     0     0
        35000c500ae970af7             ONLINE       0     0     0
        35000c500ae9597cb             ONLINE       0     0     0
        35000c500ae62c433             ONLINE       0     0     0
        35000c500ae968857             ONLINE       0     0     0
        35000c500ae970167             ONLINE       0     0     0
        35000c500ae63517f             ONLINE       0     0     0
        35000c500ae961313             ONLINE       0     0     0
        35000c500ae95d53b             ONLINE       0     0     0
        35000c500ae95cc5b             ONLINE       0     0     0
      raidz3-4                        ONLINE       0     0     0
        35000c500ae9737c3             ONLINE       0     0     0
        35000c500ae970feb             ONLINE       0     0     0
        35000c500ae9686f3             ONLINE       0     0     0
        35000c500ae97387b             ONLINE       0     0     0
        35000c500ae97403f             ONLINE       0     0     0
        35000c500ae95711f             ONLINE       0     0     0
        35000c500ae96cb23             ONLINE       0     0     0
        35000c500ae2a6db7             ONLINE       0     0     0
        35000c500ae9681a3             ONLINE       0     0     0
        35000c500ae9688b7             ONLINE       0     0     0
        35000c500ae97404b             ONLINE       0     0     0
      raidz3-5                        ONLINE       0     0     0
        35000c500ae956b87             ONLINE       0     0     0
        35000c500ae974bf3             ONLINE       0     0     0
        35000c500ae9744fb             ONLINE       0     0     0
        35000c500ae29e993             ONLINE       0     0     0
        35000c500ae96ef57             ONLINE       0     0     0
        35000c500ae974d2b             ONLINE       0     0     0
        35000c500ae970a0f             ONLINE       0     0     0
        35000c500ae39455f             ONLINE       0     0     0
        35000c500ae29d97f             ONLINE       0     0     0
        35000c500ae95712b             ONLINE       0     0     0
        35000c500ae9742c3             ONLINE       0     0     0
      raidz3-6                        ONLINE       0     0     0
        35000c500ae96fc4b             ONLINE       0     0     0
        35000c500ae955eef             ONLINE       0     0     0
        35000c500ae95c243             ONLINE       0     0     0
        35000c500ae974057             ONLINE       0     0     0
        35000c500ae95e4cb             ONLINE       0     0     0
        35000c500ae96eab3             ONLINE       0     0     0
        35000c500ae96c9bb             ONLINE       0     0     0
        35000c500ae959fd7             ONLINE       0     0     0
        35000c500ae2a709f             ONLINE       0     0     0
        35000c500ae96d793             ONLINE       0     0     0
        35000c500ae9728ab             ONLINE       0     0     0
      raidz3-7                        ONLINE       0     0     0
        35000c500ae962117             ONLINE       0     0     0
        35000c500ae4044bf             ONLINE       0     0     0
        35000c500ae393b9b             ONLINE       0     0     0
        35000c500ae974223             ONLINE       0     0     0
        35000c500ae95f193             ONLINE       0     0     0
        35000c500ae957abb             ONLINE       0     0     0
        35000c500ae9563a3             ONLINE       0     0     0
        35000c500ae9683ab             ONLINE       0     0     0
        35000c500ae96df8f             ONLINE       0     0     0
        35000c500ae96d543             ONLINE       0     0     0
        35000c500ae972b2f             ONLINE       0     0     0
      raidz3-8                        ONLINE       0     0     0
        35000c500ae960517             ONLINE       0     0     0
        35000c500ae95555b             ONLINE       0     0     0
        35000c500ae974637             ONLINE       0     0     0
        35000c500ae972dd7             ONLINE       0     0     0
        35000c500ae4323c7             ONLINE       0     0     0
        35000c500ae96d617             ONLINE       0     0     0
        35000c500ae9638b7             ONLINE       0     0     0
        35000c500ae96ea0f             ONLINE       0     0     0
        35000c500ae96e3eb             ONLINE       0     0     0
        35000c500ae29a417             ONLINE       0     0     0
        35000c500ae96fda3             ONLINE       0     0     0
    logs
      mirror-9                        ONLINE       0     0     0
        wwn-0x55cd2e4152220c93-part1  ONLINE       0     0     0
        wwn-0x55cd2e41519b023b-part1  ONLINE       0     0     0
    cache
      wwn-0x55cd2e4152220c93-part2    ONLINE       0     0     0
      wwn-0x55cd2e41519b023b-part2    ONLINE       0     0     0
      wwn-0x55cd2e415221c8c7          ONLINE       0     0     0
    spares
      35000c500ae97206f               AVAIL
      35000c500ae965dbb               AVAIL
      35000c500ae2b926f               AVAIL
      35000c500ae976187               AVAIL
      35000c500ae96dcf7               AVAIL
      35000c500ae958777               AVAIL
      35000c500ae41875f               AVAIL

errors: No known data errors

zfs fs

# zfs list
NAME            USED  AVAIL     REFER  MOUNTPOINT
pod-10          120T   926T      279K  /pod-10
pod-10/pod-10   120T   926T      120T  /srv/pod-10
# zfs get primarycache,secondarycache pod-10/pod-10
NAME           PROPERTY        VALUE           SOURCE
pod-10/pod-10  primarycache    metadata        local
pod-10/pod-10  secondarycache  metadata        local

The intention is for the ARC/L2ARC to hold metadata only.
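
For reference, that policy is set per dataset with the standard properties:

zfs set primarycache=metadata pod-10/pod-10
zfs set secondarycache=metadata pod-10/pod-10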

initial arcstats

arcstats reported metadata usage above the limit.

ARC size (current):                                   102.8 %   64.6 GiB
        Target size (adaptive):                       100.0 %   62.8 GiB
        Min size (hard limit):                          6.2 %    3.9 GiB
        Max size (high water):                           16:1   62.8 GiB
        Most Frequently Used (MFU) cache size:         13.0 %    2.6 GiB
        Most Recently Used (MRU) cache size:           87.0 %   17.7 GiB
        Metadata cache size (hard limit):              75.0 %   47.1 GiB
        Metadata cache size (current):                135.8 %   64.0 GiB
        Dnode cache size (hard limit):                 10.0 %    4.7 GiB
        Dnode cache size (current):                   537.3 %   25.3 GiB
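
The same overshoot can be read straight from the raw kstats; a quick check (assuming the 2.1.x arcstats field names) comparing metadata and dnode usage against their limits:

awk '$1 == "arc_meta_used" || $1 == "arc_meta_limit" || $1 == "dnode_size" || $1 == "arc_dnode_limit" {printf "%-16s %8.1f GiB\n", $1, $3/2^30}' /proc/spl/kstat/zfs/arcstats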

initial adjustment

Attempted to remediate by growing the ARC by half of the remaining RAM and raising the metadata and dnode limits within the ARC:

echo 86762369024 > /sys/module/zfs/parameters/zfs_arc_max
echo 90 > /sys/module/zfs/parameters/zfs_arc_meta_limit_percent
echo 50 > /sys/module/zfs/parameters/zfs_arc_dnode_limit_percent
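
These writes do not survive a reboot; the persistent equivalent (assuming module options are loaded from /etc/modprobe.d on this system) would be:

# /etc/modprobe.d/zfs.conf
options zfs zfs_arc_max=86762369024
options zfs zfs_arc_meta_limit_percent=90
options zfs zfs_arc_dnode_limit_percent=50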

current issue

This remediated the issue temporarily, and the prune processes stopped, but we are back! :)

Stopped the NFS server, and memory does seem to be freeing, albeit very slowly, with many dp_sync_taskq threads busy... But removing the NFS server is far from ideal. Given that both live in the kernel, it's difficult for me personally to determine which one is eating the memory.
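
One way to watch the teardown from the slab side (a sketch; the cache names are those from the slabtop output above):

watch -n5 "grep -E 'dnode_t|zfs_znode_cache|nfsd4_(files|stateids)' /proc/slabinfo"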

top - 18:05:52 up 21 days, 16:07,  2 users,  load average: 2.52, 7.31, 35.56
Tasks: 807 total,   1 running, 806 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  1.5 sy,  0.0 ni, 95.9 id,  2.5 wa,  0.0 hi,  0.1 si,  0.0 st
MiB Mem : 128622.1 total,  17620.3 free, 108471.5 used,   2530.3 buff/cache
MiB Swap:   1907.0 total,   1764.7 free,    142.2 used.  19277.1 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 248929 root      20   0       0      0      0 S   2.3   0.0 349:34.92 txg_sync
 248864 root      39  19       0      0      0 S   1.7   0.0  33:52.18 dp_sync_taskq
1767630 root      20   0       0      0      0 D   1.7   0.0   3:35.82 kworker/23:2+events
 248854 root      39  19       0      0      0 S   1.3   0.0  33:48.50 dp_sync_taskq
 248855 root      39  19       0      0      0 S   1.3   0.0  33:51.96 dp_sync_taskq
 248856 root      39  19       0      0      0 S   1.3   0.0  33:49.72 dp_sync_taskq
 248857 root      39  19       0      0      0 S   1.3   0.0  33:49.15 dp_sync_taskq
 248859 root      39  19       0      0      0 S   1.3   0.0  33:54.62 dp_sync_taskq
 248860 root      39  19       0      0      0 S   1.3   0.0  33:50.09 dp_sync_taskq
 248862 root      39  19       0      0      0 S   1.3   0.0  33:49.23 dp_sync_taskq
 248865 root      39  19       0      0      0 S   1.3   0.0  33:50.26 dp_sync_taskq
 248867 root      39  19       0      0      0 S   1.3   0.0  33:48.57 dp_sync_taskq
 248870 root      39  19       0      0      0 S   1.3   0.0  33:51.20 dp_sync_taskq
 248871 root      39  19       0      0      0 S   1.3   0.0  33:48.09 dp_sync_taskq
 248872 root      39  19       0      0      0 S   1.3   0.0  33:51.06 dp_sync_taskq
 248873 root      39  19       0      0      0 S   1.3   0.0  33:46.16 dp_sync_taskq
# arc_summary

------------------------------------------------------------------------
ZFS Subsystem Report                            Wed Feb 09 16:23:14 2022
Linux 5.4.0-96-generic                                           2.1.2-1
Machine: r8-n9 (x86_64)                                          2.1.2-1

ARC status:                                                      HEALTHY
        Memory throttle count:                                         0

ARC size (current):                                    90.1 %   72.8 GiB
        Target size (adaptive):                         4.9 %    3.9 GiB
        Min size (hard limit):                          4.9 %    3.9 GiB
        Max size (high water):                           20:1   80.8 GiB
        Most Frequently Used (MFU) cache size:         13.8 %    3.0 GiB
        Most Recently Used (MRU) cache size:           86.2 %   18.7 GiB
        Metadata cache size (hard limit):              90.0 %   72.7 GiB
        Metadata cache size (current):                100.2 %   72.8 GiB
        Dnode cache size (hard limit):                 50.0 %   36.4 GiB
        Dnode cache size (current):                    80.9 %   29.4 GiB

ARC hash breakdown:
        Elements max:                                              15.7M
        Elements current:                              97.2 %      15.2M
        Collisions:                                                 1.1G
        Chain max:                                                    10
        Chains:                                                     3.9M

ARC misc:
        Deleted:                                                   28.5M
        Mutex misses:                                               6.6G
        Eviction skips:                                            93.2G
        Eviction skips due to L2 writes:                           10.3k
        L2 cached evictions:                                     1.4 TiB
        L2 eligible evictions:                                  73.1 GiB
        L2 eligible MFU evictions:                      4.8 %    3.5 GiB
        L2 eligible MRU evictions:                     95.2 %   69.6 GiB
        L2 ineligible evictions:                               707.3 GiB

ARC total accesses (hits + misses):                                11.3G
        Cache hit ratio:                               86.4 %       9.8G
        Cache miss ratio:                              13.6 %       1.5G
        Actual hit ratio (MFU + MRU hits):             86.3 %       9.8G
        Data demand efficiency:                         1.6 %       1.6G
        Data prefetch efficiency:                         n/a          0

Cache hits by cache type:
        Most frequently used (MFU):                    96.3 %       9.4G
        Most recently used (MRU):                       3.6 %     356.4M
        Most frequently used (MFU) ghost:             < 0.1 %       1.3M
        Most recently used (MRU) ghost:               < 0.1 %       1.0M

Cache hits by data type:
        Demand data:                                    0.3 %      25.4M
        Demand prefetch data:                           0.0 %          0
        Demand metadata:                               99.7 %       9.8G
        Demand prefetch metadata:                     < 0.1 %       2.4M

Cache misses by data type:
        Demand data:                                   99.3 %       1.5G
        Demand prefetch data:                           0.0 %          0
        Demand metadata:                                0.4 %       5.9M
        Demand prefetch metadata:                       0.3 %       4.8M

DMU prefetch efficiency:                                            2.7G
        Hit ratio:                                      5.7 %     152.9M
        Miss ratio:                                    94.3 %       2.5G

L2ARC status:                                                    HEALTHY
        Low memory aborts:                                         24.5k
        Free on write:                                               281
        R/W clashes:                                                   0
        Bad checksums:                                                 0
        I/O errors:                                                    0

L2ARC size (adaptive):                                           1.3 TiB
        Compressed:                                     5.6 %   74.6 GiB
        Header size:                                    0.1 %    1.3 GiB
        MFU allocated size:                            19.4 %   14.5 GiB
        MRU allocated size:                            85.0 %   63.4 GiB
        Prefetch allocated size:                        0.1 %   67.6 MiB
        Data (buffer content) allocated size:           0.0 %    0 Bytes
        Metadata (buffer content) allocated size:     104.5 %   77.9 GiB

L2ARC breakdown:                                                    1.5G
        Hit ratio:                                      0.2 %       3.8M
        Miss ratio:                                    99.8 %       1.5G
        Feeds:                                                      1.8M

L2ARC writes:
        Writes sent:                                    100 %       1.4M

L2ARC evicts:
        Lock retries:                                                  0
        Upon reading:                                                  0

Solaris Porting Layer (SPL):
        spl_hostid                                                     0
        spl_hostid_path                                      /etc/hostid
        spl_kmem_alloc_max                                       1048576
        spl_kmem_alloc_warn                                        65536
        spl_kmem_cache_kmem_threads                                    4
        spl_kmem_cache_magazine_size                                   0
        spl_kmem_cache_max_size                                       32
        spl_kmem_cache_obj_per_slab                                    8
        spl_kmem_cache_reclaim                                         0
        spl_kmem_cache_slab_limit                                  16384
        spl_max_show_tasks                                           512
        spl_panic_halt                                                 0
        spl_schedule_hrtimeout_slack_us                                0
        spl_taskq_kick                                                 0
        spl_taskq_thread_bind                                          0
        spl_taskq_thread_dynamic                                       1
        spl_taskq_thread_priority                                      1
        spl_taskq_thread_sequential                                    4

Tunables:
        dbuf_cache_hiwater_pct                                        10
        dbuf_cache_lowater_pct                                        10
        dbuf_cache_max_bytes                        18446744073709551615
        dbuf_cache_shift                                               5
        dbuf_metadata_cache_max_bytes               18446744073709551615
        dbuf_metadata_cache_shift                                      6
        dmu_object_alloc_chunk_shift                                   7
        dmu_prefetch_max                                       134217728
        ignore_hole_birth                                              1
        l2arc_feed_again                                               1
        l2arc_feed_min_ms                                            200
        l2arc_feed_secs                                                1
        l2arc_headroom                                                 2
        l2arc_headroom_boost                                         200
        l2arc_meta_percent                                            33
        l2arc_mfuonly                                                  0
        l2arc_noprefetch                                               1
        l2arc_norw                                                     0
        l2arc_rebuild_blocks_min_l2size                       1073741824
        l2arc_rebuild_enabled                                          1
        l2arc_trim_ahead                                               0
        l2arc_write_boost                                        8388608
        l2arc_write_max                                          8388608
        metaslab_aliquot                                          524288
        metaslab_bias_enabled                                          1
        metaslab_debug_load                                            0
        metaslab_debug_unload                                          0
        metaslab_df_max_search                                  16777216
        metaslab_df_use_largest_segment                                0
        metaslab_force_ganging                                  16777217
        metaslab_fragmentation_factor_enabled                          1
        metaslab_lba_weighting_enabled                                 1
        metaslab_preload_enabled                                       1
        metaslab_unload_delay                                         32
        metaslab_unload_delay_ms                                  600000
        send_holes_without_birth_time                                  1
        spa_asize_inflation                                           24
        spa_config_path                             /etc/zfs/zpool.cache
        spa_load_print_vdev_tree                                       0
        spa_load_verify_data                                           1
        spa_load_verify_metadata                                       1
        spa_load_verify_shift                                          4
        spa_slop_shift                                                 5
        vdev_file_logical_ashift                                       9
        vdev_file_physical_ashift                                      9
        vdev_removal_max_span                                      32768
        vdev_validate_skip                                             0
        zap_iterate_prefetch                                           1
        zfetch_array_rd_sz                                       1048576
        zfetch_max_distance                                      8388608
        zfetch_max_idistance                                    67108864
        zfetch_max_streams                                             8
        zfetch_min_sec_reap                                            2
        zfs_abd_scatter_enabled                                        1
        zfs_abd_scatter_max_order                                     10
        zfs_abd_scatter_min_size                                    1536
        zfs_admin_snapshot                                             0
        zfs_allow_redacted_dataset_mount                               0
        zfs_arc_average_blocksize                                   8192
        zfs_arc_dnode_limit                                            0
        zfs_arc_dnode_limit_percent                                   50
        zfs_arc_dnode_reduce_percent                                  10
        zfs_arc_evict_batch_limit                                     10
        zfs_arc_eviction_pct                                         200
        zfs_arc_grow_retry                                             0
        zfs_arc_lotsfree_percent                                      10
        zfs_arc_max                                          86762369024
        zfs_arc_meta_adjust_restarts                                4096
        zfs_arc_meta_limit                                             0
        zfs_arc_meta_limit_percent                                    90
        zfs_arc_meta_min                                               0
        zfs_arc_meta_prune                                         10000
        zfs_arc_meta_strategy                                          1
        zfs_arc_min                                                    0
        zfs_arc_min_prefetch_ms                                        0
        zfs_arc_min_prescient_prefetch_ms                              0
        zfs_arc_p_dampener_disable                                     1
        zfs_arc_p_min_shift                                            0
        zfs_arc_pc_percent                                             0
        zfs_arc_shrink_shift                                           0
        zfs_arc_shrinker_limit                                     10000
        zfs_arc_sys_free                                               0
        zfs_async_block_max_blocks                  18446744073709551615
        zfs_autoimport_disable                                         1
        zfs_checksum_events_per_second                                20
        zfs_commit_timeout_pct                                         5
        zfs_compressed_arc_enabled                                     1
        zfs_condense_indirect_commit_entry_delay_ms                    0
        zfs_condense_indirect_obsolete_pct                            25
        zfs_condense_indirect_vdevs_enable                             1
        zfs_condense_max_obsolete_bytes                       1073741824
        zfs_condense_min_mapping_bytes                            131072
        zfs_dbgmsg_enable                                              1
        zfs_dbgmsg_maxsize                                       4194304
        zfs_dbuf_state_index                                           0
        zfs_ddt_data_is_special                                        1
        zfs_deadman_checktime_ms                                   60000
        zfs_deadman_enabled                                            1
        zfs_deadman_failmode                                        wait
        zfs_deadman_synctime_ms                                   600000
        zfs_deadman_ziotime_ms                                    300000
        zfs_dedup_prefetch                                             0
        zfs_delay_min_dirty_percent                                   60
        zfs_delay_scale                                           500000
        zfs_delete_blocks                                          20480
        zfs_dirty_data_max                                    4294967296
        zfs_dirty_data_max_max                                4294967296
        zfs_dirty_data_max_max_percent                                25
        zfs_dirty_data_max_percent                                    10
        zfs_dirty_data_sync_percent                                   20
        zfs_disable_ivset_guid_check                                   0
        zfs_dmu_offset_next_sync                                       0
        zfs_embedded_slog_min_ms                                      64
        zfs_expire_snapshot                                          300
        zfs_fallocate_reserve_percent                                110
        zfs_flags                                                      0
        zfs_free_bpobj_enabled                                         1
        zfs_free_leak_on_eio                                           0
        zfs_free_min_time_ms                                        1000
        zfs_history_output_max                                   1048576
        zfs_immediate_write_sz                                     32768
        zfs_initialize_chunk_size                                1048576
        zfs_initialize_value                        16045690984833335022
        zfs_keep_log_spacemaps_at_export                               0
        zfs_key_max_salt_uses                                  400000000
        zfs_livelist_condense_new_alloc                                0
        zfs_livelist_condense_sync_cancel                              0
        zfs_livelist_condense_sync_pause                               0
        zfs_livelist_condense_zthr_cancel                              0
        zfs_livelist_condense_zthr_pause                               0
        zfs_livelist_max_entries                                  500000
        zfs_livelist_min_percent_shared                               75
        zfs_lua_max_instrlimit                                 100000000
        zfs_lua_max_memlimit                                   104857600
        zfs_max_async_dedup_frees                                 100000
        zfs_max_log_walking                                            5
        zfs_max_logsm_summary_length                                  10
        zfs_max_missing_tvds                                           0
        zfs_max_nvlist_src_size                                        0
        zfs_max_recordsize                                       1048576
        zfs_metaslab_find_max_tries                                  100
        zfs_metaslab_fragmentation_threshold                          70
        zfs_metaslab_max_size_cache_sec                             3600
        zfs_metaslab_mem_limit                                        25
        zfs_metaslab_segment_weight_enabled                            1
        zfs_metaslab_switch_threshold                                  2
        zfs_metaslab_try_hard_before_gang                              0
        zfs_mg_fragmentation_threshold                                95
        zfs_mg_noalloc_threshold                                       0
        zfs_min_metaslabs_to_flush                                     1
        zfs_multihost_fail_intervals                                  10
        zfs_multihost_history                                          0
        zfs_multihost_import_intervals                                20
        zfs_multihost_interval                                      1000
        zfs_multilist_num_sublists                                     0
        zfs_no_scrub_io                                                0
        zfs_no_scrub_prefetch                                          0
        zfs_nocacheflush                                               0
        zfs_nopwrite_enabled                                           1
        zfs_object_mutex_size                                         64
        zfs_obsolete_min_time_ms                                     500
        zfs_override_estimate_recordsize                               0
        zfs_pd_bytes_max                                        52428800
        zfs_per_txg_dirty_frees_percent                                5
        zfs_prefetch_disable                                           0
        zfs_read_history                                               0
        zfs_read_history_hits                                          0
        zfs_rebuild_max_segment                                  1048576
        zfs_rebuild_scrub_enabled                                      1
        zfs_rebuild_vdev_limit                                  33554432
        zfs_reconstruct_indirect_combinations_max                   4096
        zfs_recover                                                    0
        zfs_recv_queue_ff                                             20
        zfs_recv_queue_length                                   16777216
        zfs_recv_write_batch_size                                1048576
        zfs_removal_ignore_errors                                      0
        zfs_removal_suspend_progress                                   0
        zfs_remove_max_segment                                  16777216
        zfs_resilver_disable_defer                                     0
        zfs_resilver_min_time_ms                                    3000
        zfs_scan_checkpoint_intval                                  7200
        zfs_scan_fill_weight                                           3
        zfs_scan_ignore_errors                                         0
        zfs_scan_issue_strategy                                        0
        zfs_scan_legacy                                                0
        zfs_scan_max_ext_gap                                     2097152
        zfs_scan_mem_lim_fact                                         20
        zfs_scan_mem_lim_soft_fact                                    20
        zfs_scan_strict_mem_lim                                        0
        zfs_scan_suspend_progress                                      0
        zfs_scan_vdev_limit                                      4194304
        zfs_scrub_min_time_ms                                       1000
        zfs_send_corrupt_data                                          0
        zfs_send_no_prefetch_queue_ff                                 20
        zfs_send_no_prefetch_queue_length                        1048576
        zfs_send_queue_ff                                             20
        zfs_send_queue_length                                   16777216
        zfs_send_unmodified_spill_blocks                               1
        zfs_slow_io_events_per_second                                 20
        zfs_spa_discard_memory_limit                            16777216
        zfs_special_class_metadata_reserve_pct                        25
        zfs_sync_pass_deferred_free                                    2
        zfs_sync_pass_dont_compress                                    8
        zfs_sync_pass_rewrite                                          2
        zfs_sync_taskq_batch_pct                                      75
        zfs_traverse_indirect_prefetch_limit                          32
        zfs_trim_extent_bytes_max                              134217728
        zfs_trim_extent_bytes_min                                  32768
        zfs_trim_metaslab_skip                                         0
        zfs_trim_queue_limit                                          10
        zfs_trim_txg_batch                                            32
        zfs_txg_history                                              100
        zfs_txg_timeout                                                5
        zfs_unflushed_log_block_max                               262144
        zfs_unflushed_log_block_min                                 1000
        zfs_unflushed_log_block_pct                                  400
        zfs_unflushed_max_mem_amt                             1073741824
        zfs_unflushed_max_mem_ppm                                   1000
        zfs_unlink_suspend_progress                                    0
        zfs_user_indirect_is_special                                   1
        zfs_vdev_aggregate_trim                                        0
        zfs_vdev_aggregation_limit                               1048576
        zfs_vdev_aggregation_limit_non_rotating                   131072
        zfs_vdev_async_read_max_active                                 3
        zfs_vdev_async_read_min_active                                 1
        zfs_vdev_async_write_active_max_dirty_percent                 60
        zfs_vdev_async_write_active_min_dirty_percent                 30
        zfs_vdev_async_write_max_active                               10
        zfs_vdev_async_write_min_active                                2
        zfs_vdev_cache_bshift                                         16
        zfs_vdev_cache_max                                         16384
        zfs_vdev_cache_size                                            0
        zfs_vdev_default_ms_count                                    200
        zfs_vdev_default_ms_shift                                     29
        zfs_vdev_initializing_max_active                               1
        zfs_vdev_initializing_min_active                               1
        zfs_vdev_max_active                                         1000
        zfs_vdev_max_auto_ashift                                      16
        zfs_vdev_min_auto_ashift                                       9
        zfs_vdev_min_ms_count                                         16
        zfs_vdev_mirror_non_rotating_inc                               0
        zfs_vdev_mirror_non_rotating_seek_inc                          1
        zfs_vdev_mirror_rotating_inc                                   0
        zfs_vdev_mirror_rotating_seek_inc                              5
        zfs_vdev_mirror_rotating_seek_offset                     1048576
        zfs_vdev_ms_count_limit                                   131072
        zfs_vdev_nia_credit                                            5
        zfs_vdev_nia_delay                                             5
        zfs_vdev_queue_depth_pct                                    1000
        zfs_vdev_raidz_impl cycle [fastest] original scalar sse2 ssse3 avx2 avx512f avx512bw
        zfs_vdev_read_gap_limit                                    32768
        zfs_vdev_rebuild_max_active                                    3
        zfs_vdev_rebuild_min_active                                    1
        zfs_vdev_removal_max_active                                    2
        zfs_vdev_removal_min_active                                    1
        zfs_vdev_scheduler                                        unused
        zfs_vdev_scrub_max_active                                      3
        zfs_vdev_scrub_min_active                                      1
        zfs_vdev_sync_read_max_active                                 10
        zfs_vdev_sync_read_min_active                                 10
        zfs_vdev_sync_write_max_active                                10
        zfs_vdev_sync_write_min_active                                10
        zfs_vdev_trim_max_active                                       2
        zfs_vdev_trim_min_active                                       1
        zfs_vdev_write_gap_limit                                    4096
        zfs_vnops_read_chunk_size                                1048576
        zfs_zevent_len_max                                           512
        zfs_zevent_retain_expire_secs                                900
        zfs_zevent_retain_max                                       2000
        zfs_zil_clean_taskq_maxalloc                             1048576
        zfs_zil_clean_taskq_minalloc                                1024
        zfs_zil_clean_taskq_nthr_pct                                 100
        zil_maxblocksize                                          131072
        zil_nocacheflush                                               0
        zil_replay_disable                                             0
        zil_slog_bulk                                             786432
        zio_deadman_log_all                                            0
        zio_dva_throttle_enabled                                       1
        zio_requeue_io_start_cut_in_line                               1
        zio_slow_io_ms                                             30000
        zio_taskq_batch_pct                                           80
        zio_taskq_batch_tpq                                            0
        zvol_inhibit_dev                                               0
        zvol_major                                                   230
        zvol_max_discard_blocks                                    16384
        zvol_prefetch_bytes                                       131072
        zvol_request_sync                                              0
        zvol_threads                                                  32
        zvol_volmode                                                   1

VDEV cache disabled, skipping section

ZIL committed transactions:                                         3.2G
        Commit requests:                                          404.3M
        Flushes to stable storage:                                369.8M
        Transactions to SLOG storage pool:          167.6 TiB       1.6G
        Transactions to non-SLOG storage pool:        0 Bytes          0

zed logs

-- Logs begin at Wed 2022-01-12 17:47:33 UTC, end at Wed 2022-02-09 17:58:06 UTC. --
Jan 19 02:15:35 r8-n9 systemd[1]: Started ZFS Event Daemon (zed).
Jan 19 02:15:35 r8-n9 zed[246555]: ZFS Event Daemon 2.1.2-1 (PID 246555)
Jan 19 02:15:35 r8-n9 zed[246555]: Processing events since eid=0
Jan 19 02:16:16 r8-n9 zed[249150]: eid=38 class=config_sync pool='pod-10'
Jan 19 02:25:41 r8-n9 zed[256421]: eid=45 class=vdev_add pool='pod-10'
Feb 09 14:46:20 r8-n9 zed[1838695]: eid=58 class=delay pool='pod-10' vdev=35000c500ae95cc5b size=4096 offset=4740717096960 priority=0 err=0 flags=0x180980 delay=30049ms bookmark=269:0:0:1036122
Feb 09 14:46:20 r8-n9 zed[1838693]: eid=57 class=delay pool='pod-10' vdev=35000c500ae970e3b size=20480 offset=4785921613824 priority=0 err=0 flags=0x40080c80 delay=30174ms
Feb 09 14:46:20 r8-n9 zed[1838700]: eid=60 class=delay pool='pod-10' vdev=35000c500ae34d4c7 size=4096 offset=4731204771840 priority=0 err=0 flags=0x180980 delay=30049ms bookmark=269:0:0:1059557
Feb 09 14:46:20 r8-n9 zed[1838703]: eid=62 class=delay pool='pod-10' vdev=35000c500ae96a4a3 size=4096 offset=4785771388928 priority=0 err=0 flags=0x180980 delay=30049ms bookmark=269:0:0:1033656
Feb 09 14:46:21 r8-n9 zed[1838803]: eid=70 class=delay pool='pod-10' vdev=35000c500ae95d66f size=4096 offset=4774821298176 priority=0 err=0 flags=0x180980 delay=30547ms bookmark=269:0:0:1041464
Feb 09 14:46:21 r8-n9 zed[1838814]: eid=76 class=delay pool='pod-10' vdev=35000c500ae62c433 size=4096 offset=4721981607936 priority=0 err=0 flags=0x180980 delay=30159ms bookmark=269:0:0:1044753
Feb 09 14:46:22 r8-n9 zed[1838836]: eid=77 class=delay pool='pod-10' vdev=35000c500ae29d97f size=4096 offset=4781092179968 priority=0 err=0 flags=0x180980 delay=30374ms bookmark=269:0:0:1053826
Feb 09 14:46:22 r8-n9 zed[1838842]: eid=78 class=delay pool='pod-10' vdev=35000c500ae29a4bb size=4096 offset=4776777699328 priority=0 err=0 flags=0x180980 delay=30374ms bookmark=269:0:0:1031840
Feb 09 14:46:22 r8-n9 zed[1838846]: eid=82 class=delay pool='pod-10' vdev=35000c500ae96aa6f size=4096 offset=4786110545920 priority=0 err=0 flags=0x180980 delay=30374ms bookmark=269:0:0:1033105
Feb 09 14:46:22 r8-n9 zed[1838875]: eid=84 class=delay pool='pod-10' vdev=35000c500ae95555b size=53248 offset=4774958784512 priority=0 err=0 flags=0x40080c80 delay=30460ms
Feb 09 14:46:22 r8-n9 zed[1838878]: eid=83 class=delay pool='pod-10' vdev=35000c500ae957c3b size=4096 offset=4743807332352 priority=0 err=0 flags=0x180980 delay=30668ms bookmark=269:0:0:1054047
Feb 09 14:46:22 r8-n9 zed[1838884]: eid=87 class=delay pool='pod-10' vdev=35000c500ae96d793 size=4096 offset=4768750641152 priority=0 err=0 flags=0x180980 delay=30444ms bookmark=269:0:0:1042517
Feb 09 14:46:22 r8-n9 zed[1838935]: eid=89 class=delay pool='pod-10' vdev=35000c500ae972b2f size=4096 offset=4769971810304 priority=0 err=0 flags=0x180980 delay=30623ms bookmark=269:0:0:1033391
Feb 09 14:46:22 r8-n9 zed[1838942]: eid=91 class=delay pool='pod-10' vdev=35000c500ae9737c3 size=4096 offset=4728066199552 priority=0 err=0 flags=0x180980 delay=30623ms bookmark=269:0:0:1057873
Feb 09 14:46:22 r8-n9 zed[1838963]: eid=94 class=delay pool='pod-10' vdev=35000c500ae95d53b size=4096 offset=4712411389952 priority=0 err=0 flags=0x180980 delay=30831ms bookmark=269:0:0:1049636
Feb 09 14:46:22 r8-n9 zed[1838967]: eid=95 class=delay pool='pod-10' vdev=35000c500ae970cd7 size=4096 offset=4776225902592 priority=0 err=0 flags=0x180980 delay=30626ms bookmark=269:0:0:1046985
Feb 09 14:46:22 r8-n9 zed[1838972]: eid=97 class=delay pool='pod-10' vdev=35000c500ae2a6e9b size=4096 offset=4782662496256 priority=0 err=0 flags=0x180980 delay=30831ms bookmark=269:0:0:1048773
Feb 09 14:46:22 r8-n9 zed[1838969]: eid=96 class=delay pool='pod-10' vdev=35000c500ae955eef size=20480 offset=4774534471680 priority=0 err=0 flags=0x40080c80 delay=30626ms

vmstat

# cat /proc/vmstat
nr_free_pages 497162
nr_zone_inactive_anon 77677
nr_zone_active_anon 81958
nr_zone_inactive_file 16392
nr_zone_active_file 10007
nr_zone_unevictable 7960
nr_zone_write_pending 844
nr_mlock 7960
nr_page_table_pages 1963
nr_kernel_stack 19296
nr_bounce 0
nr_zspages 0
nr_free_cma 0
numa_hit 151121274140
numa_miss 652564522
numa_foreign 652564522
numa_interleave 96982
numa_local 133392770941
numa_other 18381067721
nr_inactive_anon 77677
nr_active_anon 81958
nr_inactive_file 16392
nr_active_file 10007
nr_unevictable 7960
nr_slab_reclaimable 1104575
nr_slab_unreclaimable 28741700
nr_isolated_anon 0
nr_isolated_file 0
workingset_nodes 1973
workingset_refault 78664
workingset_activate 26788
workingset_restore 14173
workingset_nodereclaim 816
nr_anon_pages 162434
nr_mapped 20118
nr_file_pages 32246
nr_dirty 844
nr_writeback 0
nr_writeback_temp 0
nr_shmem 369
nr_shmem_hugepages 0
nr_shmem_pmdmapped 0
nr_file_hugepages 0
nr_file_pmdmapped 0
nr_anon_transparent_hugepages 0
nr_unstable 0
nr_vmscan_write 68263
nr_vmscan_immediate_reclaim 641
nr_dirtied 3379059
nr_written 3113482
nr_kernel_misc_reclaimable 0
nr_dirty_threshold 73743
nr_dirty_background_threshold 36826
pgpgin 194613865828
pgpgout 594833359918
pswpin 4218
pswpout 68258
pgalloc_dma 0
pgalloc_dma32 324399540
pgalloc_normal 245349772461
pgalloc_movable 0
allocstall_dma 0
allocstall_dma32 0
allocstall_normal 1
allocstall_movable 2
pgskip_dma 0
pgskip_dma32 0
pgskip_normal 0
pgskip_movable 0
pgfree 245674776191
pgactivate 730915
pgdeactivate 358591
pglazyfree 6389
pgfault 243384077
pgmajfault 15967
pglazyfreed 0
pgrefill 382033
pgsteal_kswapd 292238
pgsteal_direct 1375
pgscan_kswapd 477828
pgscan_direct 1399
pgscan_direct_throttle 0
zone_reclaim_failed 0
pginodesteal 74
slabs_scanned 459729476
kswapd_inodesteal 26293
kswapd_low_wmark_hit_quickly 1
kswapd_high_wmark_hit_quickly 14
pageoutrun 277
pgrotated 69682
drop_pagecache 1
drop_slab 1
oom_kill 0
numa_pte_updates 4332995
numa_huge_pte_updates 12
numa_hint_faults 3863526
numa_hint_faults_local 3281256
numa_pages_migrated 327088
pgmigrate_success 411548
pgmigrate_fail 3419
compact_migrate_scanned 10896483
compact_free_scanned 460891
compact_isolated 170144
compact_stall 4
compact_fail 4
compact_success 0
compact_daemon_wake 255
compact_daemon_migrate_scanned 1507268
compact_daemon_free_scanned 317747
htlb_buddy_alloc_success 0
htlb_buddy_alloc_fail 0
unevictable_pgs_culled 81639
unevictable_pgs_scanned 0
unevictable_pgs_rescued 14620
unevictable_pgs_mlocked 26080
unevictable_pgs_munlocked 16730
unevictable_pgs_cleared 1390
unevictable_pgs_stranded 1390
thp_fault_alloc 10
thp_fault_fallback 0
thp_collapse_alloc 8
thp_collapse_alloc_failed 2
thp_file_alloc 0
thp_file_mapped 0
thp_split_page 0
thp_split_page_failed 0
thp_deferred_split_page 18
thp_split_pmd 9
thp_split_pud 0
thp_zero_page_alloc 0
thp_zero_page_alloc_failed 0
thp_swpout 0
thp_swpout_fallback 0
balloon_inflate 0
balloon_deflate 0
balloon_migrate 0
swap_ra 1519
swap_ra_hit 833

buddyinfo

Node: 0
 Zone: DMA
 Free KiB in zone: 15876.00
    Fragment size        Free fragments       Total available KiB
    4096                 1                    4.0
    8192                 0                    0.0
    16384                0                    0.0
    32768                0                    0.0
    65536                2                    128.0
    131072               1                    128.0
    262144               1                    256.0
    524288               0                    0.0
    1048576              1                    1024.0
    2097152              1                    2048.0
    4194304              3                    12288.0
 Zone: DMA32
 Free KiB in zone: 252500.00
    Fragment size        Free fragments       Total available KiB
    4096                 1923                 7692.0
    8192                 5143                 41144.0
    16384                2581                 41296.0
    32768                142                  4544.0
    65536                26                   1664.0
    131072               16                   2048.0
    262144               78                   19968.0
    524288               70                   35840.0
    1048576              36                   36864.0
    2097152              22                   45056.0
    4194304              4                    16384.0
 Zone: Normal
 Free KiB in zone: 107160.00
    Fragment size        Free fragments       Total available KiB
    4096                 1586                 6344.0
    8192                 8328                 66624.0
    16384                1987                 31792.0
    32768                27                   864.0
    65536                24                   1536.0
    131072               0                    0.0
    262144               0                    0.0
    524288               0                    0.0
    1048576              0                    0.0
    2097152              0                    0.0
    4194304              0                    0.0
Node: 1
 Zone: Normal
 Free KiB in zone: 2152516.00
    Fragment size        Free fragments       Total available KiB
    4096                 149157               596628.0
    8192                 107244               857952.0
    16384                3545                 56720.0
    32768                4110                 131520.0
    65536                3156                 201984.0
    131072               1610                 206080.0
    262144               297                  76032.0
    524288               48                   24576.0
    1048576              1                    1024.0
    2097152              0                    0.0
    4194304              0                    0.0

Describe how to reproduce the problem

Uncertain, but it has occurred on two separate servers, so it is likely to happen again.

Include any warning/errors/backtraces from the system logs

Related to: #6223, #9966, #3157, #7559

remingtonc commented 2 years ago

(related tickets moved to description)

remingtonc commented 2 years ago
[screenshot: sawtooth memory-usage trend on the first affected server, 2022-02-09 11:40 AM]

More graphs: this shows the sawtooth trend on the first server this occurred on. The second server (the one detailed in this issue) demonstrated a very similar memory pattern.

This second graph shows the current server, where the ARC size was increased.

[screenshot: memory usage on the current server after the ARC increase, 2022-02-09 11:42 AM]
remingtonc commented 2 years ago

Issue happened again. At least it is repeatable! Stopping the NFS server results in the ARC shrinking, so this seems like contention somewhere. The write load is much higher than the read load on these machines; is the ARC used for anything other than reads? If NFS is somehow pinning ZFS data in the ARC, it would be great to understand how that interaction works and why it would flatline the ability to write data... and even to read! Can the L2ARC create a stall condition somehow as well?
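
For the L2ARC question, the feed and low-memory abort behavior is at least observable from the raw kstats; a hedged check (assuming the 2.1.x l2_* field names):

grep -E '^l2_(feeds|abort_lowmem|rw_clash|writes_sent)' /proc/spl/kstat/zfs/arcstats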

atonkyra commented 2 years ago

I've seen this happen too when I drop caches on a loaded NFS server. The only way to recover seems to be a stop/start of nfs-kernel-server.
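
"Dropping caches" here presumably refers to the stock kernel knob, which reclaims pagecache and slab objects:

echo 3 > /proc/sys/vm/drop_caches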

I wonder if #13231 is related and if increasing zfs_arc_prune_task_threads would help...?
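
If that parameter is present in the module build, bumping it is a one-liner (the value here is purely illustrative):

echo 4 > /sys/module/zfs/parameters/zfs_arc_prune_task_threads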

obrienmd commented 2 years ago

I have the same issue, though I'm not using an NFS server. My basic setup/load (on zvols + XFS) is as follows:

zpool create tank -f -o ashift=12 -o autotrim=off -O relatime=on -O xattr=sa -O dnodesize=auto -O compression=lz4 [56 mirrored pairs of 16TB SAS HDDs] spare [8 spare 16TB SAS HDDs] special [3-way mirror of enterprise NVMe SSDs] log [3-way mirror of Optane NVMe SSDs]
zfs create tank/test -o recordsize=64K
# sparse (-s) 500G zvol with volblocksize matching the dataset recordsize
zfs create tank/test/vol -o volblocksize=64K -V 500G -s
# -s size=4k sets the sector size; su/sw align the stripe unit with the 64K volblocksize
mkfs.xfs -m crc=1,reflink=1 -f -K -d su=64k,sw=1 -s size=4k /dev/zvol/tank/test/vol
mount /dev/zvol/tank/test/vol /mnt
cd /mnt
# mixed random-read block sizes; the size/weight list is fio's --bssplit syntax
fio --runtime=3600 --direct=1 --bssplit=12k/5:48k/50:132k/20:216k/25 --ioengine=libaio --time_based --name=test --iodepth=96 --rw=randread

I'm on Ubuntu 22.04 with OpenZFS 2.1.2. I've tried to bump zfs_arc_prune_task_threads to 2, 4, and 8, with no improvement.

Super happy to run test cases / debugging if anyone has suggestions!

obrienmd commented 2 years ago

Thought this was kind of interesting: it's possible that I'm misunderstanding arc_summary output, but when I get the arc_evict 100% CPU condition, the numbers don't really add up. Example follows: the ARC is at 65.1 GiB, but MFU + MRU + metadata + dnode = ~12 GiB.

ARC size (current):                                   101.7 %   65.1 GiB
        Target size (adaptive):                       100.0 %   64.0 GiB
        Min size (hard limit):                         12.5 %    8.0 GiB
        Max size (high water):                            8:1   64.0 GiB
        Most Frequently Used (MFU) cache size:         95.7 %    8.9 GiB
        Most Recently Used (MRU) cache size:            4.3 %  407.7 MiB
        Metadata cache size (hard limit):              75.0 %   48.0 GiB
        Metadata cache size (current):                  6.3 %    3.0 GiB
        Dnode cache size (hard limit):                 10.0 %    4.8 GiB
        Dnode cache size (current):                   < 0.1 %  718.8 KiB
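
A way to cross-check the summary against the raw counters it is derived from (a sketch, assuming the 2.1.x arcstats field names):

awk '$1 ~ /^(size|mfu_size|mru_size|arc_meta_used|dnode_size)$/ {printf "%-14s %8.1f GiB\n", $1, $3/2^30}' /proc/spl/kstat/zfs/arcstats
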
stale[bot] commented 1 year ago

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.