Closed jtblck90 closed 2 weeks ago
Likely duplicate #6783 and #16037.
Can you post /proc/spl/kmem/slab
from before and after the OOM event? Doesn't need to be exact, but I'd like to see what happens as more files are deleted, into the kernel attempting to reclaim memory, before finally giving up and killing things.
@robn yeah, saw those issues but decided to post on a fresh rc2. I will post /proc/spl/kmem/slab shortly.
@robn please find the /proc/spl/kmem/slab
output below. Also captured slab before starting files removal.
slab before starting files removal:
--------------------- cache ------------------------------------------------------- ----- slab ------ ---- object ----- --- emergency ---
name flags size alloc slabsize objsize total alloc max total alloc max dlock alloc max
spl_zlib_workspace_cache 0x200080 0 0 2145216 268104 0 0 0 0 0 0 0 0 0
kcf_context_cache 0x00100 - 0 - 40 - - - - 0 - - - -
zfs_btree_leaf_cache 0x00100 - 10604544 - 4096 - - - - 2589 - - - -
metaslab_alloc_trace_cache 0x00100 - 0 - 72 - - - - 0 - - - -
brt_entry_cache 0x00100 - 0 - 40 - - - - 0 - - - -
brt_pending_entry_cache 0x00100 - 0 - 160 - - - - 0 - - - -
ddt_cache 0x00080 468992 234112 234496 29264 2 1 1 16 8 8 0 0 0
ddt_entry_flat_cache 0x00100 - 0 - 240 - - - - 0 - - - -
ddt_entry_trad_cache 0x00100 - 0 - 424 - - - - 0 - - - -
ddt_log_entry_flat_cache 0x00100 - 15686543808 - 144 - - - - 108934332 - - - -
ddt_log_entry_trad_cache 0x00100 - 0 - 328 - - - - 0 - - - -
zio_cache 0x00100 - 43776 - 1216 - - - - 36 - - - -
zio_link_cache 0x00100 - 0 - 48 - - - - 0 - - - -
zio_buf_comb_512 0x200102 - 8704 - 512 - - - - 17 - - - -
zio_buf_comb_1024 0x200102 - 0 - 1024 - - - - 0 - - - -
zio_buf_comb_1536 0x00102 - 1536 - 1536 - - - - 1 - - - -
zio_buf_comb_2048 0x00102 - 2048 - 2048 - - - - 1 - - - -
zio_buf_comb_3072 0x00102 - 3072 - 3072 - - - - 1 - - - -
zio_buf_comb_4096 0x00102 - 8192 - 4096 - - - - 2 - - - -
zio_buf_comb_6144 0x00102 - 0 - 6144 - - - - 0 - - - -
zio_buf_comb_8192 0x00102 - 0 - 8192 - - - - 0 - - - -
zio_buf_comb_12288 0x00102 - 0 - 12288 - - - - 0 - - - -
zio_buf_comb_16384 0x00102 - 1441792 - 16384 - - - - 88 - - - -
zio_buf_comb_24576 0x00082 11440128 9437184 233472 24576 49 48 48 392 384 384 0 0 0
zio_buf_comb_32768 0x00082 1278558208 1004273664 299008 32768 4276 4276 4701 34208 30648 37608 0 0 0
zio_buf_comb_49152 0x00082 25374720 22806528 430080 49152 59 58 58 472 464 464 0 0 0
zio_buf_comb_65536 0x00082 18518016 16777216 561152 65536 33 32 32 264 256 256 0 0 0
zio_buf_comb_98304 0x00082 0 0 823296 98304 0 0 0 0 0 0 0 0 0
zio_buf_comb_131072 0x00082 688168960 577765376 1085440 131072 634 634 648 5072 4408 5184 0 0 0
zio_buf_comb_196608 0x00082 0 0 1609728 196608 0 0 0 0 0 0 0 0 0
zio_buf_comb_262144 0x00082 0 0 2134016 262144 0 0 0 0 0 0 0 0 0
zio_buf_comb_393216 0x00082 0 0 3182592 393216 0 0 0 0 0 0 0 0 0
zio_buf_comb_524288 0x00082 0 0 4231168 524288 0 0 0 0 0 0 0 0 0
zio_buf_comb_786432 0x00082 0 0 6328320 786432 0 0 0 0 0 0 0 0 0
zio_buf_comb_1048576 0x00082 8425472 2097152 8425472 1048576 1 1 1 8 2 2 0 0 0
zio_buf_comb_1572864 0x00082 0 0 12619776 1572864 0 0 0 0 0 0 0 0 0
zio_buf_comb_2097152 0x00082 0 0 16814080 2097152 0 0 0 0 0 0 0 0 0
zio_buf_comb_3145728 0x00082 0 0 25202688 3145728 0 0 0 0 0 0 0 0 0
zio_buf_comb_4194304 0x00082 0 0 29392896 4194304 0 0 0 0 0 0 0 0 0
zio_buf_comb_6291456 0x00082 0 0 31481856 6291456 0 0 0 0 0 0 0 0 0
zio_buf_comb_8388608 0x00082 0 0 25182208 8388608 0 0 0 0 0 0 0 0 0
zio_buf_comb_12582912 0x00082 0 0 25178112 12582912 0 0 0 0 0 0 0 0 0
zio_buf_comb_16777216 0x00082 0 0 16785408 16777216 0 0 0 0 0 0 0 0 0
lz4_cache 0x200100 - 0 - 16384 - - - - 0 - - - -
abd_t 0x200100 - 201563520 - 96 - - - - 2099620 - - - -
sa_cache 0x200100 - 1280 - 256 - - - - 5 - - - -
dnode_t 0x200100 - 2197216 - 952 - - - - 2308 - - - -
arc_buf_hdr_t_full 0x200100 - 862455360 - 240 - - - - 3593564 - - - -
arc_buf_hdr_t_l2only 0x00100 - 0 - 96 - - - - 0 - - - -
arc_buf_t 0x00100 - 1083168 - 32 - - - - 33849 - - - -
dmu_buf_impl_t 0x00100 - 13400832 - 384 - - - - 34898 - - - -
zil_lwb_cache 0x00100 - 0 - 392 - - - - 0 - - - -
zil_zcw_cache 0x00100 - 0 - 152 - - - - 0 - - - -
sio_cache_0 0x00100 - 0 - 136 - - - - 0 - - - -
sio_cache_1 0x00100 - 0 - 152 - - - - 0 - - - -
sio_cache_2 0x00100 - 0 - 168 - - - - 0 - - - -
zap_name 0x00100 - 0 - 328 - - - - 0 - - - -
zap_attr_cache 0x00100 - 0 - 288 - - - - 0 - - - -
zap_name_long 0x00100 - 0 - 1096 - - - - 0 - - - -
zap_attr_long_cache 0x00100 - 0 - 1056 - - - - 0 - - - -
zfs_znode_cache 0x200100 - 6816 - 1136 - - - - 6 - - - -
zfs_znode_hold_cache 0x00100 - 0 - 88 - - - - 0 - - - -
slab prior to OOM event:
--------------------- cache ------------------------------------------------------- ----- slab ------ ---- object ----- --- emergency ---
name flags size alloc slabsize objsize total alloc max total alloc max dlock alloc max
spl_zlib_workspace_cache 0x200080 0 0 2145216 268104 0 0 0 0 0 0 0 0 0
kcf_context_cache 0x00100 - 0 - 40 - - - - 0 - - - -
zfs_btree_leaf_cache 0x00100 - 10989568 - 4096 - - - - 2683 - - - -
metaslab_alloc_trace_cache 0x00100 - 0 - 72 - - - - 0 - - - -
brt_entry_cache 0x00100 - 0 - 40 - - - - 0 - - - -
brt_pending_entry_cache 0x00100 - 0 - 160 - - - - 0 - - - -
ddt_cache 0x00080 468992 234112 234496 29264 2 1 1 16 8 8 0 0 0
ddt_entry_flat_cache 0x00100 - 269170800 - 240 - - - - 1121545 - - - -
ddt_entry_trad_cache 0x00100 - 0 - 424 - - - - 0 - - - -
ddt_log_entry_flat_cache 0x00100 - 15814759248 - 144 - - - - 109824717 - - - -
ddt_log_entry_trad_cache 0x00100 - 0 - 328 - - - - 0 - - - -
zio_cache 0x00100 - 11068225344 - 1216 - - - - 9102159 - - - -
zio_link_cache 0x00100 - 436901520 - 48 - - - - 9102115 - - - -
zio_buf_comb_512 0x200102 - 9216 - 512 - - - - 18 - - - -
zio_buf_comb_1024 0x200102 - 0 - 1024 - - - - 0 - - - -
zio_buf_comb_1536 0x00102 - 0 - 1536 - - - - 0 - - - -
zio_buf_comb_2048 0x00102 - 2048 - 2048 - - - - 1 - - - -
zio_buf_comb_3072 0x00102 - 3072 - 3072 - - - - 1 - - - -
zio_buf_comb_4096 0x00102 - 28672 - 4096 - - - - 7 - - - -
zio_buf_comb_6144 0x00102 - 0 - 6144 - - - - 0 - - - -
zio_buf_comb_8192 0x00102 - 0 - 8192 - - - - 0 - - - -
zio_buf_comb_12288 0x00102 - 12288 - 12288 - - - - 1 - - - -
zio_buf_comb_16384 0x00102 - 1228800 - 16384 - - - - 75 - - - -
zio_buf_comb_24576 0x00082 11206656 9240576 233472 24576 48 47 48 384 376 384 0 0 0
zio_buf_comb_32768 0x00082 1746505728 1457258496 299008 32768 5841 5841 7976 46728 44472 63808 0 0 0
zio_buf_comb_49152 0x00082 24944640 19759104 430080 49152 58 58 58 464 402 464 0 0 0
zio_buf_comb_65536 0x00082 16273408 15204352 561152 65536 29 29 32 232 232 256 0 0 0
zio_buf_comb_98304 0x00082 0 0 823296 98304 0 0 0 0 0 0 0 0 0
zio_buf_comb_131072 0x00082 3728486400 3198025728 1085440 131072 3435 3435 4975 27480 24399 39800 0 0 0
zio_buf_comb_196608 0x00082 0 0 1609728 196608 0 0 0 0 0 0 0 0 0
zio_buf_comb_262144 0x00082 0 0 2134016 262144 0 0 0 0 0 0 0 0 0
zio_buf_comb_393216 0x00082 0 0 3182592 393216 0 0 0 0 0 0 0 0 0
zio_buf_comb_524288 0x00082 0 0 4231168 524288 0 0 0 0 0 0 0 0 0
zio_buf_comb_786432 0x00082 0 0 6328320 786432 0 0 0 0 0 0 0 0 0
zio_buf_comb_1048576 0x00082 8425472 2097152 8425472 1048576 1 1 1 8 2 2 0 0 0
zio_buf_comb_1572864 0x00082 0 0 12619776 1572864 0 0 0 0 0 0 0 0 0
zio_buf_comb_2097152 0x00082 0 0 16814080 2097152 0 0 0 0 0 0 0 0 0
zio_buf_comb_3145728 0x00082 0 0 25202688 3145728 0 0 0 0 0 0 0 0 0
zio_buf_comb_4194304 0x00082 0 0 29392896 4194304 0 0 0 0 0 0 0 0 0
zio_buf_comb_6291456 0x00082 0 0 31481856 6291456 0 0 0 0 0 0 0 0 0
zio_buf_comb_8388608 0x00082 0 0 25182208 8388608 0 0 0 0 0 0 0 0 0
zio_buf_comb_12582912 0x00082 0 0 25178112 12582912 0 0 0 0 0 0 0 0 0
zio_buf_comb_16777216 0x00082 0 0 16785408 16777216 0 0 0 0 0 0 0 0 0
lz4_cache 0x200100 - 0 - 16384 - - - - 0 - - - -
abd_t 0x200100 - 170297952 - 96 - - - - 1773937 - - - -
sa_cache 0x200100 - 1280 - 256 - - - - 5 - - - -
dnode_t 0x200100 - 2177224 - 952 - - - - 2287 - - - -
arc_buf_hdr_t_full 0x200100 - 718020720 - 240 - - - - 2991753 - - - -
arc_buf_hdr_t_l2only 0x00100 - 0 - 96 - - - - 0 - - - -
arc_buf_t 0x00100 - 2178048 - 32 - - - - 68064 - - - -
dmu_buf_impl_t 0x00100 - 26533248 - 384 - - - - 69097 - - - -
zil_lwb_cache 0x00100 - 0 - 392 - - - - 0 - - - -
zil_zcw_cache 0x00100 - 0 - 152 - - - - 0 - - - -
sio_cache_0 0x00100 - 0 - 136 - - - - 0 - - - -
sio_cache_1 0x00100 - 0 - 152 - - - - 0 - - - -
sio_cache_2 0x00100 - 0 - 168 - - - - 0 - - - -
zap_name 0x00100 - 2296 - 328 - - - - 7 - - - -
zap_attr_cache 0x00100 - 0 - 288 - - - - 0 - - - -
zap_name_long 0x00100 - 0 - 1096 - - - - 0 - - - -
zap_attr_long_cache 0x00100 - 0 - 1056 - - - - 0 - - - -
zfs_znode_cache 0x200100 - 6816 - 1136 - - - - 6 - - - -
zfs_znode_hold_cache 0x00100 - 0 - 88 - - - - 0 - - - -
slab after OOM:
--------------------- cache ------------------------------------------------------- ----- slab ------ ---- object ----- --- emergency ---
name flags size alloc slabsize objsize total alloc max total alloc max dlock alloc max
spl_zlib_workspace_cache 0x200080 0 0 2145216 268104 0 0 0 0 0 0 0 0 0
kcf_context_cache 0x00100 - 0 - 40 - - - - 0 - - - -
zfs_btree_leaf_cache 0x00100 - 0 - 4096 - - - - 0 - - - -
metaslab_alloc_trace_cache 0x00100 - 0 - 72 - - - - 0 - - - -
brt_entry_cache 0x00100 - 0 - 40 - - - - 0 - - - -
brt_pending_entry_cache 0x00100 - 0 - 160 - - - - 0 - - - -
ddt_cache 0x00080 0 0 234496 29264 0 0 0 0 0 0 0 0 0
ddt_entry_flat_cache 0x00100 - 0 - 240 - - - - 0 - - - -
ddt_entry_trad_cache 0x00100 - 0 - 424 - - - - 0 - - - -
ddt_log_entry_flat_cache 0x00100 - 0 - 144 - - - - 0 - - - -
ddt_log_entry_trad_cache 0x00100 - 0 - 328 - - - - 0 - - - -
zio_cache 0x00100 - 0 - 1216 - - - - 0 - - - -
zio_link_cache 0x00100 - 0 - 48 - - - - 0 - - - -
zio_buf_comb_512 0x200102 - 0 - 512 - - - - 0 - - - -
zio_buf_comb_1024 0x200102 - 0 - 1024 - - - - 0 - - - -
zio_buf_comb_1536 0x00102 - 0 - 1536 - - - - 0 - - - -
zio_buf_comb_2048 0x00102 - 0 - 2048 - - - - 0 - - - -
zio_buf_comb_3072 0x00102 - 0 - 3072 - - - - 0 - - - -
zio_buf_comb_4096 0x00102 - 0 - 4096 - - - - 0 - - - -
zio_buf_comb_6144 0x00102 - 0 - 6144 - - - - 0 - - - -
zio_buf_comb_8192 0x00102 - 0 - 8192 - - - - 0 - - - -
zio_buf_comb_12288 0x00102 - 0 - 12288 - - - - 0 - - - -
zio_buf_comb_16384 0x00102 - 0 - 16384 - - - - 0 - - - -
zio_buf_comb_24576 0x00082 0 0 233472 24576 0 0 0 0 0 0 0 0 0
zio_buf_comb_32768 0x00082 0 0 299008 32768 0 0 0 0 0 0 0 0 0
zio_buf_comb_49152 0x00082 0 0 430080 49152 0 0 0 0 0 0 0 0 0
zio_buf_comb_65536 0x00082 0 0 561152 65536 0 0 0 0 0 0 0 0 0
zio_buf_comb_98304 0x00082 0 0 823296 98304 0 0 0 0 0 0 0 0 0
zio_buf_comb_131072 0x00082 1085440 1048576 1085440 131072 1 1 1 8 8 8 0 0 0
zio_buf_comb_196608 0x00082 0 0 1609728 196608 0 0 0 0 0 0 0 0 0
zio_buf_comb_262144 0x00082 0 0 2134016 262144 0 0 0 0 0 0 0 0 0
zio_buf_comb_393216 0x00082 0 0 3182592 393216 0 0 0 0 0 0 0 0 0
zio_buf_comb_524288 0x00082 0 0 4231168 524288 0 0 0 0 0 0 0 0 0
zio_buf_comb_786432 0x00082 0 0 6328320 786432 0 0 0 0 0 0 0 0 0
zio_buf_comb_1048576 0x00082 8425472 2097152 8425472 1048576 1 1 1 8 2 2 0 0 0
zio_buf_comb_1572864 0x00082 0 0 12619776 1572864 0 0 0 0 0 0 0 0 0
zio_buf_comb_2097152 0x00082 0 0 16814080 2097152 0 0 0 0 0 0 0 0 0
zio_buf_comb_3145728 0x00082 0 0 25202688 3145728 0 0 0 0 0 0 0 0 0
zio_buf_comb_4194304 0x00082 0 0 29392896 4194304 0 0 0 0 0 0 0 0 0
zio_buf_comb_6291456 0x00082 0 0 31481856 6291456 0 0 0 0 0 0 0 0 0
zio_buf_comb_8388608 0x00082 0 0 25182208 8388608 0 0 0 0 0 0 0 0 0
zio_buf_comb_12582912 0x00082 0 0 25178112 12582912 0 0 0 0 0 0 0 0 0
zio_buf_comb_16777216 0x00082 0 0 16785408 16777216 0 0 0 0 0 0 0 0 0
lz4_cache 0x200100 - 0 - 16384 - - - - 0 - - - -
abd_t 0x200100 - 96 - 96 - - - - 1 - - - -
sa_cache 0x200100 - 0 - 256 - - - - 0 - - - -
dnode_t 0x200100 - 0 - 952 - - - - 0 - - - -
arc_buf_hdr_t_full 0x200100 - 7680 - 240 - - - - 32 - - - -
arc_buf_hdr_t_l2only 0x00100 - 0 - 96 - - - - 0 - - - -
arc_buf_t 0x00100 - 0 - 32 - - - - 0 - - - -
dmu_buf_impl_t 0x00100 - 0 - 384 - - - - 0 - - - -
zil_lwb_cache 0x00100 - 0 - 392 - - - - 0 - - - -
zil_zcw_cache 0x00100 - 0 - 152 - - - - 0 - - - -
sio_cache_0 0x00100 - 0 - 136 - - - - 0 - - - -
sio_cache_1 0x00100 - 0 - 152 - - - - 0 - - - -
sio_cache_2 0x00100 - 0 - 168 - - - - 0 - - - -
zap_name 0x00100 - 0 - 328 - - - - 0 - - - -
zap_attr_cache 0x00100 - 0 - 288 - - - - 0 - - - -
zap_name_long 0x00100 - 0 - 1096 - - - - 0 - - - -
zap_attr_long_cache 0x00100 - 0 - 1056 - - - - 0 - - - -
zfs_znode_cache 0x200100 - 0 - 1136 - - - - 0 - - - -
zfs_znode_hold_cache 0x00100 - 0 - 88 - - - - 0 - - - -
@jtblck90 thanks for all the info. I've been able to reproduce in the lab, and I have a patch which should help. I'm still completing testing but I should be able to post a PR later today.
If you're able, could you please rerun your test with this patch? Thanks! https://github.com/robn/zfs/commit/52beaf57f3d2bcaad723a6544c6f69b7f3f4fae5
Note that this won't do anything about the Large kmem_alloc
warning (though it might not be seen as often or at all). That's a separate thing and I will deal with it later. Fortunately it's only a warning and I understand where it comes from, so there's nothing to worry about on that one.
@robn Thanks! I will test the patch and get back to you with the results.
@robn I have tested your patch by removing 4x2TB files from a zpool with the same configuration as above. The rm
command took around 10 minutes to complete and it completed successfully.
However, once the actual space reclamation started and used size on zpool started decreasing, I monitored the zpool state with watch zpool status -D
command and noticed that the number of DDT entries started to increase as well as their size on disk and in core. This process slowly consumed all the RAM and the VM went OOM.
I performed the second test, but this time, I decreased the zfs_max_async_dedup_frees
value to 10000 as mentioned in 16708 and this time the number of DDT entries started to decrease, slowly reducing used space without RAM usage increase.
Basically, I just needed to tune the parameter above and everything worked!
By the way, do you have any assumption when zfs 2.3.0 might be released and if your patch will be included in it?
@jtblck90 we'll pull it back in to the 2.3.0 release branch once the PR is finalized and merged to master.
@behlendorf Thank you, that's great news! Perhaps, you have some insight on when we could expect 2.3.0 full release?
System information
A testing VM has 64GB of RAM and 32GB of RAM set for ZFS ARC via min and max parameters. RAIDZ1 pool with is configured with deduplication and fast dedup feature is enabled and active.
zpool status
zpool config
zfs config
zpool status with DDT (zpool status -D)
arcstat
Describe the problem you're observing
When writing large files which in my test was 4 files 2TB each (8TB total) on a zpool with dedup enabled and fast dedup feature active, all ARC is used and total RAM consumption sits at around 47GB. When deleting the files, RAM usage grows and the system goes into OOM. This can be reproduced on other recordsizes as well (tested with 16K and 128K recordsize). Also, with lower amount of data and lower amount of RAM. Same can be observed with lots of small files but with equal total space occupied on zpool. If removing small files one by one, they can be deleted, but when attempting to remove lots of 1GB files simultaneously, this results in OOM. After the reset, zpool cannot be imported resulting in the same OOM condition.
Describe how to reproduce the problem
Write several large files on zpool with deduplication and fast dedup enabled. In my experiment, this was 4x2TB files. Total RAM - 64GB. Or, 4x1TB files but with lower amount of RAM (32GB). Try to remove the files with
rm
Include any warning/errors/backtraces from the system logs
I cannot find the OOM messages after the reset in journal so attaching the screenshot here.
From the journal log, I see the following events:
Oct 29 04:45:17 zfs-rc2-test kernel: Large kmem_alloc(74904, 0x1000), please file an issue at: https://github.com/openzfs/zfs/issues/new
Attaching full journal logs and dmesg logs just in case. log.txt dmesg.txt