That really seems like the disk is timing out underneath ZFS and resetting, not necessarily ZFS doing something mad, and I don't know that it has anything to do with the FRAG levels? What models are the disks?
The disks: Samsung SSD 850 EVO 250GB
One issue caused by the FRAG levels that is visible with 2.1.1: "zdb -mmm zfs-235cd8b3-835a-4216-a38b-b52c5c566f55" is multiple times slower than with version 0.8.2 while calling metaslab_load.
Do you have autotrim on, perchance? I understand the Samsung 8xx SSDs do not play nice in the sandbox if you issue them TRIMs sometimes, which is why Linux has quirks for them to try and avoid those problems.
4.7 is quite, quite old though, it's not impossible it doesn't know that...
No, that seems to be in 4.7.x and dated to 4.1-rc4. Hm.
There have been weird interactions before with the 8xx EVO drives spitting up DMA errors when you ask them to TRIM on some controllers in some ways, though.
What does "zpool get all [pool]" say?
That's curious - I believe 2.1 added things to try and limit how much digging through metaslabs it does when fitting allocations, which should have the opposite effect. I stand by suggesting you stick your nose in perf, maybe with a FlameGraph, and see where it's spending time, possibly on 0.8 versus 2.1.
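For the FlameGraph route, the intermediate "folded" format is one line per unique call stack: frames joined by semicolons, root first, followed by a sample count. Below is a minimal Python sketch of that folding step, not the FlameGraph tooling itself. The frame names come from the zdb call chain reported further down in this thread; the sample list and the helper name fold_stacks are hypothetical and only there for illustration.

```python
from collections import Counter


def fold_stacks(stacks):
    """Collapse call stacks (root first, leaf last) into folded lines.

    Each output line is "frame1;frame2;...;frameN count", the
    semicolon-joined stack followed by how often it was sampled.
    """
    counts = Counter(";".join(frames) for frames in stacks)
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]


# Hypothetical samples: three stacks, two of them identical.
samples = [
    ["main", "dump_zpool", "dump_metaslab", "metaslab_load", "space_map_iterate"],
    ["main", "dump_zpool", "dump_metaslab", "metaslab_load", "range_tree_walk"],
    ["main", "dump_zpool", "dump_metaslab", "metaslab_load", "space_map_iterate"],
]

folded = fold_stacks(samples)
for line in folded:
    print(line)
```

Folding the same workload on 0.8 and on 2.1 and comparing the counts per stack is one way to see which call paths grew between the two versions.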
Autotrim is on.
I will retest the pool with a newer Linux version.
zpool get all
zpool get all zfs-235cd8b3-835a-4216-a38b-b52c5c566f55
NAME PROPERTY VALUE SOURCE
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 size 2.70T -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 capacity 57% -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 altroot - default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 health ONLINE -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 guid 4928019990347038273 -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 version - default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 bootfs - default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 delegation on default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 autoreplace off default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 cachefile - default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 failmode continue local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 listsnapshots off default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 autoexpand off default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 dedupratio 1.00x -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 free 1.14T -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 allocated 1.56T -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 readonly off -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 ashift 9 local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 comment - default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 expandsize - -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 freeing 0 -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 fragmentation 78% -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 leaked 0 -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 multihost off default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 checkpoint - -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 load_guid 4585150976694631634 -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 autotrim on local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 compatibility off default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@async_destroy enabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@empty_bpobj active local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@lz4_compress active local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@multi_vdev_crash_dump disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@spacemap_histogram active local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@enabled_txg active local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@hole_birth active local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@extensible_dataset active local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@embedded_data active local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@bookmarks enabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@filesystem_limits enabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@large_blocks active local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@large_dnode enabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@sha512 disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@skein disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@edonr disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@userobj_accounting disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@encryption disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@project_quota disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@device_removal disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@obsolete_counts disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@zpool_checkpoint disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@spacemap_v2 disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@allocation_classes disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@resilver_defer disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@bookmark_v2 disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@redaction_bookmarks disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@redacted_datasets disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@bookmark_written disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@log_spacemap disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@livelist disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@device_rebuild disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@zstd_compress disabled local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 feature@draid disabled local
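To pull the handful of properties discussed in this thread (capacity, fragmentation, autotrim, ashift) out of a long listing like the one above, a small parser is enough. This is a rough Python sketch that assumes the four-column NAME / PROPERTY / VALUE / SOURCE layout shown here; the function name parse_zpool_get is hypothetical, and the sample text is a subset of the lines above.

```python
def parse_zpool_get(text):
    """Parse `zpool get all` style output into {property: (value, source)}.

    Assumes whitespace-separated NAME PROPERTY VALUE SOURCE columns and
    values without embedded spaces (a free-text `comment` property
    containing spaces would not parse correctly with this sketch).
    """
    props = {}
    for line in text.strip().splitlines():
        fields = line.split(None, 3)
        # Skip the header row and anything that is not a full 4-column row.
        if len(fields) < 4 or fields[:2] == ["NAME", "PROPERTY"]:
            continue
        _name, prop, value, source = fields
        props[prop] = (value, source)
    return props


sample = """\
NAME PROPERTY VALUE SOURCE
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 capacity 57% -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 ashift 9 local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 fragmentation 78% -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 autotrim on local
"""

props = parse_zpool_get(sample)
for key in ("capacity", "fragmentation", "autotrim", "ashift"):
    value, source = props[key]
    print(f"{key:14} {value:>5}  (source: {source})")
```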
Yes, 2.1 should improve the metaslab issue; see https://www.illumos.org/issues/11971
The output of "perf top", which gives an indication of where the CPU spends its time during "zdb -mmm <pool>":
42.22% libzpool.so.5.0.0 [.] zfs_btree_verify_counts_helper
15.66% libzpool.so.5.0.0 [.] zfs_btree_verify_height_helper
11.73% libzpool.so.5.0.0 [.] zfs_btree_verify_pointers_helper
0.34% libzpool.so.5.0.0 [.] zfs_btree_find_in_buf.isra.8
0.34% libpthread-2.26.so [.] __pthread_mutex_lock
0.31% libzpool.so.5.0.0 [.] range_tree_seg32_compare
0.28% libzpool.so.5.0.0 [.] range_tree_add_impl
0.24% libruby23.so.2.3.0 [.] 0x000000000008ac37
0.24% libpthread-2.26.so [.] __pthread_mutex_unlock_usercnt
74.43% 0.08% libzpool.so.5.0.0 [.] zfs_btree_verify
|
--13.88%--zfs_btree_verify
|
|--16.76%--zfs_btree_verify_counts_helper
| |
| --42.25%--zfs_btree_verify_counts_helper
| |
| --31.50%--zfs_btree_verify_counts_helper
|
|--13.06%--zfs_btree_verify_pointers_helper
| |
| --12.76%--zfs_btree_verify_pointers_helper
|
--8.98%--zfs_btree_verify_height_helper
|
--18.47%--zfs_btree_verify_height_helper
|
--11.54%--zfs_btree_verify_height_helper
69.52% 0.04% libzpool.so.5.0.0 [.] zfs_btree_add_idx
|
--9.92%--zfs_btree_add_idx
|
--10.20%--zfs_btree_verify
|
|--14.50%--zfs_btree_verify_counts_helper
| |
| --39.38%--zfs_btree_verify_counts_helper
| |
| --29.32%--zfs_btree_verify_counts_helper
|
|--12.15%--zfs_btree_verify_pointers_helper
| |
| --11.86%--zfs_btree_verify_pointers_helper
|
--7.86%--zfs_btree_verify_height_helper
|
--17.14%--zfs_btree_verify_height_helper
|
--10.67%--zfs_btree_verify_height_helper
66.01% 0.06% libzpool.so.5.0.0 [.] space_map_iterate
|
--7.36%--space_map_iterate
|
--7.51%--space_map_load_callback
|
|--5.04%--range_tree_add_impl
| |
| |--3.53%--zfs_btree_add_idx
| | |
| | --3.80%--zfs_btree_verify
| | |
| | |--10.65%--zfs_btree_verify_pointers_helper
| | | |
| | | --10.47%--zfs_btree_verify_pointers_helper
| | |
| | |--9.70%--zfs_btree_verify_counts_helper
| | | |
| | | --33.14%--zfs_btree_verify_counts_helper
| | | |
| | | --24.85%--zfs_btree_verify_counts_helper
| | |
| | --6.34%--zfs_btree_verify_height_helper
| | |
| | --15.27%--zfs_btree_verify_height_helper
| | |
| | --9.67%--zfs_btree_verify_height_helper
| |
| --1.28%--zfs_btree_verify
|
--2.60%--range_tree_remove_impl
|
--2.78%--zfs_btree_verify
|
|--2.05%--zfs_btree_verify_counts_helper
| |
| --2.05%--zfs_btree_verify_counts_helper
| |
| --1.56%--zfs_btree_verify_counts_helper
|
--0.95%--zfs_btree_verify_height_helper
|
--0.95%--zfs_btree_verify_height_helper
65.40% 0.02% libzpool.so.5.0.0 [.] space_map_load_callback
|
--7.50%--space_map_load_callback
|
|--5.04%--range_tree_add_impl
| |
| |--3.53%--zfs_btree_add_idx
| | |
| | --3.79%--zfs_btree_verify
| | |
| | |--10.61%--zfs_btree_verify_pointers_helper
| | | |
| | | --10.43%--zfs_btree_verify_pointers_helper
| | |
| | |--9.66%--zfs_btree_verify_counts_helper
| | | |
| | | --32.99%--zfs_btree_verify_counts_helper
| | | |
| | | --24.75%--zfs_btree_verify_counts_helper
| | |
| | --6.31%--zfs_btree_verify_height_helper
| | |
| | --15.22%--zfs_btree_verify_height_helper
| | |
| | --9.64%--zfs_btree_verify_height_helper
| |
| --1.28%--zfs_btree_verify
|
--2.60%--range_tree_remove_impl
|
--2.74%--zfs_btree_verify
|
|--1.96%--zfs_btree_verify_counts_helper
| |
| --1.95%--zfs_btree_verify_counts_helper
| |
| --1.48%--zfs_btree_verify_counts_helper
|
--0.92%--zfs_btree_verify_height_helper
|
--0.91%--zfs_btree_verify_height_helper
61.77% 0.19% libzpool.so.5.0.0 [.] range_tree_add_impl
|
--4.86%--range_tree_add_impl
|
|--3.53%--zfs_btree_add_idx
| |
| --3.79%--zfs_btree_verify
| |
| |--10.61%--zfs_btree_verify_pointers_helper
| | |
| | --10.43%--zfs_btree_verify_pointers_helper
| |
| |--9.66%--zfs_btree_verify_counts_helper
| | |
| | --32.99%--zfs_btree_verify_counts_helper
| | |
| | --24.75%--zfs_btree_verify_counts_helper
| |
| --6.31%--zfs_btree_verify_height_helper
| |
| --15.22%--zfs_btree_verify_height_helper
| |
| --9.64%--zfs_btree_verify_height_helper
|
--1.28%--zfs_btree_verify
42.84% 42.72% libzpool.so.5.0.0 [.] zfs_btree_verify_counts_helper
|
--8.01%--0xa026258d4c544155
__libc_start_main
main
dump_zpool
dump_metaslab
metaslab_load
|
|--4.83%--range_tree_walk
| metaslab_size_sorted_add
| zfs_btree_add
| zfs_btree_add_idx
| zfs_btree_verify
| zfs_btree_verify_counts_helper
| |
| --6.39%--zfs_btree_verify_counts_helper
| |
| --4.57%--zfs_btree_verify_counts_helper
|
--3.83%--space_map_load_length
space_map_iterate
space_map_load_callback
|
|--2.56%--range_tree_add_impl
| |
| --9.70%--zfs_btree_add_idx
| zfs_btree_verify
| zfs_btree_verify_counts_helper
| |
| --33.13%--zfs_btree_verify_counts_helper
| |
| --24.85%--zfs_btree_verify_counts_helper
|
--1.56%--range_tree_remove_impl
|
--2.05%--zfs_btree_verify
zfs_btree_verify_counts_helper
|
--2.05%--zfs_btree_verify_counts_helper
|
--1.56%--zfs_btree_verify_counts_helper
18.74% 18.69% libzpool.so.5.0.0 [.] zfs_btree_verify_height_helper
|
--3.17%--0xa026258d4c544155
__libc_start_main
main
dump_zpool
dump_metaslab
metaslab_load
|
|--1.99%--space_map_load_length
| space_map_iterate
| space_map_load_callback
| |
| --1.42%--range_tree_add_impl
| |
| --6.34%--zfs_btree_add_idx
| zfs_btree_verify
| zfs_btree_verify_height_helper
| |
| --15.27%--zfs_btree_verify_height_helper
| |
| --9.67%--zfs_btree_verify_height_helper
|
--1.54%--range_tree_walk
metaslab_size_sorted_add
zfs_btree_add
zfs_btree_add_idx
zfs_btree_verify
zfs_btree_verify_height_helper
|
--1.91%--zfs_btree_verify_height_helper
|
--1.02%--zfs_btree_verify_height_helper
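The flat "perf top" profile earlier in this comment can be summarised by adding up the self-time of all zfs_btree_verify* symbols. The Python sketch below does that for the nine lines pasted above; the helper name share_by_prefix is hypothetical. On those lines the three verify helpers add up to about 69.6% of samples, which suggests that most of the zdb run time goes to B-tree verification reached via zfs_btree_add_idx rather than to the space map iteration itself.

```python
PERF_TOP = """\
42.22% libzpool.so.5.0.0 [.] zfs_btree_verify_counts_helper
15.66% libzpool.so.5.0.0 [.] zfs_btree_verify_height_helper
11.73% libzpool.so.5.0.0 [.] zfs_btree_verify_pointers_helper
0.34% libzpool.so.5.0.0 [.] zfs_btree_find_in_buf.isra.8
0.34% libpthread-2.26.so [.] __pthread_mutex_lock
0.31% libzpool.so.5.0.0 [.] range_tree_seg32_compare
0.28% libzpool.so.5.0.0 [.] range_tree_add_impl
0.24% libruby23.so.2.3.0 [.] 0x000000000008ac37
0.24% libpthread-2.26.so [.] __pthread_mutex_unlock_usercnt
"""


def share_by_prefix(report, prefix):
    """Sum the percentage column for symbols starting with `prefix`.

    Expects lines of the form "<pct>% <object> [.] <symbol>".
    """
    total = 0.0
    for line in report.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[3].startswith(prefix):
            total += float(fields[0].rstrip("%"))
    return total


verify_share = share_by_prefix(PERF_TOP, "zfs_btree_verify")
print(f"zfs_btree_verify*: {verify_share:.2f}% of samples")
```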
System information
Describe the problem you're observing
I am using a pool that shows high fragmentation of its free space.
The pool uses twelve 250 GB SSDs.
This pool was created with version 0.8.2. With that version, the pool shows low write performance due to high CPU load during metaslab allocation (metaslab_load). All disks work fine, and no write errors are visible with version 0.8.2.
Fragmentation:
sample space map object:
With 2.1.1, "zdb -mmm zfs-235cd8b3-835a-4216-a38b-b52c5c566f55" is multiple times slower than with version 0.8.2. "zdb -mm zfs-235cd8b3-835a-4216-a38b-b52c5c566f55" is fast.
Using this pool with version 2.1.1 now shows write errors. Other pools are working fine on my setup.
Describe how to reproduce the problem
Using the fio tool to write data
Include any warning/errors/backtraces from the system logs
shows
resulting in