koverstreet / bcachefs

bcachefs doesn't use hardware accelerated crc [bdf6d7c13504] #415

Closed marcin-github closed 11 months ago

marcin-github commented 2 years ago

It looks like CRC computation while reading uses only the software CRC function. I created two big files: one was written with crc selected as data_checksum, the other with xxhash. The filesystem is on an SSD. Benchmarks and output from perf top:

For the file with the crc checksum:

echo 3 > /proc/sys/vm/drop_caches ; sleep 1 ; time dd if=first-file.zst of=/dev/null bs=128k
356068+1 records in
356068+1 records out
46670624913 bytes (47 GB, 43 GiB) copied, 187,957 s, 248 MB/s

real    3m7,962s
user    0m1,225s
sys     1m0,733s

perf:

  37,92%  [kernel]                                             [k] crc32_body
   3,81%  perf                                                 [.] __symbols__insert
   3,75%  libz.so.1.2.11                                       [.] inflate_fast
   3,70%  [kernel]                                             [k] native_flush_tlb_global
   3,07%  [kernel]                                             [k] copy_user_enhanced_fast_string
   1,61%  perf                                                 [.] rb_next
   1,55%  libz.so.1.2.11                                       [.] crc32_z
   1,54%  [kernel]                                             [k] check_preemption_disabled
   1,09%  [kernel]                                             [k] xas_load
   1,01%  [kernel]                                             [k] lookup_address_in_pgd
   0,81%  [kernel]                                             [k] lookup_page_ext
   0,67%  [kernel]                                             [k] stackleak_erase
   0,60%  [kernel]                                             [k] psi_group_change
   0,59%  [kernel]                                             [k] __change_page_attr_set_clr
   0,57%  [kernel]                                             [k] cgroup_rstat_updated

and for xxhash:

# echo 3 > /proc/sys/vm/drop_caches ; sleep 1 ; time dd if=xxh_file.zst of=/dev/null bs=128k
356068+1 records in
356068+1 records out
46670624913 bytes (47 GB, 43 GiB) copied, 181,729 s, 257 MB/s

real    3m1,732s
user    0m1,624s
sys     1m21,934s
Samples: 263K of event 'cycles', 4000 Hz, Event count (approx.): 11323287977 lost: 0/0 drop: 0/0
Overhead  Shared Object                                        Symbol
  10,50%  [kernel]                                             [k] xxh64_update
   6,90%  [kernel]                                             [k] native_flush_tlb_global
   5,17%  [kernel]                                             [k] copy_user_enhanced_fast_string
   2,74%  [kernel]                                             [k] check_preemption_disabled
   1,85%  [kernel]                                             [k] xas_load
   1,56%  [kernel]                                             [k] lookup_address_in_pgd
   1,25%  [kernel]                                             [k] lookup_page_ext
   1,06%  [kernel]                                             [k] __change_page_attr_set_clr
   0,96%  [kernel]                                             [k] psi_group_change
   0,90%  [kernel]                                             [k] stackleak_erase
   0,88%  libz.so.1.2.11                                       [.] inflate_fast
   0,88%  [kernel]                                             [k] get_page_from_freelist
   0,86%  [kernel]                                             [k] rmqueue_bulk
   0,86%  [kernel]                                             [k] __page_table_check_zero
   0,80%  [kernel]                                             [k] __pagevec_lru_add

CPU is i5-6500T
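
The crc32c_intel module relies on the SSE4.2 CRC32 instruction and crc32_pclmul relies on PCLMULQDQ, so a quick sanity check is whether the CPU advertises those flags. The sketch below is only an illustration: `has_cpu_flag` is a hypothetical helper name, and the optional file argument is there purely so the function can be tried against a sample file instead of /proc/cpuinfo.

```shell
#!/bin/sh
# has_cpu_flag FLAG [FILE]  (hypothetical helper, not part of any tool)
# Succeeds if FLAG is listed on the first "flags" line of FILE.
# FILE defaults to /proc/cpuinfo.
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # Take the first "flags" line, put every word on its own line,
    # then look for an exact, whole-line match of the flag name.
    grep '^flags' "$file" | head -n 1 | tr ' \t' '\n\n' | grep -qxF -- "$flag"
}

# Report the two flags the accelerated CRC modules depend on.
if [ -r /proc/cpuinfo ]; then
    for f in sse4_2 pclmulqdq; do
        if has_cpu_flag "$f"; then
            echo "$f: present"
        else
            echo "$f: missing"
        fi
    done
fi
```

If both flags are reported as present, the hardware itself is not the limiting factor and the question becomes which implementation the kernel actually selected.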

list of loaded modules:

# lsmod |sort
acpi_pad               20480  0
aesni_intel           380928  0
af_packet              53248  2
auth_rpcgss           155648  1 nfsd
backlight              20480  4 video,drm_kms_helper,i915,drm
bfq                    94208  4
binfmt_misc            16384  1
bpfilter               16384  0
bridge                225280  1 br_netfilter
br_netfilter           32768  0
button                 20480  0
cdc_acm                32768  0
cfbcopyarea            16384  1 drm_kms_helper
cfbfillrect            16384  1 drm_kms_helper
cfbimgblt              16384  1 drm_kms_helper
configfs               53248  0
coretemp               16384  0
crc32c_intel           24576  4
crc32_pclmul           16384  0
crct10dif_pclmul       16384  1
cryptd                 28672  2 crypto_simd,ghash_clmulni_intel
crypto_simd            16384  1 aesni_intel
dm_mod                163840  31
drm                   561152  4 drm_kms_helper,i915,ttm
drm_kms_helper        327680  1 i915
drm_panel_orientation_quirks    24576  1 drm
fan                    16384  0
fb                    106496  78 drm_kms_helper,drm
fbdev                  16384  1 fb
fb_sys_fops            16384  1 drm_kms_helper
fixed_phy              16384  1 of_mdio
font                   20480  1 fb
fwnode_mdio            16384  1 of_mdio
ghash_clmulni_intel    16384  0
grace                  16384  2 nfsd,lockd
hwmon                  32768  1 coretemp
i2c_algo_bit           16384  1 i915
i2c_core              102400  6 drm_kms_helper,i2c_algo_bit,i2c_smbus,i2c_i801,i915,drm
i2c_i801               28672  0
i2c_smbus              16384  1 i2c_i801
i915                 3137536  2
idma64                 20480  0
intel_cstate           24576  0
intel_gtt              20480  1 i915
intel_lpss             16384  1 intel_lpss_pci
intel_lpss_pci         28672  0
intel_pch_thermal      16384  0
intel_powerclamp       20480  0
intel_uncore          212992  0
iosf_mbi               20480  1 i915
iptable_filter         16384  1
iptable_nat            16384  2
ip_tables              36864  2 iptable_filter,iptable_nat
irqbypass              16384  1 kvm
kvm                  1064960  1 kvm_intel
kvm_intel             307200  0
libphy                163840  6 r8169,mdio_devres,fwnode_mdio,of_mdio,realtek,fixed_phy
llc                    16384  2 bridge,stp
lockd                 110592  2 nfsd
mdio_devres            16384  1 r8169
md_mod                180224  3 raid1
mei                   118784  1 mei_me
mei_me                 40960  0
mfd_core               16384  1 intel_lpss
Module                  Size  Used by
mpt3sas               348160  2
nf_conntrack          126976  5 xt_conntrack,nf_nat,xt_nat,nf_conntrack_netlink,xt_MASQUERADE
nf_conntrack_netlink    57344  0
nf_defrag_ipv4         16384  1 nf_conntrack
nf_defrag_ipv6         20480  1 nf_conntrack
nf_nat                 45056  3 xt_nat,iptable_nat,xt_MASQUERADE
nfnetlink              20480  2 nf_conntrack_netlink
nfs_acl                16384  1 nfsd
nfsd                  589824  13
of_mdio                20480  1 mdio_devres
overlay               139264  1
r8169                  98304  0
raid1                  49152  1
raid_class             16384  1 mpt3sas
rapl                   20480  0
realtek                28672  1
scsi_transport_sas     49152  1 mpt3sas
snd                   106496  5 snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec,snd_timer,snd_pcm
snd_hda_codec         159744  2 snd_hda_codec_hdmi,snd_hda_intel
snd_hda_codec_hdmi     69632  1
snd_hda_core          106496  3 snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec
snd_hda_intel          49152  0
snd_intel_dspcfg       16384  1 snd_hda_intel
snd_pcm               135168  4 snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec,snd_hda_core
snd_timer              40960  1 snd_pcm
soundcore              16384  1 snd
stp                    16384  1 bridge
sunrpc                638976  15 nfsd,auth_rpcgss,lockd,nfs_acl
syscopyarea            16384  1 drm_kms_helper
sysfillrect            16384  1 drm_kms_helper
sysimgblt              16384  1 drm_kms_helper
thermal                20480  0
tiny_power_button      16384  0
ttm                    86016  1 i915
veth                   36864  0
video                  49152  1 i915
virt_dma               16384  1 idma64
x86_pkg_temp_thermal    20480  0
xfrm_algo              16384  1 xfrm_user
xfrm_user              45056  1
xhci_hcd              274432  1 xhci_pci
xhci_pci               20480  0
x_tables               49152  8 xt_conntrack,iptable_filter,xt_tcpudp,xt_addrtype,xt_nat,ip_tables,iptable_nat,xt_MASQUERADE
xt_addrtype            16384  2
xt_conntrack           16384  4
xt_MASQUERADE          16384  8
xt_nat                 16384  8
xt_tcpudp              16384  16
zram                   36864  1

marcin-github commented 2 years ago

I compiled crc32c_intel and crc32_pclmul into the kernel and now I can't see crc32_body in perf. I see crc_128, but it is very low in the perf output; crc_128 looks to me like it comes from crc32c-pcl-intel-asm.

# echo 3 > /proc/sys/vm/drop_caches ; sleep 1 ; time dd if=2021-11-26-1637885176-fdns_any.json.zst of=/dev/null bs=128k
356068+1 records in
356068+1 records out
46670624913 bytes (47 GB, 43 GiB) copied, 181,14 s, 258 MB/s

real    3m1,145s
user    0m1,287s
sys     1m23,870s
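
To see which CRC implementations are registered with the kernel crypto API, and at what priority, /proc/crypto can be read directly. This is a minimal sketch, assuming the usual /proc/crypto layout in which each entry lists `name`, `driver` and `priority` in that order; `list_crc_drivers` is a hypothetical helper name, and its optional file argument exists only so it can be run against a sample file.

```shell
#!/bin/sh
# list_crc_drivers [FILE]  (hypothetical helper, not part of any tool)
# Prints "name driver priority" for every crc32 / crc32c entry in FILE.
# FILE defaults to /proc/crypto.
list_crc_drivers() {
    awk -F ' *: *' '
        $1 == "name"   { name = $2 }      # remember the algorithm name
        $1 == "driver" { driver = $2 }    # remember the implementation
        # "priority" follows "driver" within an entry, so both are known here
        $1 == "priority" && name ~ /^crc32c?$/ { print name, driver, $2 }
    ' "${1:-/proc/crypto}"
}

if [ -r /proc/crypto ]; then
    list_crc_drivers
fi
```

When an algorithm is requested by name, the crypto API prefers the registered implementation with the highest priority, so an accelerated driver should appear here with a higher priority than the generic one if it is available.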

jpsollie commented 2 years ago

bcachefs doesn't support the crypto engine yet, so all checksums are computed in software. This would be a nice enhancement once it becomes stable enough to handle async crypto.