openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

AMD Piledriver: ZFS doesn't seem to use FPU #9570

Closed wildy closed 4 years ago

wildy commented 4 years ago

System information

Type                  Version/Name
Distribution Name     Debian
Distribution Version  buster
Linux Kernel          4.19.0-6-amd64
Architecture          amd64
ZFS Version           0.8.2-2~bpo10+1
SPL Version           0.8.2-2~bpo10+1

Describe the problem you're observing

Hi. I've installed Debian Buster on an FX-8350 machine, and ZFS doesn't appear to use the FPU: only the scalar implementations are offered, with no SSE/AVX variants. According to #9415 the problem should exist only with 5.0 and later kernels, but I'm seeing it on 4.19. I'm also using an encrypted pool.

I followed this guide initially: https://github.com/zfsonlinux/zfs/wiki/Debian-Buster-Root-on-ZFS

Describe how to reproduce the problem

Install Debian Buster on an AMD Piledriver machine, then install ZFS from backports according to https://github.com/zfsonlinux/zfs/wiki/Debian-Buster-Root-on-ZFS

Include any warning/errors/backtraces from the system logs

# cat /sys/module/zfs/parameters/zfs_vdev_raidz_impl 
[fastest] original scalar
# zfs get all tank
NAME  PROPERTY              VALUE                  SOURCE
tank  type                  filesystem             -
tank  creation              Fri Nov  8 11:57 2019  -
tank  used                  4.49T                  -
tank  available             643G                   -
tank  referenced            4.10G                  -
tank  compressratio         1.03x                  -
tank  mounted               yes                    -
tank  quota                 none                   default
tank  reservation           none                   default
tank  recordsize            128K                   default
tank  mountpoint            /tank                  local
tank  sharenfs              off                    default
tank  checksum              on                     default
tank  compression           lz4                    local
tank  atime                 on                     default
tank  devices               on                     default
tank  exec                  on                     default
tank  setuid                on                     default
tank  readonly              off                    default
tank  zoned                 off                    default
tank  snapdir               hidden                 default
tank  aclinherit            restricted             default
tank  createtxg             1                      -
tank  canmount              on                     default
tank  xattr                 sa                     local
tank  copies                1                      default
tank  version               5                      -
tank  utf8only              on                     -
tank  normalization         formD                  -
tank  casesensitivity       sensitive              -
tank  vscan                 off                    default
tank  nbmand                off                    default
tank  sharesmb              off                    default
tank  refquota              none                   default
tank  refreservation        none                   default
tank  guid                  13720228742997107999   -
tank  primarycache          all                    default
tank  secondarycache        all                    default
tank  usedbysnapshots       0B                     -
tank  usedbydataset         4.10G                  -
tank  usedbychildren        4.48T                  -
tank  usedbyrefreservation  0B                     -
tank  logbias               latency                default
tank  objsetid              54                     -
tank  dedup                 off                    default
tank  mlslabel              none                   default
tank  sync                  standard               default
tank  dnodesize             auto                   local
tank  refcompressratio      1.00x                  -
tank  written               4.10G                  -
tank  logicalused           4.63T                  -
tank  logicalreferenced     4.10G                  -
tank  volmode               default                default
tank  filesystem_limit      none                   default
tank  snapshot_limit        none                   default
tank  filesystem_count      none                   default
tank  snapshot_count        none                   default
tank  snapdev               hidden                 default
tank  acltype               posixacl               local
tank  context               none                   default
tank  fscontext             none                   default
tank  defcontext            none                   default
tank  rootcontext           none                   default
tank  relatime              on                     local
tank  redundant_metadata    all                    default
tank  overlay               off                    default
tank  encryption            aes-256-gcm            -
tank  keylocation           prompt                 local
tank  keyformat             passphrase             -
tank  pbkdf2iters           342K                   -
tank  encryptionroot        tank                   -
tank  keystatus             available              -
tank  special_small_blocks  0                      default
# cat /proc/spl/kstat/zfs/fletcher_4_bench
5 0 0x01 -1 0 3563867496 1274574892174
implementation   native         byteswap       
scalar           4615670632     4392800220     
superscalar      6225899889     4935713799     
superscalar4     6280394968     5292023883     
fastest          superscalar4   superscalar4   
# cat /proc/cpuinfo 
processor   : 0
vendor_id   : AuthenticAMD
cpu family  : 21
model       : 2
model name  : AMD FX(tm)-8350 Eight-Core Processor
stepping    : 0
microcode   : 0x6000822
cpu MHz     : 1404.336
cache size  : 2048 KB
physical id : 0
siblings    : 8
core id     : 0
cpu cores   : 4
apicid      : 16
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate ssbd vmmcall bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bugs        : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips    : 8026.61
TLB size    : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

processor   : 1
vendor_id   : AuthenticAMD
cpu family  : 21
model       : 2
model name  : AMD FX(tm)-8350 Eight-Core Processor
stepping    : 0
microcode   : 0x6000822
cpu MHz     : 1404.316
cache size  : 2048 KB
physical id : 0
siblings    : 8
core id     : 1
cpu cores   : 4
apicid      : 17
initial apicid  : 1
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate ssbd vmmcall bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bugs        : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips    : 8026.61
TLB size    : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

processor   : 2
vendor_id   : AuthenticAMD
cpu family  : 21
model       : 2
model name  : AMD FX(tm)-8350 Eight-Core Processor
stepping    : 0
microcode   : 0x6000822
cpu MHz     : 1404.358
cache size  : 2048 KB
physical id : 0
siblings    : 8
core id     : 2
cpu cores   : 4
apicid      : 18
initial apicid  : 2
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate ssbd vmmcall bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bugs        : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips    : 8026.61
TLB size    : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

processor   : 3
vendor_id   : AuthenticAMD
cpu family  : 21
model       : 2
model name  : AMD FX(tm)-8350 Eight-Core Processor
stepping    : 0
microcode   : 0x6000822
cpu MHz     : 1404.337
cache size  : 2048 KB
physical id : 0
siblings    : 8
core id     : 3
cpu cores   : 4
apicid      : 19
initial apicid  : 3
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate ssbd vmmcall bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bugs        : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips    : 8026.61
TLB size    : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

processor   : 4
vendor_id   : AuthenticAMD
cpu family  : 21
model       : 2
model name  : AMD FX(tm)-8350 Eight-Core Processor
stepping    : 0
microcode   : 0x6000822
cpu MHz     : 1404.405
cache size  : 2048 KB
physical id : 0
siblings    : 8
core id     : 4
cpu cores   : 4
apicid      : 20
initial apicid  : 4
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate ssbd vmmcall bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bugs        : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips    : 8026.61
TLB size    : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

processor   : 5
vendor_id   : AuthenticAMD
cpu family  : 21
model       : 2
model name  : AMD FX(tm)-8350 Eight-Core Processor
stepping    : 0
microcode   : 0x6000822
cpu MHz     : 1404.287
cache size  : 2048 KB
physical id : 0
siblings    : 8
core id     : 5
cpu cores   : 4
apicid      : 21
initial apicid  : 5
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate ssbd vmmcall bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bugs        : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips    : 8026.61
TLB size    : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

processor   : 6
vendor_id   : AuthenticAMD
cpu family  : 21
model       : 2
model name  : AMD FX(tm)-8350 Eight-Core Processor
stepping    : 0
microcode   : 0x6000822
cpu MHz     : 1404.188
cache size  : 2048 KB
physical id : 0
siblings    : 8
core id     : 6
cpu cores   : 4
apicid      : 22
initial apicid  : 6
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate ssbd vmmcall bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bugs        : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips    : 8026.61
TLB size    : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

processor   : 7
vendor_id   : AuthenticAMD
cpu family  : 21
model       : 2
model name  : AMD FX(tm)-8350 Eight-Core Processor
stepping    : 0
microcode   : 0x6000822
cpu MHz     : 1404.252
cache size  : 2048 KB
physical id : 0
siblings    : 8
core id     : 7
cpu cores   : 4
apicid      : 23
initial apicid  : 7
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate ssbd vmmcall bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bugs        : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips    : 8026.61
TLB size    : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro
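
For anyone who wants to check the same thing on their own machine, here is a minimal sketch that parses the `zfs_vdev_raidz_impl` line shown above and reports whether any vectorized implementation is offered. The set of SIMD implementation names is an assumption based on the x86 names OpenZFS normally lists, and the helper names (`parse_raidz_impl`, `has_simd`) are hypothetical.

```python
# Assumption: names of the x86 SIMD RAIDZ implementations OpenZFS can expose.
SIMD_IMPLS = {"sse2", "ssse3", "avx2", "avx512f", "avx512bw"}


def parse_raidz_impl(line):
    """Split a zfs_vdev_raidz_impl line into (selected, available).

    The currently selected implementation is the one wrapped in brackets.
    """
    selected = None
    available = []
    for name in line.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            selected = name
        available.append(name)
    return selected, available


def has_simd(available):
    """Return True if at least one known SIMD implementation is listed."""
    return any(name in SIMD_IMPLS for name in available)


if __name__ == "__main__":
    # Sample taken from the output in this report; on a live system, read
    # /sys/module/zfs/parameters/zfs_vdev_raidz_impl instead.
    sample = "[fastest] original scalar"
    selected, available = parse_raidz_impl(sample)
    print("selected:", selected)
    print("available:", available)
    print("SIMD offered:", has_simd(available))
```

With the line from this report the check comes back negative, which matches the `fletcher_4_bench` table listing only `scalar`, `superscalar` and `superscalar4`.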

mskarbek commented 4 years ago

This issue is not limited to 5.x kernels. It also affects 4.19.38 and newer (and 4.14.120 and newer, but that doesn't matter in this case). The Debian 4.19.0-6 package is based on kernel 4.19.67, so it is affected and there is not much to do about that. You can either rebuild your own kernel with the NixOS patch or use an older package.
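
The version ranges above can be turned into a quick check. This is only a sketch under the thresholds stated in this comment (4.14.120+, 4.19.38+, and 5.0+); other pre-5.0 series are treated as unaffected simply because they are not mentioned here, which is an assumption. The function name `is_affected` is hypothetical. Note that the upstream version has to be passed in (4.19.67 in this case), not the Debian package name 4.19.0-6.

```python
import re

# Assumption: first affected patch level per stable series, as stated above.
AFFECTED_STABLE = {(4, 14): 120, (4, 19): 38}


def parse_version(release):
    """Extract (major, minor, patch) from an upstream kernel version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognized kernel version: %r" % release)
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_affected(release):
    """Return True if the kernel falls in one of the ranges named above."""
    major, minor, patch = parse_version(release)
    if major >= 5:
        return True
    first_bad = AFFECTED_STABLE.get((major, minor))
    return first_bad is not None and patch >= first_bad


if __name__ == "__main__":
    for version in ("4.19.37", "4.19.67", "4.14.120", "5.3.0"):
        print(version, is_affected(version))
```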

wildy commented 4 years ago

I can see here that it apparently was fixed in 0.8.1?

mskarbek commented 4 years ago

> I can see here that it apparently was fixed in 0.8.1?

[insert confused/surprised gif here] There is no mention of an FPU fix in that news item. In the future, please read the release notes, not random news articles.

behlendorf commented 4 years ago

This will be resolved in 0.8.3, by #9515.