openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs
Other
10.43k stars 1.73k forks source link

the return of "Unaligned write command" errors #10094

Open dd1dd1 opened 4 years ago

dd1dd1 commented 4 years ago

Reporting an unusual situation. Have ZFS mirror array across two 1TB SSDs. It regularly spews "Unaligned write command" errors. From reading reports here and elsewhere, this problem used to exist, was fixed years ago, not supposed to happen today. So, a puzzle.

It turns out that the two SSDs report different physical sector size, one reports 512 bytes, one reports 4096 bytes. Same vendor, same model, same firmware. (WTH?!?)

zpool reports default ashift 0 (autodetect_. zdb reports ashift 12 (correct for 4096 sectors)

So everything seems to be correct, but the errors are there.

The "unaligned write command" errors only some from the "4096 bytes" SSD. After these write errors, "zpool scrub" runs without errors ("repaired 0B"). Two other zfs mirror pools on the same machine run without errors (2x10TB and 2x6TB disks, all report 4096 physical sector size).

K.O.

dd1dd1 commented 4 years ago

Some details. OS is CentOS-7, kernel 3.10.0-1062.12.1.el7.x86_64, zfs-0.8.3-1. Problem disks:

# ./smart-status.perl 
      Disk                  model               serial     temperature  realloc  pending   uncorr  CRC err     RRER
  /dev/sda WDC  WDS100T2B0A-00SM50         195004A00B9C              26        .        ?        ?        .        ?
  /dev/sdg WDC  WDS100T2B0A-00SM50         195008A008F8              26        .        ?        ?        .        ?
# more /sys/class/block/sda/queue/physical_block_size 
512
# more /sys/class/block/sdg/queue/physical_block_size 
4096
# zpool status zssd1tb
  pool: zssd1tb
 state: ONLINE
  scan: scrub repaired 0B in 0 days 00:05:55 with 0 errors on Mon Mar  2 13:24:15 2020
config:

        NAME                                         STATE     READ WRITE CKSUM
        zssd1tb                                      ONLINE       0     0     0
          mirror-0                                   ONLINE       0     0     0
            ata-WDC_WDS100T2B0A-00SM50_195004A00B9C  ONLINE       0     0     0
            ata-WDC_WDS100T2B0A-00SM50_195008A008F8  ONLINE       0     0     0

errors: No known data errors

Typical error from "dmesg"

[1974990.399004] ata8.00: exception Emask 0x0 SAct 0x8000000 SErr 0x0 action 0x6 frozen
[1974990.399009] ata8.00: failed command: WRITE FPDMA QUEUED
[1974990.399013] ata8.00: cmd 61/08:d8:e0:27:70/00:00:74:00:00/40 tag 27 ncq 4096 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[1974990.399015] ata8.00: status: { DRDY }
[1974990.399018] ata8: hard resetting link
[1974990.707014] ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[1974990.710637] ata8.00: configured for UDMA/133
[1974990.710690] sd 7:0:0:0: [sdg] tag#27 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[1974990.710693] sd 7:0:0:0: [sdg] tag#27 Sense Key : Illegal Request [current] [descriptor] 
[1974990.710695] sd 7:0:0:0: [sdg] tag#27 Add. Sense: Unaligned write command
[1974990.710698] sd 7:0:0:0: [sdg] tag#27 CDB: Write(10) 2a 00 74 70 27 e0 00 00 08 00
[1974990.710699] blk_update_request: I/O error, dev sdg, sector 1953507296
[1974990.710703] zio pool=zssd1tb vdev=/dev/disk/by-id/ata-WDC_WDS100T2B0A-00SM50_195008A008F8-part1 error=5 type=2 offset=1000194686976 size=4096 flags=180ac0
[1974990.710708] ata8: EH complete
shodanshok commented 4 years ago

Such errors almost always denote a problem due to a failing SSD/HDD. The system even tried to reset the SATA link, with the write still failing.

dd1dd1 commented 4 years ago

shodashok - failing hdd: no (this is an ssd), failing ssd: unlikely: (a) ssd is brand new, (b) failing devices usually throw read/write i/o errors, not "illegal request - unaligned write command", (c) bum ssd firmware (hello, Intel!) - unlikely, both ssds have the same firmware, only one throws the errors. (d) overheated ssd malfunction: no, ssd is at room temperature, cool to the touch. K.O.

dd1dd1 commented 4 years ago

additional information: moved the two problem ssds to different sata ports, now both report physical sector size 512. (one was reporting 4096). Now let's see if the errors go away. K.O.

shodanshok commented 4 years ago

@dd1dd1 I had an identical problem on a Crucial MX500 SSD. After some days, the SSD controller failed catastrophically (with the SSD no more discovered on BIOS screen). So, I really doubt it is a ZFS problem. The fact that simply swapping SATA port changed the reported sector size is really suspicious.

c0d3z3r0 commented 4 years ago

I'm experiencing the same issues. These errors appeared out of nothing. I highly doubt that all 4 disks die at exact the same time.

[  122.404043] ata6.00: exception Emask 0x10 SAct 0x20080 SErr 0x4050000 action 0xe frozen
[  122.412047] ata6.00: irq_stat 0x00000040, connection status changed
[  122.418332] ata6: SError: { PHYRdyChg CommWake DevExch }
[  122.423651] ata6.00: failed command: READ FPDMA QUEUED
[  122.428792] ata6.00: cmd 60/08:38:d8:99:76/00:00:31:00:00/40 tag 7 ncq dma 4096 in
                        res 50/00:18:30:90:6b/00:00:31:00:00/40 Emask 0x10 (ATA bus error)
[  122.444332] ata6.00: status: { DRDY }
[  122.447995] ata6.00: failed command: READ FPDMA QUEUED
[  122.453136] ata6.00: cmd 60/18:88:30:90:6b/00:00:31:00:00/40 tag 17 ncq dma 12288 in
                        res 50/00:18:30:90:6b/00:00:31:00:00/40 Emask 0x10 (ATA bus error)
[  122.468849] ata6.00: status: { DRDY }
[  122.472515] ata6: hard resetting link
[  123.184748] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  123.216709] ata6.00: configured for UDMA/133
[  123.218739] sd 5:0:0:0: [sdd] tag#7 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  123.218741] sd 5:0:0:0: [sdd] tag#7 Sense Key : Illegal Request [current] 
[  123.218742] sd 5:0:0:0: [sdd] tag#7 Add. Sense: Unaligned write command
[  123.218744] sd 5:0:0:0: [sdd] tag#7 CDB: Read(10) 28 00 31 76 99 d8 00 00 08 00
[  123.218745] print_req_error: I/O error, dev sdd, sector 829856216 flags 700
[  123.225719] zio pool=dpool vdev=/dev/mapper/slot4crypt error=5 type=1 offset=424882188288 size=4096 flags=1808a0
[  123.225726] sd 5:0:0:0: [sdd] tag#17 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  123.225728] sd 5:0:0:0: [sdd] tag#17 Sense Key : Illegal Request [current] 
[  123.225729] sd 5:0:0:0: [sdd] tag#17 Add. Sense: Unaligned write command
[  123.225730] sd 5:0:0:0: [sdd] tag#17 CDB: Read(10) 28 00 31 6b 90 30 00 00 18 00
[  123.225731] print_req_error: I/O error, dev sdd, sector 829132848 flags 700
[  123.232717] zio pool=dpool vdev=/dev/mapper/slot4crypt error=5 type=1 offset=424511823872 size=12288 flags=1808a0
[  123.232719] ata6: EH complete
[  156.953173] ata6.00: exception Emask 0x0 SAct 0x3 SErr 0x40000 action 0x6 frozen
[  156.960574] ata6: SError: { CommWake }
[  156.964329] ata6.00: failed command: READ FPDMA QUEUED
[  156.969472] ata6.00: cmd 60/20:00:98:08:03/00:00:06:00:00/40 tag 0 ncq dma 16384 in
                        res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  156.984493] ata6.00: status: { DRDY }
[  156.988158] ata6.00: failed command: READ FPDMA QUEUED
[  156.993299] ata6.00: cmd 60/40:08:b8:48:03/00:00:06:00:00/40 tag 1 ncq dma 32768 in
                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  157.008319] ata6.00: status: { DRDY }
[  157.011984] ata6: hard resetting link
[  157.327868] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  157.901907] ata6.00: configured for UDMA/133
[  157.902071] sd 5:0:0:0: [sdd] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  157.902073] sd 5:0:0:0: [sdd] tag#0 Sense Key : Illegal Request [current] 
[  157.902086] sd 5:0:0:0: [sdd] tag#0 Add. Sense: Unaligned write command
[  157.902087] sd 5:0:0:0: [sdd] tag#0 CDB: Read(10) 28 00 06 03 08 98 00 00 20 00
[  157.902089] print_req_error: I/O error, dev sdd, sector 100862104 flags 700
[  157.909063] zio pool=dpool vdev=/dev/mapper/slot4crypt error=5 type=1 offset=51637202944 size=16384 flags=1808a0
[  157.909086] sd 5:0:0:0: [sdd] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  157.909087] sd 5:0:0:0: [sdd] tag#1 Sense Key : Illegal Request [current] 
[  157.909087] sd 5:0:0:0: [sdd] tag#1 Add. Sense: Unaligned write command
[  157.909088] sd 5:0:0:0: [sdd] tag#1 CDB: Read(10) 28 00 06 03 48 b8 00 00 40 00
[  157.909089] print_req_error: I/O error, dev sdd, sector 100878520 flags 700
[  157.916051] zio pool=dpool vdev=/dev/mapper/slot4crypt error=5 type=1 offset=51645607936 size=32768 flags=40080ca0
[  157.916053] ata6: EH complete
[  158.945000] ata6.00: exception Emask 0x10 SAct 0x40000001 SErr 0x4040000 action 0xe frozen
[  158.953264] ata6.00: irq_stat 0x00000040, connection status changed
[  158.959531] ata6: SError: { CommWake DevExch }
[  158.963983] ata6.00: failed command: READ FPDMA QUEUED
[  158.969123] ata6.00: cmd 60/18:00:a8:48:00/00:00:16:00:00/40 tag 0 ncq dma 12288 in
                        res 50/00:18:a8:48:00/00:00:16:00:00/40 Emask 0x10 (ATA bus error)
[  158.984749] ata6.00: status: { DRDY }
[  158.988416] ata6.00: failed command: READ FPDMA QUEUED
[  158.993557] ata6.00: cmd 60/18:f0:68:41:00/00:00:16:00:00/40 tag 30 ncq dma 12288 in
                        res 50/00:18:a8:48:00/00:00:16:00:00/40 Emask 0x10 (ATA bus error)
[  159.009267] ata6.00: status: { DRDY }
[  159.012935] ata6: hard resetting link
[  159.729269] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  159.745053] ata6.00: configured for UDMA/133
[  159.746879] sd 5:0:0:0: [sdd] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  159.746880] sd 5:0:0:0: [sdd] tag#0 Sense Key : Illegal Request [current] 
[  159.746881] sd 5:0:0:0: [sdd] tag#0 Add. Sense: Unaligned write command
[  159.746882] sd 5:0:0:0: [sdd] tag#0 CDB: Read(10) 28 00 16 00 48 a8 00 00 18 00
[  159.746884] print_req_error: I/O error, dev sdd, sector 369117352 flags 700
[  159.753859] zio pool=dpool vdev=/dev/mapper/slot4crypt error=5 type=1 offset=188983889920 size=12288 flags=1808a0
[  159.753882] sd 5:0:0:0: [sdd] tag#30 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  159.753882] sd 5:0:0:0: [sdd] tag#30 Sense Key : Illegal Request [current] 
[  159.753883] sd 5:0:0:0: [sdd] tag#30 Add. Sense: Unaligned write command
[  159.753884] sd 5:0:0:0: [sdd] tag#30 CDB: Read(10) 28 00 16 00 41 68 00 00 18 00
[  159.753885] print_req_error: I/O error, dev sdd, sector 369115496 flags 700
[  159.760845] zio pool=dpool vdev=/dev/mapper/slot4crypt error=5 type=1 offset=188982939648 size=12288 flags=1808a0
[  159.760848] ata6: EH complete
[  163.273140] ata8.00: exception Emask 0x10 SAct 0x8080 SErr 0x4050000 action 0xe frozen
[  163.281060] ata8.00: irq_stat 0x00000040, connection status changed
[  163.287342] ata8: SError: { PHYRdyChg CommWake DevExch }
[  163.292670] ata8.00: failed command: READ FPDMA QUEUED
[  163.297813] ata8.00: cmd 60/18:38:d8:a9:5c/00:00:61:00:00/40 tag 7 ncq dma 12288 in
                        res 50/00:18:d8:a9:5c/00:00:61:00:00/40 Emask 0x10 (ATA bus error)
[  163.313455] ata8.00: status: { DRDY }
[  163.317141] ata8.00: failed command: READ FPDMA QUEUED
[  163.322284] ata8.00: cmd 60/18:78:30:59:76/00:00:61:00:00/40 tag 15 ncq dma 12288 in
                        res 50/00:18:d8:a9:5c/00:00:61:00:00/40 Emask 0x10 (ATA bus error)
[  163.338001] ata8.00: status: { DRDY }
[  163.341683] ata8: hard resetting link
[  164.053460] ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  164.081111] ata8.00: configured for UDMA/133
[  164.081341] sd 7:0:0:0: [sdf] tag#7 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  164.081343] sd 7:0:0:0: [sdf] tag#7 Sense Key : Illegal Request [current] 
[  164.081344] sd 7:0:0:0: [sdf] tag#7 Add. Sense: Unaligned write command
[  164.081346] sd 7:0:0:0: [sdf] tag#7 CDB: Read(10) 28 00 61 5c a9 d8 00 00 18 00
[  164.081347] print_req_error: I/O error, dev sdf, sector 1633462744 flags 700
[  164.088405] zio pool=dpool vdev=/dev/mapper/slot6crypt error=5 type=1 offset=836328730624 size=12288 flags=1808a0
[  164.088425] sd 7:0:0:0: [sdf] tag#15 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  164.088426] sd 7:0:0:0: [sdf] tag#15 Sense Key : Illegal Request [current] 
[  164.088427] sd 7:0:0:0: [sdf] tag#15 Add. Sense: Unaligned write command
[  164.088428] sd 7:0:0:0: [sdf] tag#15 CDB: Read(10) 28 00 61 76 59 30 00 00 18 00
[  164.088428] print_req_error: I/O error, dev sdf, sector 1635146032 flags 700
[  164.095480] zio pool=dpool vdev=/dev/mapper/slot6crypt error=5 type=1 offset=837190574080 size=12288 flags=1808a0
[  164.095482] ata8: EH complete
[  167.065416] ata7.00: exception Emask 0x10 SAct 0xc00000 SErr 0x4040000 action 0xe frozen
[  167.073513] ata7.00: irq_stat 0x00000040, connection status changed
[  167.079788] ata7: SError: { CommWake DevExch }
[  167.084249] ata7.00: failed command: READ FPDMA QUEUED
[  167.089389] ata7.00: cmd 60/60:b0:d8:d8:0b/00:00:59:00:00/40 tag 22 ncq dma 49152 in
                        res 50/00:60:60:b7:0c/00:00:59:00:00/40 Emask 0x10 (ATA bus error)
[  167.105099] ata7.00: status: { DRDY }
[  167.108764] ata7.00: failed command: READ FPDMA QUEUED
[  167.113907] ata7.00: cmd 60/60:b8:60:b7:0c/00:00:59:00:00/40 tag 23 ncq dma 49152 in
                        res 50/00:60:60:b7:0c/00:00:59:00:00/40 Emask 0x10 (ATA bus error)
[  167.129621] ata7.00: status: { DRDY }
[  167.133285] ata7: hard resetting link
[  167.845466] ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  167.855367] ata7.00: configured for UDMA/133
[  167.855548] sd 6:0:0:0: [sde] tag#22 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  167.855562] sd 6:0:0:0: [sde] tag#22 Sense Key : Illegal Request [current] 
[  167.855563] sd 6:0:0:0: [sde] tag#22 Add. Sense: Unaligned write command
[  167.855565] sd 6:0:0:0: [sde] tag#22 CDB: Read(10) 28 00 59 0b d8 d8 00 00 60 00
[  167.855566] print_req_error: I/O error, dev sde, sector 1493948632 flags 700
[  167.862622] zio pool=dpool vdev=/dev/mapper/slot5crypt error=5 type=1 offset=764897505280 size=49152 flags=40080ca0
[  167.862628] sd 6:0:0:0: [sde] tag#23 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  167.862629] sd 6:0:0:0: [sde] tag#23 Sense Key : Illegal Request [current] 
[  167.862645] sd 6:0:0:0: [sde] tag#23 Add. Sense: Unaligned write command
[  167.862645] sd 6:0:0:0: [sde] tag#23 CDB: Read(10) 28 00 59 0c b7 60 00 00 60 00
[  167.862646] print_req_error: I/O error, dev sde, sector 1494005600 flags 700
[  167.869712] zio pool=dpool vdev=/dev/mapper/slot5crypt error=5 type=1 offset=764926672896 size=49152 flags=40080ca0
[  167.869715] ata7: EH complete
[  195.866293] ata6: limiting SATA link speed to 3.0 Gbps
[  195.866295] ata6.00: exception Emask 0x0 SAct 0x102 SErr 0x50000 action 0x6 frozen
[  195.873874] ata6: SError: { PHYRdyChg CommWake }
[  195.878512] ata6.00: failed command: READ FPDMA QUEUED
[  195.883653] ata6.00: cmd 60/18:08:f0:59:50/00:00:51:00:00/40 tag 1 ncq dma 12288 in
                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  195.898670] ata6.00: status: { DRDY }
[  195.902337] ata6.00: failed command: READ FPDMA QUEUED
[  195.907480] ata6.00: cmd 60/38:40:40:52:14/00:00:51:00:00/40 tag 8 ncq dma 28672 in
                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  195.922496] ata6.00: status: { DRDY }
[  195.926165] ata6: hard resetting link
[  196.240878] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  196.802601] ata6.00: configured for UDMA/133
[  196.804425] ata6.00: device reported invalid CHS sector 0
[  196.804443] sd 5:0:0:0: [sdd] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  196.804444] sd 5:0:0:0: [sdd] tag#1 Sense Key : Illegal Request [current] 
[  196.804445] sd 5:0:0:0: [sdd] tag#1 Add. Sense: Unaligned write command
[  196.804447] sd 5:0:0:0: [sdd] tag#1 CDB: Read(10) 28 00 51 50 59 f0 00 00 18 00
[  196.804448] print_req_error: I/O error, dev sdd, sector 1364220400 flags 700
[  196.811513] zio pool=dpool vdev=/dev/mapper/slot4crypt error=5 type=1 offset=698476650496 size=12288 flags=1808a0
[  196.811536] sd 5:0:0:0: [sdd] tag#8 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  196.811537] sd 5:0:0:0: [sdd] tag#8 Sense Key : Illegal Request [current] 
[  196.811537] sd 5:0:0:0: [sdd] tag#8 Add. Sense: Unaligned write command
[  196.811538] sd 5:0:0:0: [sdd] tag#8 CDB: Read(10) 28 00 51 14 52 40 00 00 38 00
[  196.811539] print_req_error: I/O error, dev sdd, sector 1360286272 flags 700
[  196.818588] zio pool=dpool vdev=/dev/mapper/slot4crypt error=5 type=1 offset=696462376960 size=28672 flags=40080ca0
[  196.818590] ata6: EH complete
[  200.690450] ata6.00: exception Emask 0x10 SAct 0x100008 SErr 0x4050000 action 0xe frozen
[  200.698545] ata6.00: irq_stat 0x00000040, connection status changed
[  200.704828] ata6: SError: { PHYRdyChg CommWake DevExch }
[  200.710156] ata6.00: failed command: READ FPDMA QUEUED
[  200.715298] ata6.00: cmd 60/18:18:28:70:4e/00:00:59:00:00/40 tag 3 ncq dma 12288 in
                        res 50/00:18:28:70:4e/00:00:59:00:00/40 Emask 0x10 (ATA bus error)
[  200.730928] ata6.00: status: { DRDY }
[  200.734595] ata6.00: failed command: READ FPDMA QUEUED
[  200.739734] ata6.00: cmd 60/18:a0:b8:38:38/00:00:59:00:00/40 tag 20 ncq dma 12288 in
                        res 50/00:18:28:70:4e/00:00:59:00:00/40 Emask 0x10 (ATA bus error)
[  200.755449] ata6.00: status: { DRDY }
[  200.759115] ata6: hard resetting link
[  201.478431] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  201.550589] ata6.00: configured for UDMA/133
[  201.550772] sd 5:0:0:0: [sdd] tag#3 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  201.550774] sd 5:0:0:0: [sdd] tag#3 Sense Key : Illegal Request [current] 
[  201.550776] sd 5:0:0:0: [sdd] tag#3 Add. Sense: Unaligned write command
[  201.550778] sd 5:0:0:0: [sdd] tag#3 CDB: Read(10) 28 00 59 4e 70 28 00 00 18 00
[  201.550779] print_req_error: I/O error, dev sdd, sector 1498312744 flags 700
[  201.557839] zio pool=dpool vdev=/dev/mapper/slot4crypt error=5 type=1 offset=767131930624 size=12288 flags=1808a0
[  201.557845] sd 5:0:0:0: [sdd] tag#20 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  201.557846] sd 5:0:0:0: [sdd] tag#20 Sense Key : Illegal Request [current] 
[  201.557847] sd 5:0:0:0: [sdd] tag#20 Add. Sense: Unaligned write command
[  201.557848] sd 5:0:0:0: [sdd] tag#20 CDB: Read(10) 28 00 59 38 38 b8 00 00 18 00
[  201.557848] print_req_error: I/O error, dev sdd, sector 1496856760 flags 700
[  201.564911] zio pool=dpool vdev=/dev/mapper/slot4crypt error=5 type=1 offset=766386466816 size=12288 flags=1808a0
[  201.564914] ata6: EH complete
[  204.694296] ata8.00: exception Emask 0x10 SAct 0x10040 SErr 0x4050000 action 0xe frozen
[  204.702312] ata8.00: irq_stat 0x00000040, connection status changed
[  204.708597] ata8: SError: { PHYRdyChg CommWake DevExch }
[  204.713929] ata8.00: failed command: READ FPDMA QUEUED
[  204.719077] ata8.00: cmd 60/08:30:c0:59:9f/00:00:12:00:00/40 tag 6 ncq dma 4096 in
                        res 50/00:58:08:59:9f/00:00:12:00:00/40 Emask 0x10 (ATA bus error)
[  204.734625] ata8.00: status: { DRDY }
[  204.738288] ata8.00: failed command: READ FPDMA QUEUED
[  204.743445] ata8.00: cmd 60/58:80:08:59:9f/00:00:12:00:00/40 tag 16 ncq dma 45056 in
                        res 50/00:58:08:59:9f/00:00:12:00:00/40 Emask 0x10 (ATA bus error)
[  204.759159] ata8.00: status: { DRDY }
[  204.762828] ata8: hard resetting link
[  205.474571] ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  205.512273] ata8.00: configured for UDMA/133
[  205.512576] sd 7:0:0:0: [sdf] tag#6 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  205.512577] sd 7:0:0:0: [sdf] tag#6 Sense Key : Illegal Request [current] 
[  205.512578] sd 7:0:0:0: [sdf] tag#6 Add. Sense: Unaligned write command
[  205.512580] sd 7:0:0:0: [sdf] tag#6 CDB: Read(10) 28 00 12 9f 59 c0 00 00 08 00
[  205.512581] print_req_error: I/O error, dev sdf, sector 312433088 flags 700
[  205.519550] zio pool=dpool vdev=/dev/mapper/slot6crypt error=5 type=1 offset=159961546752 size=4096 flags=1808a0
[  205.519574] sd 7:0:0:0: [sdf] tag#16 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  205.519575] sd 7:0:0:0: [sdf] tag#16 Sense Key : Illegal Request [current] 
[  205.519575] sd 7:0:0:0: [sdf] tag#16 Add. Sense: Unaligned write command
[  205.519576] sd 7:0:0:0: [sdf] tag#16 CDB: Read(10) 28 00 12 9f 59 08 00 00 58 00
[  205.519577] print_req_error: I/O error, dev sdf, sector 312432904 flags 700
[  205.526538] zio pool=dpool vdev=/dev/mapper/slot6crypt error=5 type=1 offset=159961452544 size=45056 flags=40080ca0
[  205.526541] ata8: EH complete
[  205.638343] ata7.00: exception Emask 0x10 SAct 0x8080 SErr 0x4040000 action 0xe frozen
[  205.646267] ata7.00: irq_stat 0x00000040, connection status changed
[  205.652539] ata7: SError: { CommWake DevExch }
[  205.656991] ata7.00: failed command: READ FPDMA QUEUED
[  205.662150] ata7.00: cmd 60/20:38:e8:b7:00/00:00:12:00:00/40 tag 7 ncq dma 16384 in
                        res 50/00:20:70:82:00/00:00:12:00:00/40 Emask 0x10 (ATA bus error)
[  205.677775] ata7.00: status: { DRDY }
[  205.681441] ata7.00: failed command: READ FPDMA QUEUED
[  205.686594] ata7.00: cmd 60/20:78:70:82:00/00:00:12:00:00/40 tag 15 ncq dma 16384 in
                        res 50/00:20:70:82:00/00:00:12:00:00/40 Emask 0x10 (ATA bus error)
[  205.702310] ata7.00: status: { DRDY }
[  205.705993] ata7: hard resetting link
[  206.418610] ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  206.432832] ata7.00: configured for UDMA/133
[  206.433009] sd 6:0:0:0: [sde] tag#7 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  206.433010] sd 6:0:0:0: [sde] tag#7 Sense Key : Illegal Request [current] 
[  206.433011] sd 6:0:0:0: [sde] tag#7 Add. Sense: Unaligned write command
[  206.433012] sd 6:0:0:0: [sde] tag#7 CDB: Read(10) 28 00 12 00 b7 e8 00 00 20 00
[  206.433013] print_req_error: I/O error, dev sde, sector 302036968 flags 700
[  206.439985] zio pool=dpool vdev=/dev/mapper/slot5crypt error=5 type=1 offset=154638733312 size=16384 flags=1808a0
[  206.440007] sd 6:0:0:0: [sde] tag#15 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  206.440008] sd 6:0:0:0: [sde] tag#15 Sense Key : Illegal Request [current] 
[  206.440009] sd 6:0:0:0: [sde] tag#15 Add. Sense: Unaligned write command
[  206.440010] sd 6:0:0:0: [sde] tag#15 CDB: Read(10) 28 00 12 00 82 70 00 00 20 00
[  206.440010] print_req_error: I/O error, dev sde, sector 302023280 flags 700
[  206.446970] zio pool=dpool vdev=/dev/mapper/slot5crypt error=5 type=1 offset=154631725056 size=16384 flags=1808a0
[  206.446973] ata7: EH complete
[  211.422463] ata8.00: exception Emask 0x10 SAct 0x8000000 SErr 0x4050000 action 0xe frozen
[  211.430641] ata8.00: irq_stat 0x00000040, connection status changed
[  211.436912] ata8: SError: { PHYRdyChg CommWake DevExch }
[  211.442230] ata8.00: failed command: READ FPDMA QUEUED
[  211.447369] ata8.00: cmd 60/08:d8:d0:04:0f/02:00:00:00:00/40 tag 27 ncq dma 266240 in
                        res 50/00:50:78:04:0f/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
[  211.463168] ata8.00: status: { DRDY }
[  211.466835] ata8: hard resetting link
[  212.178766] ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  212.194763] ata8.00: configured for UDMA/133
[  212.195086] sd 7:0:0:0: [sdf] tag#27 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  212.195088] sd 7:0:0:0: [sdf] tag#27 Sense Key : Illegal Request [current] 
[  212.195090] sd 7:0:0:0: [sdf] tag#27 Add. Sense: Unaligned write command
[  212.195091] sd 7:0:0:0: [sdf] tag#27 CDB: Read(10) 28 00 00 0f 04 d0 00 02 08 00
[  212.195092] print_req_error: I/O error, dev sdf, sector 984272 flags 700
[  212.201805] zio pool=dpool vdev=/dev/mapper/slot6crypt error=5 type=1 offset=499752960 size=266240 flags=40080cb0
[  212.201813] ata8: EH complete
[  216.434765] ata8: limiting SATA link speed to 3.0 Gbps
[  216.434768] ata8.00: exception Emask 0x10 SAct 0x80 SErr 0x4050000 action 0xe frozen
[  216.442533] ata8.00: irq_stat 0x00000040, connection status changed
[  216.448803] ata8: SError: { PHYRdyChg CommWake DevExch }
[  216.454117] ata8.00: failed command: READ FPDMA QUEUED
[  216.459257] ata8.00: cmd 60/10:38:70:64:0d/02:00:01:00:00/40 tag 7 ncq dma 270336 in
                        res 50/00:58:18:64:0d/00:00:01:00:00/40 Emask 0x10 (ATA bus error)
[  216.474968] ata8.00: status: { DRDY }
[  216.478648] ata8: hard resetting link
[  217.190915] ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  217.202859] ata8.00: configured for UDMA/133
[  217.204664] sd 7:0:0:0: [sdf] tag#7 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  217.204665] sd 7:0:0:0: [sdf] tag#7 Sense Key : Illegal Request [current] 
[  217.204666] sd 7:0:0:0: [sdf] tag#7 Add. Sense: Unaligned write command
[  217.204668] sd 7:0:0:0: [sdf] tag#7 CDB: Read(10) 28 00 01 0d 64 70 00 02 10 00
[  217.204669] print_req_error: I/O error, dev sdf, sector 17654896 flags 700
[  217.211553] zio pool=dpool vdev=/dev/mapper/slot6crypt error=5 type=1 offset=9035112448 size=270336 flags=40080cb0
[  217.211575] ata8: EH complete
[  226.726877] ata8.00: exception Emask 0x10 SAct 0x200000 SErr 0x4050000 action 0xe frozen
[  226.734988] ata8.00: irq_stat 0x00000040, connection status changed
[  226.741254] ata8: SError: { PHYRdyChg CommWake DevExch }
[  226.746565] ata8.00: failed command: READ FPDMA QUEUED
[  226.751706] ata8.00: cmd 60/58:a8:a8:ee:2e/00:00:01:00:00/40 tag 21 ncq dma 45056 in
                        res 50/00:a8:00:ec:2e/00:02:01:00:00/40 Emask 0x10 (ATA bus error)
[  226.767420] ata8.00: status: { DRDY }
[  226.771086] ata8: hard resetting link
[  227.483183] ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  227.499067] ata8.00: configured for UDMA/133
[  227.499206] sd 7:0:0:0: [sdf] tag#21 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  227.499207] sd 7:0:0:0: [sdf] tag#21 Sense Key : Illegal Request [current] 
[  227.499208] sd 7:0:0:0: [sdf] tag#21 Add. Sense: Unaligned write command
[  227.499210] sd 7:0:0:0: [sdf] tag#21 CDB: Read(10) 28 00 01 2e ee a8 00 00 58 00
[  227.499211] print_req_error: I/O error, dev sdf, sector 19852968 flags 700
[  227.506098] zio pool=dpool vdev=/dev/mapper/slot6crypt error=5 type=1 offset=10160525312 size=45056 flags=1808b0
[  227.506105] ata8: EH complete
[  230.718976] ata5.00: exception Emask 0x10 SAct 0x1000000 SErr 0x4050000 action 0xe frozen
[  230.727173] ata5.00: irq_stat 0x00000040, connection status changed
[  230.733451] ata5: SError: { PHYRdyChg CommWake DevExch }
[  230.738768] ata5.00: failed command: READ FPDMA QUEUED
[  230.743908] ata5.00: cmd 60/78:c0:c8:0b:3c/04:00:01:00:00/40 tag 24 ncq dma 585728 in
                        res 50/00:58:70:0b:3c/00:00:01:00:00/40 Emask 0x10 (ATA bus error)
[  230.759713] ata5.00: status: { DRDY }
[  230.763382] ata5: hard resetting link
[  231.475281] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  231.489245] ata5.00: configured for UDMA/133
[  231.489393] sd 4:0:0:0: [sdc] tag#24 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  231.489394] sd 4:0:0:0: [sdc] tag#24 Sense Key : Illegal Request [current] 
[  231.489395] sd 4:0:0:0: [sdc] tag#24 Add. Sense: Unaligned write command
[  231.489396] sd 4:0:0:0: [sdc] tag#24 CDB: Read(10) 28 00 01 3c 0b c8 00 04 78 00
[  231.489397] print_req_error: I/O error, dev sdc, sector 20712392 flags 700
[  231.496280] zio pool=dpool vdev=/dev/mapper/slot3crypt error=5 type=1 offset=10600550400 size=585728 flags=40080cb0
[  231.496287] ata5: EH complete
[  251.575584] ata8.00: exception Emask 0x10 SAct 0x180 SErr 0x4050000 action 0xe frozen
[  251.583417] ata8.00: irq_stat 0x00000040, connection status changed
[  251.589690] ata8: SError: { PHYRdyChg CommWake DevExch }
[  251.595005] ata8.00: failed command: READ FPDMA QUEUED
[  251.600148] ata8.00: cmd 60/50:38:60:8a:96/00:00:01:00:00/40 tag 7 ncq dma 40960 in
                        res 50/00:08:50:88:96/00:02:01:00:00/40 Emask 0x10 (ATA bus error)
[  251.615773] ata8.00: status: { DRDY }
[  251.619440] ata8.00: failed command: READ FPDMA QUEUED
[  251.624580] ata8.00: cmd 60/50:40:b8:8a:96/00:00:01:00:00/40 tag 8 ncq dma 40960 in
                        res 50/00:08:50:88:96/00:02:01:00:00/40 Emask 0x10 (ATA bus error)
[  251.640206] ata8.00: status: { DRDY }
[  251.643872] ata8: hard resetting link
[  252.355771] ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  252.368058] ata8.00: configured for UDMA/133
[  252.368195] sd 7:0:0:0: [sdf] tag#7 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  252.368196] sd 7:0:0:0: [sdf] tag#7 Sense Key : Illegal Request [current] 
[  252.368198] sd 7:0:0:0: [sdf] tag#7 Add. Sense: Unaligned write command
[  252.368199] sd 7:0:0:0: [sdf] tag#7 CDB: Read(10) 28 00 01 96 8a 60 00 00 50 00
[  252.368200] print_req_error: I/O error, dev sdf, sector 26643040 flags 700
[  252.375087] zio pool=dpool vdev=/dev/mapper/slot6crypt error=5 type=1 offset=13637042176 size=40960 flags=1808b0
[  252.375109] sd 7:0:0:0: [sdf] tag#8 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  252.375110] sd 7:0:0:0: [sdf] tag#8 Sense Key : Illegal Request [current] 
[  252.375111] sd 7:0:0:0: [sdf] tag#8 Add. Sense: Unaligned write command
[  252.375112] sd 7:0:0:0: [sdf] tag#8 CDB: Read(10) 28 00 01 96 8a b8 00 00 50 00
[  252.375113] print_req_error: I/O error, dev sdf, sector 26643128 flags 700
[  252.381989] zio pool=dpool vdev=/dev/mapper/slot6crypt error=5 type=1 offset=13637087232 size=40960 flags=1808b0
[  252.381991] ata8: EH complete
[  258.843715] ata8.00: exception Emask 0x10 SAct 0x80010 SErr 0x4050000 action 0xe frozen
[  258.851730] ata8.00: irq_stat 0x00000040, connection status changed
[  258.858044] ata8: SError: { PHYRdyChg CommWake DevExch }
[  258.863366] ata8.00: failed command: READ FPDMA QUEUED
[  258.868506] ata8.00: cmd 60/10:20:48:0c:ac/02:00:01:00:00/40 tag 4 ncq dma 270336 in
                        res 50/00:58:f0:0b:ac/00:00:01:00:00/40 Emask 0x10 (ATA bus error)
[  258.884237] ata8.00: status: { DRDY }
[  258.887900] ata8.00: failed command: READ FPDMA QUEUED
[  258.893043] ata8.00: cmd 60/58:98:58:0e:ac/00:00:01:00:00/40 tag 19 ncq dma 45056 in
                        res 50/00:58:f0:0b:ac/00:00:01:00:00/40 Emask 0x10 (ATA bus error)
[  258.908756] ata8.00: status: { DRDY }
[  258.912423] ata8: hard resetting link
[  259.623929] ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  259.675597] ata8.00: configured for UDMA/133
[  259.675772] sd 7:0:0:0: [sdf] tag#4 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  259.675774] sd 7:0:0:0: [sdf] tag#4 Sense Key : Illegal Request [current] 
[  259.675775] sd 7:0:0:0: [sdf] tag#4 Add. Sense: Unaligned write command
[  259.675776] sd 7:0:0:0: [sdf] tag#4 CDB: Read(10) 28 00 01 ac 0c 48 00 02 10 00
[  259.675777] print_req_error: I/O error, dev sdf, sector 28052552 flags 700
[  259.682665] zio pool=dpool vdev=/dev/mapper/slot6crypt error=5 type=1 offset=14358712320 size=270336 flags=40080cb0
[  259.682688] sd 7:0:0:0: [sdf] tag#19 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  259.682689] sd 7:0:0:0: [sdf] tag#19 Sense Key : Illegal Request [current] 
[  259.682690] sd 7:0:0:0: [sdf] tag#19 Add. Sense: Unaligned write command
[  259.682691] sd 7:0:0:0: [sdf] tag#19 CDB: Read(10) 28 00 01 ac 0e 58 00 00 58 00
[  259.682691] print_req_error: I/O error, dev sdf, sector 28053080 flags 700
[  259.689566] zio pool=dpool vdev=/dev/mapper/slot6crypt error=5 type=1 offset=14358982656 size=45056 flags=1808b0
[  259.689569] ata8: EH complete
[  277.468146] ata8: limiting SATA link speed to 1.5 Gbps
[  277.468148] ata8.00: exception Emask 0x10 SAct 0x9000 SErr 0x4050000 action 0xe frozen
[  277.476093] ata8.00: irq_stat 0x00000040, connection status changed
[  277.482372] ata8: SError: { PHYRdyChg CommWake DevExch }
[  277.487691] ata8.00: failed command: READ FPDMA QUEUED
[  277.492835] ata8.00: cmd 60/58:60:00:cb:31/00:00:03:00:00/40 tag 12 ncq dma 45056 in
                        res 50/00:68:98:c8:31/00:02:03:00:00/40 Emask 0x10 (ATA bus error)
[  277.508555] ata8.00: status: { DRDY }
[  277.512229] ata8.00: failed command: READ FPDMA QUEUED
[  277.517366] ata8.00: cmd 60/58:78:58:cb:31/00:00:03:00:00/40 tag 15 ncq dma 45056 in
                        res 50/00:68:98:c8:31/00:02:03:00:00/40 Emask 0x10 (ATA bus error)
[  277.533080] ata8.00: status: { DRDY }
[  277.536747] ata8: hard resetting link
[  278.248328] ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[  278.272289] ata8.00: configured for UDMA/133
[  278.272459] sd 7:0:0:0: [sdf] tag#12 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  278.272460] sd 7:0:0:0: [sdf] tag#12 Sense Key : Illegal Request [current] 
[  278.272461] sd 7:0:0:0: [sdf] tag#12 Add. Sense: Unaligned write command
[  278.272462] sd 7:0:0:0: [sdf] tag#12 CDB: Read(10) 28 00 03 31 cb 00 00 00 58 00
[  278.272463] print_req_error: I/O error, dev sdf, sector 53594880 flags 700
[  278.279348] zio pool=dpool vdev=/dev/mapper/slot6crypt error=5 type=1 offset=27436384256 size=45056 flags=1808b0
[  278.279370] sd 7:0:0:0: [sdf] tag#15 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  278.279371] sd 7:0:0:0: [sdf] tag#15 Sense Key : Illegal Request [current] 
[  278.279372] sd 7:0:0:0: [sdf] tag#15 Add. Sense: Unaligned write command
[  278.279373] sd 7:0:0:0: [sdf] tag#15 CDB: Read(10) 28 00 03 31 cb 58 00 00 58 00
[  278.279374] print_req_error: I/O error, dev sdf, sector 53594968 flags 700
[  278.286251] zio pool=dpool vdev=/dev/mapper/slot6crypt error=5 type=1 offset=27436429312 size=45056 flags=1808b0
[  278.286254] ata8: EH complete
[  363.007111] ata7.00: exception Emask 0x10 SAct 0x80000008 SErr 0x4050000 action 0xe frozen
[  363.015384] ata7.00: irq_stat 0x00000040, connection status changed
[  363.021655] ata7: SError: { PHYRdyChg CommWake DevExch }
[  363.026980] ata7.00: failed command: READ FPDMA QUEUED
[  363.032139] ata7.00: cmd 60/58:18:08:8a:03/00:00:05:00:00/40 tag 3 ncq dma 45056 in
                        res 50/00:58:58:89:03/00:00:05:00:00/40 Emask 0x10 (ATA bus error)
[  363.047771] ata7.00: status: { DRDY }
[  363.051436] ata7.00: failed command: READ FPDMA QUEUED
[  363.056577] ata7.00: cmd 60/58:f8:b0:89:03/00:00:05:00:00/40 tag 31 ncq dma 45056 in
                        res 50/00:58:58:89:03/00:00:05:00:00/40 Emask 0x10 (ATA bus error)
[  363.072289] ata7.00: status: { DRDY }
[  363.075955] ata7: hard resetting link
[  363.787376] ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  363.801367] ata7.00: configured for UDMA/133
[  363.803215] sd 6:0:0:0: [sde] tag#3 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  363.803217] sd 6:0:0:0: [sde] tag#3 Sense Key : Illegal Request [current] 
[  363.803218] sd 6:0:0:0: [sde] tag#3 Add. Sense: Unaligned write command
[  363.803219] sd 6:0:0:0: [sde] tag#3 CDB: Read(10) 28 00 05 03 8a 08 00 00 58 00
[  363.803221] print_req_error: I/O error, dev sde, sector 84118024 flags 700
[  363.810101] zio pool=dpool vdev=/dev/mapper/slot5crypt error=5 type=1 offset=43064233984 size=45056 flags=1808b0
[  363.810124] sd 6:0:0:0: [sde] tag#31 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  363.810125] sd 6:0:0:0: [sde] tag#31 Sense Key : Illegal Request [current] 
[  363.810126] sd 6:0:0:0: [sde] tag#31 Add. Sense: Unaligned write command
[  363.810127] sd 6:0:0:0: [sde] tag#31 CDB: Read(10) 28 00 05 03 89 b0 00 00 58 00
[  363.810128] print_req_error: I/O error, dev sde, sector 84117936 flags 700
[  363.817005] zio pool=dpool vdev=/dev/mapper/slot5crypt error=5 type=1 offset=43064188928 size=45056 flags=1808b0
[  363.817008] ata7: EH complete
[  376.867685] ata5.00: exception Emask 0x10 SAct 0x10 SErr 0x4050000 action 0xe frozen
[  376.875448] ata5.00: irq_stat 0x00000040, connection status changed
[  376.881722] ata5: SError: { PHYRdyChg CommWake DevExch }
[  376.887060] ata5.00: failed command: READ FPDMA QUEUED
[  376.892203] ata5.00: cmd 60/20:20:18:26:37/04:00:05:00:00/40 tag 4 ncq dma 540672 in
                        res 50/00:58:c0:25:37/00:00:05:00:00/40 Emask 0x10 (ATA bus error)
[  376.907912] ata5.00: status: { DRDY }
[  376.911581] ata5: hard resetting link
[  377.623975] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  377.640004] ata5.00: configured for UDMA/133
[  377.640167] sd 4:0:0:0: [sdc] tag#4 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  377.640168] sd 4:0:0:0: [sdc] tag#4 Sense Key : Illegal Request [current] 
[  377.640182] sd 4:0:0:0: [sdc] tag#4 Add. Sense: Unaligned write command
[  377.640183] sd 4:0:0:0: [sdc] tag#4 CDB: Read(10) 28 00 05 37 26 18 00 04 20 00
[  377.640184] print_req_error: I/O error, dev sdc, sector 87500312 flags 700
[  377.647072] zio pool=dpool vdev=/dev/mapper/slot3crypt error=5 type=1 offset=44795965440 size=540672 flags=40080cb0
[  377.647094] ata5: EH complete
[  385.600157] ata7: limiting SATA link speed to 3.0 Gbps
[  385.600159] ata7.00: exception Emask 0x10 SAct 0x600000 SErr 0x4050000 action 0xe frozen
[  385.608254] ata7.00: irq_stat 0x00000040, connection status changed
[  385.614532] ata7: SError: { PHYRdyChg CommWake DevExch }
[  385.619844] ata7.00: failed command: READ FPDMA QUEUED
[  385.624985] ata7.00: cmd 60/58:a8:d8:91:55/00:00:05:00:00/40 tag 21 ncq dma 45056 in
                        res 50/00:08:d0:90:55/00:01:05:00:00/40 Emask 0x10 (ATA bus error)
[  385.640696] ata7.00: status: { DRDY }
[  385.644362] ata7.00: failed command: READ FPDMA QUEUED
[  385.649505] ata7.00: cmd 60/58:b0:30:92:55/00:00:05:00:00/40 tag 22 ncq dma 45056 in
                        res 50/00:08:d0:90:55/00:01:05:00:00/40 Emask 0x10 (ATA bus error)
[  385.665216] ata7.00: status: { DRDY }
[  385.668884] ata7: hard resetting link
[  386.380318] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  386.400326] ata7.00: configured for UDMA/133
[  386.402134] sd 6:0:0:0: [sde] tag#21 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  386.402135] sd 6:0:0:0: [sde] tag#21 Sense Key : Illegal Request [current] 
[  386.402136] sd 6:0:0:0: [sde] tag#21 Add. Sense: Unaligned write command
[  386.402138] sd 6:0:0:0: [sde] tag#21 CDB: Read(10) 28 00 05 55 91 d8 00 00 58 00
[  386.402139] print_req_error: I/O error, dev sde, sector 89493976 flags 700
[  386.409028] zio pool=dpool vdev=/dev/mapper/slot5crypt error=5 type=1 offset=45816721408 size=45056 flags=1808b0
[  386.409050] sd 6:0:0:0: [sde] tag#22 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  386.409051] sd 6:0:0:0: [sde] tag#22 Sense Key : Illegal Request [current] 
[  386.409052] sd 6:0:0:0: [sde] tag#22 Add. Sense: Unaligned write command
[  386.409053] sd 6:0:0:0: [sde] tag#22 CDB: Read(10) 28 00 05 55 92 30 00 00 58 00
[  386.409053] print_req_error: I/O error, dev sde, sector 89494064 flags 700
[  386.415929] zio pool=dpool vdev=/dev/mapper/slot5crypt error=5 type=1 offset=45816766464 size=45056 flags=1808b0
[  386.415931] ata7: EH complete
[  395.888379] ata7.00: exception Emask 0x10 SAct 0x80000000 SErr 0x4050000 action 0xe frozen
[  395.896668] ata7.00: irq_stat 0x00000040, connection status changed
[  395.902945] ata7: SError: { PHYRdyChg CommWake DevExch }
[  395.908288] ata7.00: failed command: READ FPDMA QUEUED
[  395.913442] ata7.00: cmd 60/58:f8:50:f9:79/00:00:05:00:00/40 tag 31 ncq dma 45056 in
                        res 50/00:20:30:f5:79/00:04:05:00:00/40 Emask 0x10 (ATA bus error)
[  395.929154] ata7.00: status: { DRDY }
[  395.932820] ata7: hard resetting link
[  396.644713] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  396.660645] ata7.00: configured for UDMA/133
[  396.662498] sd 6:0:0:0: [sde] tag#31 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  396.662500] sd 6:0:0:0: [sde] tag#31 Sense Key : Illegal Request [current] 
[  396.662501] sd 6:0:0:0: [sde] tag#31 Add. Sense: Unaligned write command
[  396.662502] sd 6:0:0:0: [sde] tag#31 CDB: Read(10) 28 00 05 79 f9 50 00 00 58 00
[  396.662503] print_req_error: I/O error, dev sde, sector 91879760 flags 700
[  396.669386] zio pool=dpool vdev=/dev/mapper/slot5crypt error=5 type=1 offset=47038242816 size=45056 flags=1808b0
[  396.669393] ata7: EH complete
[  411.937054] ata7.00: exception Emask 0x10 SAct 0x80000 SErr 0x4050000 action 0xe frozen
[  411.945076] ata7.00: irq_stat 0x00000040, connection status changed
[  411.951367] ata7: SError: { PHYRdyChg CommWake DevExch }
[  411.956687] ata7.00: failed command: READ FPDMA QUEUED
[  411.961825] ata7.00: cmd 60/58:98:f8:7b:b5/00:00:05:00:00/40 tag 19 ncq dma 45056 in
                        res 50/00:10:e8:79:b5/00:02:05:00:00/40 Emask 0x10 (ATA bus error)
[  411.977555] ata7.00: status: { DRDY }
[  411.981223] ata7: hard resetting link
[  412.696988] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  412.705369] ata7.00: configured for UDMA/133
[  412.705537] sd 6:0:0:0: [sde] tag#19 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  412.705539] sd 6:0:0:0: [sde] tag#19 Sense Key : Illegal Request [current] 
[  412.705540] sd 6:0:0:0: [sde] tag#19 Add. Sense: Unaligned write command
[  412.705541] sd 6:0:0:0: [sde] tag#19 CDB: Read(10) 28 00 05 b5 7b f8 00 00 58 00
[  412.705542] print_req_error: I/O error, dev sde, sector 95779832 flags 700
[  412.712437] zio pool=dpool vdev=/dev/mapper/slot5crypt error=5 type=1 offset=49035079680 size=45056 flags=1808b0
[  412.712443] ata7: EH complete
[  481.999222] ata5.00: exception Emask 0x10 SAct 0x2000000 SErr 0x4040000 action 0xe frozen
[  482.007400] ata5.00: irq_stat 0x00000040, connection status changed
[  482.013671] ata5: SError: { CommWake DevExch }
[  482.018117] ata5.00: failed command: READ FPDMA QUEUED
[  482.023257] ata5.00: cmd 60/d0:c8:30:5b:4c/04:00:07:00:00/40 tag 25 ncq dma 630784 in
                        res 50/00:58:d8:5a:4c/00:00:07:00:00/40 Emask 0x10 (ATA bus error)
[  482.039054] ata5.00: status: { DRDY }
[  482.042722] ata5: hard resetting link
[  482.755401] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  482.769473] ata5.00: configured for UDMA/133
[  482.771233] sd 4:0:0:0: [sdc] tag#25 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  482.771234] sd 4:0:0:0: [sdc] tag#25 Sense Key : Illegal Request [current] 
[  482.771235] sd 4:0:0:0: [sdc] tag#25 Add. Sense: Unaligned write command
[  482.771236] sd 4:0:0:0: [sdc] tag#25 CDB: Read(10) 28 00 07 4c 5b 30 00 04 d0 00
[  482.771238] print_req_error: I/O error, dev sdc, sector 122444592 flags 700
[  482.778209] zio pool=dpool vdev=/dev/mapper/slot3crypt error=5 type=1 offset=62687436800 size=630784 flags=40080cb0
[  482.778216] ata5: EH complete
[  498.571587] ata5: limiting SATA link speed to 3.0 Gbps
[  498.571589] ata5.00: exception Emask 0x10 SAct 0x20 SErr 0x4050000 action 0xe frozen
[  498.579338] ata5.00: irq_stat 0x00000040, connection status changed
[  498.585625] ata5: SError: { PHYRdyChg CommWake DevExch }
[  498.590935] ata5.00: failed command: READ FPDMA QUEUED
[  498.596075] ata5.00: cmd 60/28:28:80:80:8a/05:00:07:00:00/40 tag 5 ncq dma 675840 in
                        res 50/00:58:28:80:8a/00:00:07:00:00/40 Emask 0x10 (ATA bus error)
[  498.611787] ata5.00: status: { DRDY }
[  498.615454] ata5: hard resetting link
[  499.327851] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  499.343276] ata5.00: configured for UDMA/133
[  499.343443] sd 4:0:0:0: [sdc] tag#5 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  499.343444] sd 4:0:0:0: [sdc] tag#5 Sense Key : Illegal Request [current] 
[  499.343445] sd 4:0:0:0: [sdc] tag#5 Add. Sense: Unaligned write command
[  499.343446] sd 4:0:0:0: [sdc] tag#5 CDB: Read(10) 28 00 07 8a 80 80 00 05 28 00
[  499.343448] print_req_error: I/O error, dev sdc, sector 126517376 flags 700
[  499.350438] zio pool=dpool vdev=/dev/mapper/slot3crypt error=5 type=1 offset=64772702208 size=675840 flags=40080cb0
[  499.350460] ata5: EH complete
[  505.699719] ata5.00: exception Emask 0x10 SAct 0x90000000 SErr 0x4050000 action 0xe frozen
[  505.707993] ata5.00: irq_stat 0x00000040, connection status changed
[  505.714259] ata5: SError: { PHYRdyChg CommWake DevExch }
[  505.719569] ata5.00: failed command: READ FPDMA QUEUED
[  505.724713] ata5.00: cmd 60/40:e0:50:2f:a2/05:00:07:00:00/40 tag 28 ncq dma 688128 in
                        res 50/00:58:f8:2e:a2/00:00:07:00:00/40 Emask 0x10 (ATA bus error)
[  505.740518] ata5.00: status: { DRDY }
[  505.744183] ata5.00: failed command: READ FPDMA QUEUED
[  505.749326] ata5.00: cmd 60/48:f8:90:34:a2/01:00:07:00:00/40 tag 31 ncq dma 167936 in
                        res 50/00:58:f8:2e:a2/00:00:07:00:00/40 Emask 0x10 (ATA bus error)
[  505.765126] ata5.00: status: { DRDY }
[  505.768793] ata5: hard resetting link
[  506.480042] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  506.494089] ata5.00: configured for UDMA/133
[  506.494268] sd 4:0:0:0: [sdc] tag#28 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  506.494269] sd 4:0:0:0: [sdc] tag#28 Sense Key : Illegal Request [current] 
[  506.494270] sd 4:0:0:0: [sdc] tag#28 Add. Sense: Unaligned write command
[  506.494272] sd 4:0:0:0: [sdc] tag#28 CDB: Read(10) 28 00 07 a2 2f 50 00 05 40 00
[  506.494273] print_req_error: I/O error, dev sdc, sector 128069456 flags 4700
[  506.501348] sd 4:0:0:0: [sdc] tag#31 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  506.501348] sd 4:0:0:0: [sdc] tag#31 Sense Key : Illegal Request [current] 
[  506.501349] sd 4:0:0:0: [sdc] tag#31 Add. Sense: Unaligned write command
[  506.501365] sd 4:0:0:0: [sdc] tag#31 CDB: Read(10) 28 00 07 a2 34 90 00 01 48 00
[  506.501365] print_req_error: I/O error, dev sdc, sector 128070800 flags 700
[  506.508331] zio pool=dpool vdev=/dev/mapper/slot3crypt error=5 type=1 offset=65567367168 size=856064 flags=40080cb0
[  506.508334] ata5: EH complete
[  521.132244] ata5.00: exception Emask 0x10 SAct 0x30 SErr 0x4050000 action 0xe frozen
[  521.140004] ata5.00: irq_stat 0x00000040, connection status changed
[  521.146280] ata5: SError: { PHYRdyChg CommWake DevExch }
[  521.151598] ata5.00: failed command: READ FPDMA QUEUED
[  521.156738] ata5.00: cmd 60/58:20:98:44:1b/00:00:08:00:00/40 tag 4 ncq dma 45056 in
                        res 50/00:b8:e0:42:1b/00:01:08:00:00/40 Emask 0x10 (ATA bus error)
[  521.172366] ata5.00: status: { DRDY }
[  521.176027] ata5.00: failed command: READ FPDMA QUEUED
[  521.181170] ata5.00: cmd 60/58:28:f0:44:1b/00:00:08:00:00/40 tag 5 ncq dma 45056 in
                        res 50/00:b8:e0:42:1b/00:01:08:00:00/40 Emask 0x10 (ATA bus error)
[  521.196795] ata5.00: status: { DRDY }
[  521.200465] ata5: hard resetting link
[  521.912451] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  521.926523] ata5.00: configured for UDMA/133
[  521.926688] sd 4:0:0:0: [sdc] tag#4 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  521.926690] sd 4:0:0:0: [sdc] tag#4 Sense Key : Illegal Request [current] 
[  521.926691] sd 4:0:0:0: [sdc] tag#4 Add. Sense: Unaligned write command
[  521.926692] sd 4:0:0:0: [sdc] tag#4 CDB: Read(10) 28 00 08 1b 44 98 00 00 58 00
[  521.926693] print_req_error: I/O error, dev sdc, sector 136004760 flags 700
[  521.933674] zio pool=dpool vdev=/dev/mapper/slot3crypt error=5 type=1 offset=69630242816 size=45056 flags=1808b0
[  521.933696] sd 4:0:0:0: [sdc] tag#5 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  521.933697] sd 4:0:0:0: [sdc] tag#5 Sense Key : Illegal Request [current] 
[  521.933698] sd 4:0:0:0: [sdc] tag#5 Add. Sense: Unaligned write command
[  521.933699] sd 4:0:0:0: [sdc] tag#5 CDB: Read(10) 28 00 08 1b 44 f0 00 00 58 00
[  521.933699] print_req_error: I/O error, dev sdc, sector 136004848 flags 700
[  521.940658] zio pool=dpool vdev=/dev/mapper/slot3crypt error=5 type=1 offset=69630287872 size=45056 flags=1808b0
[  521.940661] ata5: EH complete
[  529.968429] ata5.00: exception Emask 0x10 SAct 0x28020 SErr 0x4050000 action 0xe frozen
[  529.976441] ata5.00: irq_stat 0x00000040, connection status changed
[  529.982713] ata5: SError: { PHYRdyChg CommWake DevExch }
[  529.988045] ata5.00: failed command: READ FPDMA QUEUED
[  529.993187] ata5.00: cmd 60/58:28:68:11:3b/00:00:08:00:00/40 tag 5 ncq dma 45056 in
                        res 50/00:58:c8:0b:3b/00:00:08:00:00/40 Emask 0x10 (ATA bus error)
[  530.008812] ata5.00: status: { DRDY }
[  530.012477] ata5.00: failed command: READ FPDMA QUEUED
[  530.017618] ata5.00: cmd 60/08:78:20:0c:3b/01:00:08:00:00/40 tag 15 ncq dma 135168 in
                        res 50/00:58:c8:0b:3b/00:00:08:00:00/40 Emask 0x10 (ATA bus error)
[  530.033415] ata5.00: status: { DRDY }
[  530.037082] ata5.00: failed command: READ FPDMA QUEUED
[  530.042225] ata5.00: cmd 60/c8:88:a0:0d:3b/03:00:08:00:00/40 tag 17 ncq dma 495616 in
                        res 50/00:58:c8:0b:3b/00:00:08:00:00/40 Emask 0x10 (ATA bus error)
[  530.058024] ata5.00: status: { DRDY }
[  530.061691] ata5: hard resetting link
[  530.772687] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[  530.786899] ata5.00: configured for UDMA/133
[  530.787038] sd 4:0:0:0: [sdc] tag#5 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  530.787039] sd 4:0:0:0: [sdc] tag#5 Sense Key : Illegal Request [current] 
[  530.787040] sd 4:0:0:0: [sdc] tag#5 Add. Sense: Unaligned write command
[  530.787042] sd 4:0:0:0: [sdc] tag#5 CDB: Read(10) 28 00 08 3b 11 68 00 00 58 00
[  530.787043] print_req_error: I/O error, dev sdc, sector 138088808 flags 700
[  530.794014] zio pool=dpool vdev=/dev/mapper/slot3crypt error=5 type=1 offset=70697275392 size=45056 flags=1808b0
[  530.794037] sd 4:0:0:0: [sdc] tag#15 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  530.794038] sd 4:0:0:0: [sdc] tag#15 Sense Key : Illegal Request [current] 
[  530.794039] sd 4:0:0:0: [sdc] tag#15 Add. Sense: Unaligned write command

The errors appear randomly for the four disks, but only one is really failing badly withing zfs:

root@debian:[~]# zpool status
  pool: dpool
 state: DEGRADED
  scan: resilvered 19,9G in 0 days 00:32:14 with 0 errors on Sun Apr 26 19:21:30 2020
config:

    NAME                       STATE     READ WRITE CKSUM
    dpool                      DEGRADED     0     0     0
      raidz1-0                 DEGRADED     0     0     0
        slot3crypt             ONLINE       0     0     0
        replacing-1            UNAVAIL      0     0     0  insufficient replicas
          1293895313619506864  UNAVAIL      0     0     0  was /dev/mapper/slot4crypt/old
          slot4crypt           FAULTED      0    26     0  too many errors
        slot5crypt             ONLINE       0     0     0
        slot6crypt             ONLINE       0     0     0
    logs    
      mirror-1                 ONLINE       0     0     0
        slot1p2crypt2          ONLINE       0     0     0
        slot2p2crypt2          ONLINE       0     0     0

errors: No known data errors

Without ZFS the disk sdd (slot4crypt) is working without any issues. S.M.A.R.T not showing any issues at all.

c0d3z3r0 commented 4 years ago

uhm. interesting. https://github.com/openzfs/zfs/issues/4873#issuecomment-269074579 seems to work for me, too. but WHY?!

c0d3z3r0 commented 4 years ago

Finally, this turned out to be a system firmware issue. See https://review.coreboot.org/c/coreboot/+/40877

You can try this:

echo maximum_performance | sudo tee /sys/class/scsi_host/host*/link_power_management_policy

If that solves the problem, it's pretty likely that you have the same problem. Try to get a BIOS/UEFI update from your vendor.

dd1dd1 commented 4 years ago

That's interesting. My "unaligned write" errors have gone away after I moved the disks from an oldish socket 1366 mobo and a brand new socket 1151 mobo. It could have been the old bios on the old mobo not initializing the SSD SATA link correctly... K.O.

flo82 commented 4 years ago

had the same problems with brand new 14TB drives on a skylake motherboard. Setting GRUB_CMD_LINUX="libata.force=3.0" fixed this issue for me. This forces the sata controller running at 3 GBIT/s - for a rusty drive still enough.

I have not tested the maximum_performance switch for the ports. Waiting for a firmware/bios update... let's see

t-np commented 4 years ago

i have the same issues on a AMD x399 Threadripper board with 2 16TB Seagate exos Sata drives. I get mostly errors on one of them, i checked them recently with seagate windows tool and both seem fine. Currently i have the HDD Enclosure+Cabling in suspicion that they can not handle the full SATA 6 Speeds in a sustained manner, i'll rework the case this weekend and move around disks and see if it helps

link_power_management_policy was already on maximum_performance

bjquinn commented 3 years ago

I'm getting this too on four separate machines now. Swapping drives, cables, drive bay, etc. didn't work. But putting all the drives on an HBA solved it. Supermicro H11DSi motherboard on all 4 machines, seems only to happen when using SATA controller on motherboard. BIOS already upgraded to latest. Using zfs via latest Proxmox 6.2 (zfs root/boot). A scrub can trigger it every time, though some of the machines can go a while without issues if they're not stressed. I've reached out to SuperMicro, but they don't have anything helpful yet.

I'm primarily seeing this on HGST 10TB SATA drives, though I did see it a few times on an Intel SSD.

t-np commented 3 years ago

Swapping drives, cables, drive bay, etc. didn't work.

i switched cabling and used a new sata drive cage, so far my problems are gone

bjquinn commented 3 years ago

Swapping drives, cables, drive bay, etc. didn't work.

i switched cabling and used a new sata drive cage, so far my problems are gone

Yeah, that's what's weird about this one. I've resolved similar problems that way in the past, but it's not resolving it this time, on multiple different systems.

cluck commented 3 years ago

Can confirm too that this happens on at least three similar hosts all with an ASRock TRX40D8-2N2T Mainboard (SATAIII via TRX14, and/or via ASM 1061) and varying hard disks and SATA cables. Can't see the same happening with FreeNAS/TrueNAS running on the same setup, and neither when the setup is altered so that the disks are attached over an add-in LSI HBA card instead of using the Mainboard connectors.

Also confirming that the failure is reproducibly happening only after having sent many gigabytes of data over the SATA wires. The failures seems to suddenly affecting multipe ata links concurrently. My guess at this point is a kernel bug with some HBAs.

To reproduce: boot a Debian Buster (or Proxmox) from an additional disk, and build a raidz2 over four disks attached to the TRX40 or the ASM 1061, then concurrently run bonnie++ -d /path/to/zfs -u root -r $MB_RAM and fio --name=randrw --rw=randrw --direct=1 --ioengine=libaio --bs=16k --numjobs=1600 --rwmixread=10 --size=1M --runtime=600 --group_reporting and what it fail after ~15-20min.

Found with Proxmox kernel 5.4.73-1-pve, and still present in 5.4.114-1-pve. zfs-2.0.3-1~bpo10+1 zfs-kmod-2.0.4-pve1

bjquinn commented 3 years ago

I can reproduce with a simple zpool scrub. It's true that it only happens after many GBs of data transfer, but I can trigger it every time.

I was also able to eliminate the problem with an add in LSI 9300-4I HBA, but of course that's not a true fix. However, I had to get our servers stable, so I bought HBAs for all of them, and I haven't seen a single problem since.

Edit: I also had to update the HBA firmware to the latest in order to avoid hard reboots under the same type of load. I would assume that's unrelated (it was NOT the same "unaligned write error" in dmesg, it was a hard freeze/reboot), but I thought I should mention it. Anyway, an HBA with fully updated firmware does seem to have eliminated the problem for me across the several servers I was having this issue with, as a workaround.

cluck commented 3 years ago

Update: Kernel 5.10¹ from Debian Buster Backports, and Kernel 5.11 from Proxmox pvetest repo are stable on the TRX14 SATA links. They both continue to fail on the ASMedia ASM1062²; but less spectacularly so in dmesg.

¹ Debian Backports 5.10.0-0.bpo.5-amd64 ² I previously wrote ASM 1061, as the Mainboard manual reports it, but lspci reports ASM1062 (rev 02).

t-np commented 3 years ago

my problems returned in a differen kind of way on my personal host where i passthrough pcie devices incl. gpu. this is the host where i originally had the problems with the two sata 16TB exos drives where the issues seem to went away after i replaced the disk cage and cabling. But it seems the problem just moved on to the nvme port. I had one Samsung 970EVO with 2TB in that port and did not use it yet and had that disk in a ZFS Pool. Almost every time i scrubbed that nvme i got following errors on random sectors

┌Nov 8 00:26:18 kernel: [306450.056343] blk_update_request: critical medium error, dev nvme0n1, sector 172517081 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0                             │
│Nov 8 00:26:18 kernel: [306450.056380] zio pool=zfs vdev=/dev/disk/by-id/nvme-eui.0025385b81b21500-part1 error=61 type=1 offset=88327696896 size=131072 flags=1808b0                          │
└Nov 8 00:26:18 zed: eid=54 class=io pool_guid=0xFD58158A8A98C87E vdev_path=/dev/disk/by-id/nvme-eui.0025385b81b21500-part1                                               │
─Nov 8 00:26:47 kernel: [306479.686640] zio pool=zfs vdev=/dev/disk/by-id/nvme-eui.0025385b81b21500-part1 error=61 type=1 offset=153830233600 size=131072 flags=40080cb0                         │
┌Nov 8 00:26:47 kernel: [306479.686602] blk_update_request: critical medium error, dev nvme0n1, sector 300451723 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0                             │
│Nov 8 00:26:47 kernel: [306479.686640] zio pool=zfs vdev=/dev/disk/by-id/nvme-eui.0025385b81b21500-part1 error=61 type=1 offset=153830233600 size=131072 flags=40080cb0

it got somehow worse and was not just limited to scrub problems. my win10 guest that passes through multiple pcie devices from the proxmox host could not initialize hardware when booting up and locked the host up solid when the windows circle rotates and windows sorts out hardware, sometimes it got a screen further and locked up on applying configuration, and even more rarely it got to desktop and then locked up and in very rare cases it locked up hours days or even a week later. i fiddled so long around and i got a few hints in the console that it seem to be a hardware initialization issue. Sometimes it wrote something into dmesg just before locking up. sometimes USB3 Adapters that were passed through but i also noticed the nvme device that was not passed through at all.

This Samsung was counting up SMART inegrity errors everytime i ran zfs scrub and im right now far over Data Integrity Errors: 31 and Error Information Log Entries: 48 but Windows Samsung Magician just tells everything is fine. Samsung did not want to RMA the device initially and i might be out of warranty now...

In the end this was the single issue that was plaguing me for the last few months where i needed to hope and gamble to get my guest up and running. I first thought it was related to the kernel version of proxomox and the zfs module as there were some known issues that locked up the host in the latest Kernel. After i removed the Samsung NVME 970 EVO 2TB everything is fine and runs without any issues on proxmox kernel 5.11.17-1~bpo10

stuckj commented 2 years ago

This started happening to me pretty regularly on a supermicro virtualization server (X10SDV-8C-TLN4F). I have 4 500GB crucial MX500's in a 2x2 mirrored pool with one hot spare. Every couple weeks I'm seeing the hot spare kick in. The drive that dies always dies with the error mentioned at the top of this issue (unaligned write). Soft resetting the host bus doesn't work. Unplugging the SATA cable doesn't work. You have to unplug power to the drive and re-plug in to get it back up. All 5 drives are brand new (I replaced the old SSDs when this problem cropped up thinking they were failing...guess they weren't). It's not always the same drive that this happens with (it's happened across 3 different ones).

The old drives were Samsung EVO 860's and had the same problem. So...not manufacturer specific. It's still running the original BIOS so I may try updating that to see if it makes a difference.

This smells like a kernel bug though rather than a ZFS bug given that the failure is at the ATA level. Unless ZFS can write something that can result in an unaligned write.

stuckj commented 2 years ago

I do have trim enabled on the pool. Perhaps it could be related to this: https://github.com/openzfs/zfs/issues/8552. Though these aren't Samsung drives. I can try the noncq setting on the next reboot when I can do some downtime.

stuckj commented 2 years ago

Also, you can disable NCQ without a reboot using this technique: https://unix.stackexchange.com/a/611496

stuckj commented 2 years ago

Disabling NCQ with the above technique (the technique that doesn't require rebooting) was not effective for me. I'm going to try disabling autotrim and just leave it to the monthly manual trim (which I believe should run in a couple of hours since the default is the first Sunday of each month).

stuckj commented 2 years ago

Nope. That didn't work either. :-P I'm starting to suspect failing hardware for me now too (or a kernel bug in a recent update). To my knowledge, the D-1541 Xeon's are a SoC so all the SATA controllers are actually in the CPU. Seems weird it would fail, but there must be some components on the MB. I'm moving them to the backplane to see if that resolves the issue. Only had them on the internal ports for speed since they're each individual SATA III ports vs the shared SATA II backplane.

livelace commented 2 years ago

@stuckj Hello I see your "fight" and want to share mine. Since I wrote this short note - I haven't had "unaligned write command" problem. Hope this can help somehow.

bjquinn commented 2 years ago

@livelace thanks for the writeup. In my case though the issue predates zstd, for whatever that's worth.

stuckj commented 2 years ago

Yep, thanks @livelace. Doesn't apply in my case either though since the current problem I'm having is with Crucial MX500's (5 of them, all brand new, 2x2 mirrored pairs and a spare). Also using lz4 rather than zstd for me.

stuckj commented 2 years ago

I do have a bit of a performance hit, but so far everything is working on the backplane whereas every day for the past week I was getting at least 1 drive failure. Prior to the switch yesterday I had 4 fail at once which required me to reboot everything.

I guess it's possible that I have failing SATA ports on the motherboard. But...I don't think it's likely. Also, this started happening after applying both a kernel update and a ZFS update. I had been holding off on going from the latest 0.8.4 and updated kernel due to wanting the same ZFS version for my znapzend container. I finally got around to making a znapzend container based on the proxmox repos though so finally made the ZFS and kernel jump before this started happening.

I'm not sure how best to narrow this down to a ZFS or kernel bug, but it seems like it's one of those.

stuckj commented 2 years ago

Huh. This is the same symptoms from years ago on the exact same MB I'm using, though it's with intel SSDs instead of crucial. In that case it was the drive firmware that was updated to fix it. Crucial doesn't appear to have new firmware for the MX500's I'm using though:

https://community.intel.com/t5/Solid-State-Drives/S3610-SSDs-have-failed-quot-READ-WRITE-FPDMA-QUEUED-quot-ATA/m-p/291172

Guess I'm just stuck with the backplane for the time being (or until I upgrade the machine?). Meh, it makes drive maintenance easier I suppose.

frasc commented 2 years ago

I had a very similar issue, that is failed writes with the dmesg message: [Mon May 31 10:15:51 2021] sd 5:0:0:0: [sdj] tag#1 Add. Sense: Unaligned write command

My Hardware configuration introduces something different from the other cases:

I was having the issue with 'scrub' on both pools.

The way I solved it, or at least hope to have solved it, is by backing up all data, recreating the pools and restoring the data. I haven't seen the 'Unaligned write command' for a few months now.

There is something that probably needs to be confirmed: The older pools (with 'Unaligned command') were created under Ubuntu with ZFS 1.x. The problem appeared after Ubuntu upgraded from ZFS 1.x to ZFS 2.0.1 The two new pools (so far without 'Unaligned command') have been created under Gentoo with ZFS 2.1 and ZFS 2.0.4.

I have no doubt that the ZFS team verifies that a pool survives a ZFS upgrade.

The question is whether there could be a corner case where strange issues happen. I mean, ZFS is a hugely complicated filesystem, and it's conceivable that something could have been introduced. If the pools needed to be recreated when upgrading ZFS to a major release, that'd be most unfortunate.

Thank you.

fcmircea commented 2 years ago

Hello,

I have encountered this issue as well during a resilver. My config:

The OS is on another nvme ssd in the MB's nvme slot.

I had SATA / PCIE power mgmt on in bios.

HDDs are part of a raidz and the nvmes part of another raidz, all in the same pool.

Special note: the pool was crated in a slightly different configuration, where I had some of the HDDs connected via a JMB585 PCIE to SATA adapter. On that config, I did not encounter this issue at all.

I have now disabled any sata power mgmt features in bios, will keep an eye out if the error resurfaces.

LE:

Issue still occurs, as follows:


Jan 04 02:48:18 hal9002 kernel: ata2.00: exception Emask 0x10 SAct 0x4400200 SErr 0x280100 action 0x6 frozen
Jan 04 02:48:18 hal9002 kernel: ata2.00: irq_stat 0x08000000, interface fatal error
Jan 04 02:48:18 hal9002 kernel: ata2: SError: { UnrecovData 10B8B BadCRC }
Jan 04 02:48:18 hal9002 kernel: ata2.00: failed command: READ FPDMA QUEUED
Jan 04 02:48:18 hal9002 kernel: ata2.00: cmd 60/38:48:18:2c:ac/00:00:cc:00:00/40 tag 9 ncq dma 28672 in
                                         res 40/00:b0:30:ab:ac/00:00:cc:00:00/40 Emask 0x10 (ATA bus error)
Jan 04 02:48:18 hal9002 kernel: ata2.00: status: { DRDY }
Jan 04 02:48:18 hal9002 kernel: ata2.00: failed command: READ FPDMA QUEUED
Jan 04 02:48:18 hal9002 kernel: ata2.00: cmd 60/30:b0:30:ab:ac/00:00:cc:00:00/40 tag 22 ncq dma 24576 in
                                         res 40/00:b0:30:ab:ac/00:00:cc:00:00/40 Emask 0x10 (ATA bus error)
Jan 04 02:48:18 hal9002 kernel: ata2.00: status: { DRDY }
Jan 04 02:48:18 hal9002 kernel: ata2.00: failed command: READ FPDMA QUEUED
Jan 04 02:48:18 hal9002 kernel: ata2.00: cmd 60/30:d0:f0:96:ab/00:00:cc:00:00/40 tag 26 ncq dma 24576 in
                                         res 40/00:b0:30:ab:ac/00:00:cc:00:00/40 Emask 0x10 (ATA bus error)
Jan 04 02:48:18 hal9002 kernel: ata2.00: status: { DRDY }
Jan 04 02:48:18 hal9002 kernel: ata2: hard resetting link
Jan 04 02:48:18 hal9002 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 04 02:48:18 hal9002 kernel: ata2.00: configured for UDMA/133
Jan 04 02:48:18 hal9002 kernel: sd 1:0:0:0: [sdb] tag#9 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
Jan 04 02:48:18 hal9002 kernel: sd 1:0:0:0: [sdb] tag#9 Sense Key : Illegal Request [current]
Jan 04 02:48:18 hal9002 kernel: sd 1:0:0:0: [sdb] tag#9 Add. Sense: Unaligned write command
Jan 04 02:48:18 hal9002 kernel: sd 1:0:0:0: [sdb] tag#9 CDB: Read(16) 88 00 00 00 00 00 cc ac 2c 18 00 00 00 38 00 00
Jan 04 02:48:18 hal9002 kernel: blk_update_request: I/O error, dev sdb, sector 3433835544 op 0x0:(READ) flags 0x700 phys_seg 7 prio class 0
Jan 04 02:48:18 hal9002 kernel: zio pool=big vdev=/dev/disk/by-id/ata-WDC_WD30EFZX-68AWUN0_WD-WX12D41N8S8E-part1 error=5 type=1 offset=1758122749952 size=28672 flags=180980
Jan 04 02:48:18 hal9002 kernel: sd 1:0:0:0: [sdb] tag#22 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
Jan 04 02:48:18 hal9002 kernel: sd 1:0:0:0: [sdb] tag#22 Sense Key : Illegal Request [current]
Jan 04 02:48:18 hal9002 kernel: sd 1:0:0:0: [sdb] tag#22 Add. Sense: Unaligned write command
Jan 04 02:48:18 hal9002 kernel: sd 1:0:0:0: [sdb] tag#22 CDB: Read(16) 88 00 00 00 00 00 cc ac ab 30 00 00 00 30 00 00
Jan 04 02:48:18 hal9002 kernel: blk_update_request: I/O error, dev sdb, sector 3433868080 op 0x0:(READ) flags 0x700 phys_seg 1 prio class 0
Jan 04 02:48:18 hal9002 kernel: zio pool=big vdev=/dev/disk/by-id/ata-WDC_WD30EFZX-68AWUN0_WD-WX12D41N8S8E-part1 error=5 type=1 offset=1758139408384 size=24576 flags=180980
Jan 04 02:48:18 hal9002 kernel: sd 1:0:0:0: [sdb] tag#26 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
Jan 04 02:48:18 hal9002 kernel: sd 1:0:0:0: [sdb] tag#26 Sense Key : Illegal Request [current]
Jan 04 02:48:18 hal9002 kernel: sd 1:0:0:0: [sdb] tag#26 Add. Sense: Unaligned write command
Jan 04 02:48:18 hal9002 kernel: sd 1:0:0:0: [sdb] tag#26 CDB: Read(16) 88 00 00 00 00 00 cc ab 96 f0 00 00 00 30 00 00
Jan 04 02:48:18 hal9002 kernel: blk_update_request: I/O error, dev sdb, sector 3433797360 op 0x0:(READ) flags 0x700 phys_seg 1 prio class 0
Jan 04 02:48:18 hal9002 kernel: zio pool=big vdev=/dev/disk/by-id/ata-WDC_WD30EFZX-68AWUN0_WD-WX12D41N8S8E-part1 error=5 type=1 offset=1758103199744 size=24576 flags=180980
Jan 04 02:48:18 hal9002 kernel: ata2: EH complete
emwgit commented 2 years ago

Hi all,

thanks to @frasc !

The way I solved it, or at least hope to have solved it, is by backing up all data, recreating the pools and restoring the data. I haven't seen the 'Unaligned write command' for a few months now.

Seems that I can only partially confirm you observation:

BR & Thanks!

G8EjlKeK7CwVQP2acz2B commented 2 years ago

I am running into this unaligned write issue on the following motherboard running Ubuntu 20.04.2: Gigabyte Technology Co., Ltd. B450 AORUS M/B450 AORUS M, BIOS F51 07/29/2020 It shows up during high I/O rates during resilver. One drive reading and one drive writing works without issues, when there are more on the same controller I see this. The system is 8 drives, 4 mirrored pairs, drives are mix of WD20EFRX, WD30EFRX, WD40EFRX. I am using 4 SATA onboard and 4 on a 88SE9230 SATA controller card. In process of moving the onboard to another PCIe controller card to see if that fixes it.

fcmircea commented 2 years ago

I have to say, after moving to a dedicated LSI controller, this issue went away. I would assume the consumer board controller doesn’t handle that many hdds and nobody cares that much to work it out.

În mie., 18 mai 2022 la 13:31 G8EjlKeK7CwVQP2acz2B @.***> a scris:

I am running into this unaligned write issue on the following motherboard running Ubuntu 20.04.2: Gigabyte Technology Co., Ltd. B450 AORUS M/B450 AORUS M, BIOS F51 07/29/2020 It shows up during high I/O rates during resilver. One drive reading and one drive writing works without issues, when there are more on the same controller I see this. The system is 8 drives, 4 mirrored pairs, drives are mix of WD20EFRX, WD30EFRX, WD40EFRX. I am using 4 SATA onboard and 4 on a 88SE9230 SATA controller card. In process of moving the onboard to another PCIe controller card to see if that fixes it.

— Reply to this email directly, view it on GitHub https://github.com/openzfs/zfs/issues/10094#issuecomment-1129843399, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGEDH5OMMQRFP25MDR4DRFDVKTBHJANCNFSM4K75ICAQ . You are receiving this because you commented.Message ID: @.***>

G8EjlKeK7CwVQP2acz2B commented 2 years ago

I moved the 4 onboard SATA connections to another 88SE9230 card and this seems to help things. There are still issues with a couple of the drives though, so I just swapped their SATA cables out, we'll see. If I understand properly, unaligned write is supposed to be a zoned drive thing, which I do not have, so it's a little weird to get this so consistently under heavy load. Thanks for the LSI pointer, if you could give me the part number I may try that next.

xelayz commented 2 years ago

I am using TrueNAS Scale and also get this error:

May 22 22:22:44 truenas kernel: sd 1:0:0:0: [sdb] tag#6 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s May 22 22:22:44 truenas kernel: sd 1:0:0:0: [sdb] tag#6 Sense Key : Illegal Request [current] May 22 22:22:44 truenas kernel: sd 1:0:0:0: [sdb] tag#6 Add. Sense: Unaligned write command May 22 22:22:44 truenas kernel: sd 1:0:0:0: [sdb] tag#6 CDB: Write(10) 2a 00 4c 40 21 70 00 00 30 00

I am using the Mini-PC HM-80 with 2x 2TB SSD 870 Evo. But this error only happen on one mainboard sata port. I switched the drive and the cables multiple times, but the error only happen on one mainboard port. I don't know what to do. I think the port on the mainboard is broken...

bjquinn commented 2 years ago

@xelayz, I don't know if this is very helpful, but this specific issue has been open for over two years (and may have been a problem long before that). I've simply given up connecting drives to motherboard ports and I just buy LSI HBAs and connect the drives to that. I don't know if that solution is reasonable for you, but it's the only one I've found after years of dealing with this problem. The one upside is that switching to an HBA has been a 100% fix for this problem for me across many systems.

G8EjlKeK7CwVQP2acz2B commented 2 years ago

I am also working through this at the moment. Moving from onboard port to adapter card did nothing, changing SATA cables did nothing, eliminating hot-swap tray did nothing, drive firmware doesn't seem to correlate. I replaced the drive with a standby drive (exact same) and the problem is not visible so far (still testing) so I ordered a replacement drive. If I find issues in further testing, will try an LSI adapter -- currently have dual Marvell 4-port adapters.

G8EjlKeK7CwVQP2acz2B commented 2 years ago

by the way, this is exactly why I converted from Linux LVM to ZFS, concern for data integrity -- resilvering/scrubbing seems to stress the hardware more but LVM gave me not much in the way of integrity guarantees

xelayz commented 2 years ago

The thing I do not understand is, that both ports are the same on my mainboard. But only one of it is not working fine. I mean ok, when the mainboard have some problem with my drives its ok. But then both ports should not work. This make no sense for me. And some of the users in this thread mentioned that only specific drives have this error. But in my case all drives are working fine on the other port. There is only something wrong on the first port. I read some people think its a issue between Samsung firmware and some boards. Ordered now 2 new SanDisk Ultra 3D 2TB disks. But this should not work, because it will make no sense as I described before. But anyway, I will try...

But I am sure that this problem has several reasons to occur. That makes it really hard to solve...

@bjquinn Thanks. But I want to use the internal ports, because I want a small setup. And I'm not good in such things, especially file systems like zfs. I just want to have my TrueNAS Scale system with this HM80 mini pc. :(

Edit: I connected a 500GB HDD to the affected port and so far no errors. Normally the error appear directly after the system restart. EDIT: Error is there again. Its a Hitachi HDD.

bjquinn commented 2 years ago

@G8EjlKeK7CwVQP2acz2B yeah, I went through all of the steps you're describing on several systems and none of it helped. But after I put in the LSI controllers, I've had zero issues since. I don't know if it's a zfs bug or zfs just highlighting that some of these controllers have firmware bugs, or if there are bugs in some of the drivers in the Linux kernel for some of these controllers. I suspect it's zfs though, as I've only started seeing this issue in the last few years, and it's a problem on lots of different controllers.

G8EjlKeK7CwVQP2acz2B commented 2 years ago

The electronics involved here are extremely high speed so the ports are not really "the same" in the sense that board trace lengths are not identical, nearby signals causing noise are different, etc., etc., etc. Then there is the possibility (probability!) of software / firmware bugs -- it can all be tested only to a certain extent, the combinations involved are awesome and the time it would require is ridiculous. So we kind of get what we get when we don't spend $$$$ and have to deal with some of the resulting pain.

dd1dd1 commented 2 years ago

I am the OP. I continue seeing "unaligned write" errors even with very recent versions of ZFS. I program data acquisition software similar to ZFS (but not as big or complicated) and I program DMA controllers and similar data transfer hardware. It is my informed opinion that this problem is a software bug in ZFS and/or in the Linux I/O paths that causes Linux to issue defective SATA write commands that are rejected by storage devices. It does not look like a hardware error. More likely, some obscure sata command alignment requirement (data alignment, command alignment, device address alignment, data padding, etc) is implemented differently by different devices and by different firmware versions. Figuring this out would require adding more diagnostics and debug code to ZFS and to the Linux sata drivers. People who suggest "try a different sata cable, try a different motherboard, try a different mouse" have no idea of what is going on in this specific case and are just sending everybody on expensive busy-work run-arounds.

bjquinn commented 2 years ago

@dd1dd1 I agree that "try a different sata cable, try a different motherboard, try a different mouse" are just busy work runarounds, but "try a different controller" works. I don't like that answer, and I wish this bug would get fixed, but if you need a server up and running and you can spare the expense, that's the only sure fire workaround I'm aware of at this time.

dd1dd1 commented 2 years ago

"try different controllers until you find one that seems to work" without any clear understanding why it works and why others do not work does not yield a reproducible solution. This is voodoo engineering. https://en.wikipedia.org/wiki/Voodoo_programming

I think "try different controller" only works because "different controller" has different firmware and different linux device driver that just by luck happens to not tickle the firmware of the storage devices the wrong way.

When people start reporting "I see the problem, installed controller X with firmware Y, problem gone", we may start getting somewhere. Except that you cannot purchase "controller X with firmware Y", you can purchase "controller X" and it will come with firmware "A" or "B" or "C" and you will have no idea if it works or not, and none of this helps the next person.

ZoomRmc commented 2 years ago

It is my informed opinion that this problem is a software bug in ZFS and/or in the Linux I/O paths that causes Linux to issue defective SATA write commands that are rejected by storage devices. It does not look like a hardware error. More likely, some obscure sata command alignment requirement (data alignment, command alignment, device address alignment, data padding, etc) is implemented differently by different devices and by different firmware versions.

To add one data point, I've hit the same error with a particular controller with bcachefs, so the error is not specific to zfs.

(Worked around with libata.force=noncq)

bjquinn commented 2 years ago

@dd1dd1 that's fine, but I think you're going to far with your complaint. I've tried several different LSI (or whatever they call themselves now) controllers, with several different firmware versions and ALL work. I know that doesn't solve the underlying problem, but from the perspective of an end user who sees an issue open for 26 months with no movement whatsoever, and doesn't have time to wait for whatever's going to have to happen to actually fix this issue (26 more months? 5 years?), "try an LSI controller" is a perfectly reasonable recommendation. This isn't someone shooting from the hip saying "try a different mouse", this is a real lived experience from someone who has solved (or "worked around", more accurately) this problem several times on several different systems.

If what you want me to do is give you detailed specifics on the original controller and firmware version, as well as the combinations of firmware and controller versions that resolved the problem, some of that I may be able to dig up and some of it may be lost to the sands of time. When I first commented on this issue in September of 2020, I could have probably come up with all of that in short order. I know that you're the OP, so you're a victim of this bug like just the rest of us, so if you think it would be useful for me to dig up all of that, I can attempt to do so.

bjquinn commented 2 years ago

@dd1dd1 Let me know if any of the following information is helpful.

I experienced this issue on at least 6 systems with a SuperMicro H11DSi motherboard, using the onboard SATA ports. Some were the v1 h/w revision of the motherboard, some others were the v2 h/w revision.

I'm not sure if I can figure out the details of each individual system, but I have some contemporaneous notes about at least one of them. I do not know the original BIOS version. On this one the motherboard was h/w revision 1. I upgraded to v1.3 BIOS which I would assume updated the onboard SATA firmware if there were any updates to apply. This did not resolve the problem. I did try different cables, drive trays, motherboard SATA ports, etc., which did not resolve anything. I was using Proxmox with zfs root, and updated to Proxmox 6.2-11 (which obviously upgraded the version of zfs) which also did not solve the problem.

What did solve the problem was installing a pair of LSI00344 9300-8i controllers. The controllers were updated to the following firmware (output from dmesg):

LSISAS3008: FWVersion(16.00.10.00), ChipRevision(0x02), BiosVersion(08.37.00.00)

I do not know what the original firmware revision was on the LSI controller. On at least one of these systems, I was having a (seemingly) unrelated issue causing system hangs that was resolved by updating the LSI controller firmware. I do not know if the system I'm detailing here was one of those, or if by this point I was simply updating the LSI firmware as a matter of course due to prior issues on nearly identical systems.

If there's other info that would be helpful, I'll see what I can come up with. The H11DSi motherboard was in common in all cases where I had this issue. In some cases, I used LSI00346 9300-4i controllers. I think most were updated to the version notated above, but I'm not 100% sure.

stuckj commented 2 years ago

Also if it's of some help. I don't have specific firmware versions either, but I was also running into this on a supermicro board (this one: https://www.supermicro.com/en/products/motherboard/X10SDV-TLN4F). I don't recall the bios version offhand and can't get to it right now. I moved my SATA drives from the onboard controllers to the backplane ( BPN-SAS2-846EL1) running off an LSI adapter (LSI 9211-8i flashed to HBA mode) which eliminated the problem.

I've recently (6 weeks ago) updated to proxmox 7.1 and moved the SATA drives back to the motherboard SATA controllers (had to 'cause I forgot I can't boot from the backplane and hadn't rebooted since moving them live!). I haven't run into this problem in the past 6 weeks. So...there's another data point. So, not sure if it has now been addressed in proxmox 7.1.

artw commented 2 years ago

Have the same issue with Samsung SSD 850 PRO 1TB and 00:17.0 SATA controller [0106]: Intel Corporation Comet Lake SATA AHCI Controller [8086:02d3] and it is pretty random. After setting pm to max_performance like suggested above I didn't have them for months but now it came back and brought my pool down twice this week.

DanielBarie commented 2 years ago

Chiming in on this issue. We have a server, Supermicro board ( H11DSi-NT Socket SP3 Rev. 2.0), two ZFS pools on disks attached to the internal controllers. One pool is SSDs, the other is HDDs. It is only with the SSDs that I see these errors. Firmware (BIOS) of the board is version 2.3 The error occured (up to now) on two of the three SSDs. Cables have been checked. No Smart errors reported for the SSDs. Scrubbing the affected pool seems to make it worse (i.e. boatload of errors during scrubbing). Errors occured while running PVE 7-0. The system is (now) running PVE 7-2.4 (recently upgraded). Let's wait and see... Update (22/06/08): SSDs (Seagate Nytro) are running the latest firmware (SF44214B).