I/O wait due to faulty multipath devices

Hi there 👋

We have an older OpenShift cluster that's showing some weird storage behaviour. At certain points (x hours after a node reboot, after a small network hiccup, ...) we see massive I/O wait on the OpenShift nodes. Some multipath devices are then listed in a "failed faulty running" state. The amount of I/O wait scales linearly with the number of iSCSI devices in the faulty state; we see values between 20% and 40%. Despite the load and the failed devices, the cluster still functions normally, and the iSCSI LUNs are still accessible in the pods. The nodes don't recover from this automatically: we have to remove the iSCSI devices by hand.

This issue happens frequently and is hard to debug. The cause could lie with Red Hat, OpenShift, NetApp, Trident or our own config.

We have a working workaround for when the issue occurs. However, the I/O wait is getting annoying and we'd like to implement a more robust fix in the meantime. I hope this story sounds familiar to someone; I'm looking for a push in the right direction from some storage experts.

I found https://github.com/NetApp/trident/issues/101 & https://github.com/NetApp/trident/issues/133 which seem related or at least show the same symptoms.

I don't understand the outcome of the first issue, but I imagine it should already be fixed given the date.
The portals are correctly set in our PVs.

Versions

OpenShift: 3.11.394
Trident: 19.10.1
RHEL: 7.9
NetApp: ONTAP Select - NetApp Release 9.6P1: Fri Jul 19 02:29:12 UTC 2019

Logs

DMESG

[180444.103477] blk_update_request: I/O error, dev sdr, sector 209715072
[180444.104021] Buffer I/O error on dev sdr, logical block 26214384, async page read
[180444.105202] sd 3:0:0:14: [sdr] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
[180444.105206] sd 3:0:0:14: [sdr] Sense Key : Illegal Request [current]
[180444.105208] sd 3:0:0:14: [sdr] Add. Sense: Logical unit not supported
[180444.105211] sd 3:0:0:14: [sdr] CDB: Read(10) 28 00 0c 7f ff 80 00 00 08 00
[180444.105212] blk_update_request: I/O error, dev sdr, sector 209715072
[180444.105705] Buffer I/O error on dev sdr, logical block 26214384, async page read
[180444.157530] sd 3:0:0:43: [sdau] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
[180444.157536] sd 3:0:0:43: [sdau] Sense Key : Illegal Request [current]
[180444.157540] sd 3:0:0:43: [sdau] Add. Sense: Logical unit not supported
[180444.157544] sd 3:0:0:43: [sdau] CDB: Read(10) 28 00 00 9f ff 80 00 00 08 00
[180444.157546] blk_update_request: I/O error, dev sdau, sector 10485632
[180444.158466] Buffer I/O error on dev sdau, logical block 1310704, async page read
[180444.160325] sd 3:0:0:43: [sdau] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
[180444.160329] sd 3:0:0:43: [sdau] Sense Key : Illegal Request [current]
[180444.160333] sd 3:0:0:43: [sdau] Add. Sense: Logical unit not supported
[180444.160337] sd 3:0:0:43: [sdau] CDB: Read(10) 28 00 00 9f ff 80 00 00 08 00
[180444.160339] blk_update_request: I/O error, dev sdau, sector 10485632
[180444.161194] Buffer I/O error on dev sdau, logical block 1310704, async page read
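
To see at a glance which SCSI devices are reporting "Logical unit not supported", the kernel log can be filtered with something like the sketch below. The function name `lun_not_supported_devs` is made up for illustration, and the pattern assumes the log lines look exactly like the excerpt above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the distinct sdX names that logged
# "Logical unit not supported". Reads kernel log lines on stdin.
lun_not_supported_devs() {
    grep 'Logical unit not supported' \
        | sed -n 's/.*\[\(sd[a-z]*\)\] Add\. Sense.*/\1/p' \
        | sort -u
}

# Example (on a node):  dmesg | lun_not_supported_devs
```

For the excerpt above this would list sdau and sdr, which can then be matched against the path devices in the multipath output.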

multipath -ll

...
3600a098056303030302b4f7733793843 dm-69 NETAPP  ,LUN C-Mode
size=10G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 3:0:0:15 sds  65:32  active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  `- 4:0:0:15 sdbk 67:224 active ready running
3600a098056303030302b4f7733784651 dm-25 NETAPP  ,LUN C-Mode
size=100G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| `- 3:0:0:13 sdq  65:0   failed faulty running
`-+- policy='service-time 0' prio=0 status=enabled
  `- 4:0:0:13 sdbi 67:192 failed faulty running
3600a098056303030302b4f7733793842 dm-68 NETAPP  ,LUN C-Mode
size=100G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=0 status=active
| `- 3:0:0:14 sdr  65:16  failed faulty running
`-+- policy='service-time 0' prio=0 status=enabled
  `- 4:0:0:14 sdbj 67:208 failed faulty running
3600a098056303030303f4f7748303877 dm-67 NETAPP  ,LUN C-Mode
size=100G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=0 status=enabled
  |- 4:0:0:12 sdbh 67:176 failed faulty running
  `- 3:0:0:12 sdp  8:240  failed faulty running
...
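
Since the I/O wait seems to follow the number of maps without a usable path, a small filter over this output could help with monitoring. The following is only a sketch: `dead_multipath_maps` is a hypothetical name, and it assumes the output format shown above (a header line containing `dm-N`, followed by path lines of the form `H:C:T:L sdX ...`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name (first column of the header line,
# i.e. the WWID or the user-friendly alias) of every multipath map that
# has at least one path but no path in "active ready" state.
# Reads `multipath -ll` output on stdin.
dead_multipath_maps() {
    awk '
        function report() { if (name != "" && total > 0 && ok == 0) print name }
        # Header line of a map, e.g. "3600a0... dm-25 NETAPP  ,LUN C-Mode"
        / dm-[0-9]+ / { report(); name = $1; total = 0; ok = 0; next }
        # Path line, e.g. "| `- 3:0:0:13 sdq  65:0   failed faulty running"
        /[0-9]+:[0-9]+:[0-9]+:[0-9]+ +sd[a-z]+ / {
            total++
            if ($0 ~ /active ready/) ok++
        }
        END { report() }
    '
}

# Example (on a node):  multipath -ll | dead_multipath_maps
```

On the excerpt above this would print the three maps whose paths are all "failed faulty" (dm-25, dm-68 and dm-67) and skip dm-69.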

lsscsi

[1:0:0:0]    cd/dvd  NECVMWar VMware IDE CDR10 1.00  /dev/sr0
[2:0:0:0]    disk    VMware   Virtual disk     2.0   /dev/sda
[2:0:1:0]    disk    VMware   Virtual disk     2.0   /dev/sdb
[2:0:2:0]    disk    VMware   Virtual disk     2.0   /dev/sdc
[3:0:0:0]    disk    NETAPP   LUN C-Mode       9600  /dev/sdd
[3:0:0:1]    disk    NETAPP   LUN C-Mode       9600  /dev/sde
[3:0:0:3]    disk    NETAPP   LUN C-Mode       9600  /dev/sdg
[3:0:0:4]    disk    NETAPP   LUN C-Mode       9600  /dev/sdh
[3:0:0:5]    disk    NETAPP   LUN C-Mode       9600  /dev/sdi
[3:0:0:6]    disk    NETAPP   LUN C-Mode       9600  /dev/sdj
[3:0:0:7]    disk    NETAPP   LUN C-Mode       9600  /dev/sdk
[3:0:0:8]    disk    NETAPP   LUN C-Mode       9600  /dev/sdl
[3:0:0:9]    disk    NETAPP   LUN C-Mode       9600  /dev/sdm
[3:0:0:10]   disk    NETAPP   LUN C-Mode       9600  /dev/sdn
[3:0:0:11]   disk    NETAPP   LUN C-Mode       9600  /dev/sdo
[3:0:0:12]   disk    NETAPP   LUN C-Mode       9600  /dev/sdp
[3:0:0:13]   disk    NETAPP   LUN C-Mode       9600  /dev/sdq
[3:0:0:14]   disk    NETAPP   LUN C-Mode       9600  /dev/sdr
[3:0:0:15]   disk    NETAPP   LUN C-Mode       9600  /dev/sds
[3:0:0:16]   disk    NETAPP   LUN C-Mode       9600  /dev/sdt
[3:0:0:17]   disk    NETAPP   LUN C-Mode       9600  /dev/sdu
[3:0:0:18]   disk    NETAPP   LUN C-Mode       9600  /dev/sdv
[3:0:0:19]   disk    NETAPP   LUN C-Mode       9600  /dev/sdw
[3:0:0:20]   disk    NETAPP   LUN C-Mode       9600  /dev/sdx
[3:0:0:21]   disk    NETAPP   LUN C-Mode       9600  /dev/sdy
[3:0:0:22]   disk    NETAPP   LUN C-Mode       9600  /dev/sdz
[3:0:0:23]   disk    NETAPP   LUN C-Mode       9600  /dev/sdaa
[3:0:0:24]   disk    NETAPP   LUN C-Mode       9600  /dev/sdab
[3:0:0:25]   disk    NETAPP   LUN C-Mode       9600  /dev/sdac
[3:0:0:26]   disk    NETAPP   LUN C-Mode       9600  /dev/sdad
[3:0:0:27]   disk    NETAPP   LUN C-Mode       9600  /dev/sdae
[3:0:0:28]   disk    NETAPP   LUN C-Mode       9600  /dev/sdaf
[3:0:0:29]   disk    NETAPP   LUN C-Mode       9600  /dev/sdag
[3:0:0:30]   disk    NETAPP   LUN C-Mode       9600  /dev/sdah
[3:0:0:31]   disk    NETAPP   LUN C-Mode       9600  /dev/sdai
[3:0:0:32]   disk    NETAPP   LUN C-Mode       9600  /dev/sdaj
[3:0:0:33]   disk    NETAPP   LUN C-Mode       9600  /dev/sdak
[3:0:0:34]   disk    NETAPP   LUN C-Mode       9600  /dev/sdal
[3:0:0:35]   disk    NETAPP   LUN C-Mode       9600  /dev/sdam
[3:0:0:36]   disk    NETAPP   LUN C-Mode       9600  /dev/sdan
[3:0:0:37]   disk    NETAPP   LUN C-Mode       9600  /dev/sdao
[3:0:0:38]   disk    NETAPP   LUN C-Mode       9600  /dev/sdap
[3:0:0:39]   disk    NETAPP   LUN C-Mode       9600  /dev/sdaq
[3:0:0:40]   disk    NETAPP   LUN C-Mode       9600  /dev/sdar
[3:0:0:41]   disk    NETAPP   LUN C-Mode       9600  /dev/sdas
[3:0:0:42]   disk    NETAPP   LUN C-Mode       9600  /dev/sdat
[3:0:0:43]   disk    NETAPP   LUN C-Mode       9600  /dev/sdau
[4:0:0:0]    disk    NETAPP   LUN C-Mode       9600  /dev/sdav
[4:0:0:1]    disk    NETAPP   LUN C-Mode       9600  /dev/sdaw
[4:0:0:3]    disk    NETAPP   LUN C-Mode       9600  /dev/sday
[4:0:0:4]    disk    NETAPP   LUN C-Mode       9600  /dev/sdaz
[4:0:0:5]    disk    NETAPP   LUN C-Mode       9600  /dev/sdba
[4:0:0:6]    disk    NETAPP   LUN C-Mode       9600  /dev/sdbb
[4:0:0:7]    disk    NETAPP   LUN C-Mode       9600  /dev/sdbc
[4:0:0:8]    disk    NETAPP   LUN C-Mode       9600  /dev/sdbd
[4:0:0:9]    disk    NETAPP   LUN C-Mode       9600  /dev/sdbe
[4:0:0:10]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbf
[4:0:0:11]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbg
[4:0:0:12]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbh
[4:0:0:13]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbi
[4:0:0:14]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbj
[4:0:0:15]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbk
[4:0:0:16]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbl
[4:0:0:17]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbm
[4:0:0:18]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbn
[4:0:0:19]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbo
[4:0:0:20]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbp
[4:0:0:21]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbq
[4:0:0:22]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbr
[4:0:0:23]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbs
[4:0:0:24]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbt
[4:0:0:25]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbu
[4:0:0:26]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbv
[4:0:0:27]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbw
[4:0:0:28]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbx
[4:0:0:29]   disk    NETAPP   LUN C-Mode       9600  /dev/sdby
[4:0:0:30]   disk    NETAPP   LUN C-Mode       9600  /dev/sdbz
[4:0:0:31]   disk    NETAPP   LUN C-Mode       9600  /dev/sdca
[4:0:0:32]   disk    NETAPP   LUN C-Mode       9600  /dev/sdcb
[4:0:0:33]   disk    NETAPP   LUN C-Mode       9600  /dev/sdcc
[4:0:0:34]   disk    NETAPP   LUN C-Mode       9600  /dev/sdcd
[4:0:0:35]   disk    NETAPP   LUN C-Mode       9600  /dev/sdce
[4:0:0:36]   disk    NETAPP   LUN C-Mode       9600  /dev/sdcf
[4:0:0:37]   disk    NETAPP   LUN C-Mode       9600  /dev/sdcg
[4:0:0:38]   disk    NETAPP   LUN C-Mode       9600  /dev/sdch
[4:0:0:39]   disk    NETAPP   LUN C-Mode       9600  /dev/sdci
[4:0:0:40]   disk    NETAPP   LUN C-Mode       9600  /dev/sdcj
[4:0:0:41]   disk    NETAPP   LUN C-Mode       9600  /dev/sdck
[4:0:0:42]   disk    NETAPP   LUN C-Mode       9600  /dev/sdcl
[4:0:0:43]   disk    NETAPP   LUN C-Mode       9600  /dev/sdcm

multipath.conf

defaults {
        user_friendly_names yes
        find_multipaths yes
}

blacklist {
}
devices {
device {
          vendor "LIO-ORG"
          product "TCMU device"
          user_friendly_names "yes"
          path_grouping_policy "failover"
          path_selector "round-robin 0"
          failback immediate
          path_checker "tur"
          prio "alua"
          hardware_handler "1 alua"
          no_path_retry 120
          rr_weight "uniform"
}
}

Applied workaround

# remove the faulty device-mapper instance
dmsetup remove -f ${LUN_ID}

# flush the faulty multipath device
multipath -f ${LUN_ID}

# remove the faulty multipath wwid from the config (if present)
vi /etc/multipath/wwids

# remove the now obsolete SCSI devices (once per path of the LUN, e.g. sdq and sdbi)
echo 1 > /sys/block/${DISK_ID}/device/delete
echo 1 > /sys/block/${DISK_ID}/device/delete

# restart multipathd & verify that everything is now fine
systemctl restart multipathd
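
To make the manual steps above less error-prone, they could be wrapped in a dry-run helper that only prints the commands for one map and its path devices, so they can be reviewed before being run. This is a sketch under assumptions: `print_cleanup` is a hypothetical name, and the map name and the sdX path names have to be supplied by hand (for example from the `multipath -ll` output).

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: print (do NOT execute) the cleanup commands
# for one faulty multipath map and its path devices.
# Usage: print_cleanup <map-name-or-wwid> <sdX> [<sdX> ...]
print_cleanup() {
    local map=$1
    shift
    echo "dmsetup remove -f ${map}"
    echo "multipath -f ${map}"
    echo "# then remove ${map} from /etc/multipath/wwids if present"
    local disk
    for disk in "$@"; do
        echo "echo 1 > /sys/block/${disk}/device/delete"
    done
    echo "systemctl restart multipathd"
}

# Example:  print_cleanup 3600a098056303030302b4f7733784651 sdq sdbi
```

Printing instead of executing is deliberate: removing the wrong device on a node with running pods is hard to undo.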

NetApp / trident