opensvc / multipath-tools

Failure of one path causes filesystem to go read-only #6

Closed jinleiw closed 3 years ago

jinleiw commented 3 years ago

I use multipath for my storage array.

There are 4 paths to the storage:

DATA01 (3600507640081011a000000000000001e) dm-4 IBM     ,2145            
size=500G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=enabled
| `- 8:0:1:0  sdj 8:144 active ready running
|-+- policy='service-time 0' prio=50 status=active
| `- 10:0:1:0 sdn 8:208 active ready running
|-+- policy='service-time 0' prio=10 status=enabled
| `- 8:0:0:0  sdc 8:32  active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  `- 10:0:0:0 sdb 8:16  active ready running

I unplugged one of the fibre cables to test failover, but the filesystem went read-only, and I found these error messages:

Apr 27 21:20:37 localhost kernel: qla2xxx [0000:81:00.1]-500b:10: LOOP DOWN detected (2 7 0 0).
Apr 27 21:20:42 localhost kernel: sd 10:0:0:0: rejecting I/O to offline device
Apr 27 21:20:42 localhost kernel: sd 10:0:0:0: [sdb] killing request
Apr 27 21:20:42 localhost kernel: sd 10:0:0:0: rejecting I/O to offline device
Apr 27 21:20:42 localhost kernel: sd 10:0:0:0: rejecting I/O to offline device
Apr 27 21:20:42 localhost kernel: EXT4-fs warning (device dm-8): ext4_end_bio:316: I/O error -5 writing to inode 24117995 (offset 0 size 0 starting block 96511296)
Apr 27 21:20:42 localhost kernel: Buffer I/O error on device dm-8, logical block 96511296
Apr 27 21:20:42 localhost kernel: sd 10:0:0:0: rejecting I/O to offline device
Apr 27 21:20:42 localhost kernel: sd 10:0:0:0: rejecting I/O to offline device
Apr 27 21:20:42 localhost kernel: Aborting journal on device dm-8-8.
Apr 27 21:20:42 localhost kernel: EXT4-fs error (device dm-8) in ext4_reserve_inode_write:5173: Journal has aborted
Apr 27 21:20:42 localhost kernel: EXT4-fs (dm-8): Delayed block allocation failed for inode 24118064 at logical offset 3715516 with max blocks 4 with error 30
Apr 27 21:20:42 localhost kernel: EXT4-fs (dm-8): This should not happen!! Data will be lost
Apr 27 21:20:42 localhost kernel: EXT4-fs error (device dm-8) in ext4_writepages:2543: Journal has aborted
Apr 27 21:20:42 localhost kernel: sd 10:0:0:0: rejecting I/O to offline device
Apr 27 21:20:42 localhost kernel: sd 10:0:0:0: rejecting I/O to offline device
Apr 27 21:20:42 localhost kernel: sd 10:0:0:0: [sdb] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
Apr 27 21:20:42 localhost kernel: sd 10:0:0:0: [sdb] CDB: Write(10) 2a 00 00 4c d0 00 00 00 60 00
Apr 27 21:20:42 localhost kernel: blk_update_request: I/O error, dev sdb, sector 5033984
Apr 27 21:20:42 localhost kernel: Buffer I/O error on dev dm-8, logical block 129531904, lost sync page write
Apr 27 21:20:42 localhost kernel: sd 10:0:0:0: rejecting I/O to offline device
Apr 27 21:20:42 localhost kernel: JBD2: Error -5 detected when updating journal superblock for dm-8-8.
Apr 27 21:20:42 localhost kernel: Buffer I/O error on dev dm-8, logical block 0, lost sync page write
Apr 27 21:20:42 localhost kernel: sd 10:0:0:0: rejecting I/O to offline device
Apr 27 21:20:42 localhost kernel: Buffer I/O error on dev dm-8, logical block 0, lost sync page write
Apr 27 21:20:42 localhost kernel: EXT4-fs error (device dm-8) in ext4_dirty_inode:5290: Journal has aborted
Apr 27 21:20:42 localhost kernel: EXT4-fs (dm-8): previous I/O error to superblock detected
Apr 27 21:20:42 localhost kernel: sd 10:0:0:0: rejecting I/O to offline device
Apr 27 21:20:42 localhost kernel: Buffer I/O error on dev dm-8, logical block 0, lost sync page write
Apr 27 21:20:42 localhost kernel: EXT4-fs (dm-8): I/O error while writing superblock
Apr 27 21:20:42 localhost kernel: EXT4-fs error (device dm-8): ext4_journal_check_start:56: 
Apr 27 21:20:42 localhost kernel: EXT4-fs error (device dm-8): ext4_journal_check_start:56: 
Apr 27 21:20:42 localhost kernel: Detected aborted journal
Apr 27 21:20:42 localhost kernel: 
Apr 27 21:20:42 localhost kernel: EXT4-fs (dm-8): Remounting filesystem read-only
Apr 27 21:20:42 localhost kernel: sd 10:0:0:0: rejecting I/O to offline device
Apr 27 21:20:42 localhost kernel: Buffer I/O error on dev dm-8, logical block 0, lost sync page write
Apr 27 21:20:42 localhost kernel: Detected aborted journal
Apr 27 21:20:42 localhost kernel: 

My conf is:

defaults {
    user_friendly_names no
    find_multipaths yes
}

blacklist {
       devnode "^hda"
       wwid    3600508b1001cfeb7dd7bd98c96cd5044
}
multipaths {

   multipath {
      wwid                          3600507640081011a000000000000001e
      alias                         DATA01
      path_grouping_policy          failover
      path_selector                 "round-robin 0"
      failback                      manual
      rr_weight                     priorities
      no_path_retry                 0
   }
   ...
}

devices {

   device {
      vendor                        "IBM"
      product                       "V7000"
      path_grouping_policy          failover
      path_checker                  tur
      rr_weight                     priorities
   }
}

Is this a bug, or is my configuration wrong?

mwilck commented 3 years ago

This is not an error in multipath-tools; it looks like a kernel issue. dm-multipath should fail over to another path group (PG), but instead it passes the error up to the filesystem, which shouldn't happen. What kernel are you running?

Could you enable SCSI logging before pulling the cable please?

# sysctl -w dev.scsi.logging_level=8192

Also, please run multipathd with "-v3" (or set verbosity 3 in multipath.conf) and provide the output.

xosevp commented 3 years ago

It is a Fedora-based distro (RHEL, CentOS, Oracle, ...). And your config is totally wrong for this IBM array.

Do:

# save old configs
mv /etc/multipath.conf /etc/_multipath.conf-$(date +%s)
cp -a /etc/multipath/wwids /etc/multipath/_wwids-$(date +%s)
# reconfig mp
mpathconf --enable --user_friendly_names n
multipath -W
systemctl enable multipathd.service

If IBM/2145 is NOT present in the default config (check with # multipath -t), you must add this to /etc/multipath.conf:

devices {
        device {
                vendor "IBM"
                product "^2145"
                path_grouping_policy "group_by_prio"
                prio "alua"
                failback "immediate"
                no_path_retry "queue"
        }
}
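The check for a built-in IBM/2145 entry can be scripted. The following is only a minimal sketch: `has_builtin_device` is a hypothetical helper name, and it assumes the dump produced by `multipath -t` uses the same `device { vendor "..." product "..." }` layout as the snippet above, with one keyword per line.

```shell
#!/bin/sh
# has_builtin_device VENDOR PRODUCT   (hypothetical helper, not part of multipath-tools)
# Reads a multipath configuration dump on stdin and succeeds (exit 0) if
# some device { } section contains both the quoted vendor and product.
has_builtin_device() {
    awk -v vendor="$1" -v product="$2" '
        # Start of a device section: reset the per-section flags.
        $1 == "device" && $2 == "{" { in_dev = 1; v = 0; p = 0; next }
        in_dev && $1 == "vendor"  && $2 == "\"" vendor "\""  { v = 1 }
        in_dev && $1 == "product" && $2 == "\"" product "\"" { p = 1 }
        # End of a device section: remember whether both fields matched.
        in_dev && $1 == "}" { if (v && p) found = 1; in_dev = 0 }
        END { exit(found ? 0 : 1) }
    '
}

# Intended usage on a real host (not executed here):
#   multipath -t | has_builtin_device IBM '^2145' && echo "built-in entry present"
```

The match is on the literal product string `^2145` as it appears in the dump, not on the regular expression it represents.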

And then:

# recreate initrd, and reboot the system
dracut -f
init 6

jirib commented 3 years ago

If IBM/2145 is NOT present in the default config (check with # multipath -t), you must add this to /etc/multipath.conf:

devices {
        device {
                vendor "IBM"
                product "^2145"
                path_grouping_policy "group_by_prio"
                prio "alua"
                failback "immediate"
                no_path_retry "queue"
        }
}

^^ that seems wrong, see https://www.ibm.com/docs/en/flashsystem-v9000/8.2.x?topic=system-settings-linux-hosts and more importantly https://www.ibm.com/docs/en/flashsystem-v9000/8.2.x?topic=htrlos-attachment-requirements-hosts-that-are-running-linux-operating-system.

mwilck commented 3 years ago

You have provided the same link twice, and the link is unrelated to multipath.conf settings.

I can confirm that @xosevp's sample matches the default config built into multipath-tools for IBM 2145. @jinleiw's setting for "V7000" is ineffective. "V7000" may be the marketing name of your device, but what matters here is the product name that the device reports to the host in its SCSI INQUIRY response, which is 2145. It's a very unfortunate habit of hardware vendors to sell products under names that are totally unrelated to the actual technical product name.
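One way to see the vendor and product strings the kernel recorded for a path is to read the `vendor` and `model` attributes of the SCSI device in sysfs. This is a minimal sketch under assumptions: `scsi_identity` is a hypothetical helper name, `sdb` is just one of the path devices from the report above, and the optional second argument only exists so the function can be pointed at a different sysfs root.

```shell
#!/bin/sh
# scsi_identity DEV [SYSFS_ROOT]   (hypothetical helper)
# Prints "vendor/model" for a SCSI disk as exposed under
# SYSFS_ROOT/block/DEV/device (SYSFS_ROOT defaults to /sys).
scsi_identity() {
    dev=$1
    root=${2:-/sys}
    vendor=$(cat "$root/block/$dev/device/vendor") || return 1
    model=$(cat "$root/block/$dev/device/model") || return 1
    # The attributes are padded with trailing spaces; trim them.
    vendor=$(printf '%s' "$vendor" | sed 's/[[:space:]]*$//')
    model=$(printf '%s' "$model" | sed 's/[[:space:]]*$//')
    printf '%s/%s\n' "$vendor" "$model"
}

# Intended usage on a real host (not executed here):
#   scsi_identity sdb
```

For the array in this report, the topology output already shows the same pair (`IBM     ,2145`), so a `product "V7000"` entry would not be expected to match.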

mwilck commented 3 years ago

If IBM/2145 is NOT present in the default config:

That's quite unlikely, as the configuration for IBM 2145 has been unchanged in our code since 2016 (0.6.4).

It is a Fedora based distro

@jinleiw / @jirib, in general, if you have issues with the multipath versions shipped with your distribution, please use your distribution's support facilities rather than this upstream issue tracker.

jirib commented 3 years ago

I only wanted to point out that IBM recommends different values than the defaults in multipath-tools. (I updated the first link.) BTW, I don't think it makes sense to update multipath.conf for every HW vendor's recommendations. People using such HW should first read the HW vendor's documentation, not just depend on mostly sane defaults.

mwilck commented 3 years ago

Well, except for no_path_retry, the settings are the same (some are missing above, but the defaults match IBM's recommendations). no_path_retry is a setting that mostly depends on data center preferences. Yet it's interesting that they use 5 for every distro.

@xosevp, would you say that we should update our defaults?

xosevp commented 3 years ago

BTW, I don't think it makes sense to update multipath.conf for every HW vendor's recommendations.

At the very least, it's needed for installation on multipath root disks and on system rescue DVD-ROMs/ISOs. And array docs very often disappear from the net.

People using such HW should first read the HW vendor's documentation, not just depend on mostly sane defaults.

A lot of vendors' docs are out of date and sometimes totally wrong. The defaults in multipath-tools are based on vendors' recommendations, and provide a stable and performant setup.

mwilck commented 3 years ago

A lot of vendors' docs are out of date and sometimes totally wrong. The defaults in multipath-tools are based on vendors' recommendations, and provide a stable and performant setup.

I agree. However, seeing a numeric value for no_path_retry, I wonder if the vendor has spent some extra effort to determine a value that matches their hardware. It could be something like the typical time required for a storage node reboot (though even 25s seems a little low for that).
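The 25s figure can be reproduced with simple arithmetic. This is a sketch under an assumption not stated in the thread: that each retry counted by a numeric no_path_retry corresponds to one path checker interval, and that polling_interval is 5 seconds. `queue_window_seconds` is a hypothetical helper name.

```shell
#!/bin/sh
# queue_window_seconds NO_PATH_RETRY POLLING_INTERVAL   (hypothetical helper)
# Approximate time, in seconds, that I/O stays queued after all paths fail,
# assuming one retry per polling interval.
queue_window_seconds() {
    echo $(( $1 * $2 ))
}

# no_path_retry 5 with an assumed polling_interval of 5 seconds
queue_window_seconds 5 5   # prints 25
```

With the OP's `no_path_retry 0`, the same arithmetic gives a window of zero seconds, that is, no queueing once every path is down.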

mwilck commented 3 years ago

The OP's problem is a kernel issue; the rest is a discussion about HW defaults, for which no ideal solution exists (IBM 2145 covers a wide range of devices with likely different characteristics). I suggest closing this issue.

mwilck commented 3 years ago

Closing.