ewwhite / zfs-ha

ZFS High-Availability NAS

Issues with unfencing #33

Open ACiDGRiM opened 4 years ago

ACiDGRiM commented 4 years ago

I'm having trouble getting the fencing component working, and frequently neither system can mount the array.

I frequently end up with either no reservation keys or both hosts' reservation keys registered on the storage array. I have followed the wiki nearly exactly, other than using a single dual-port HBA instead of two single-port HBAs.

I'm using a Dell MD1200 in my case.
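
For reference, this is roughly how I've been checking the key state on the disks (a quick sketch; it assumes all of the array's multipath devices share the 35000c WWN prefix, and sg_persist's read-keys query is a read-only PR IN command):

for d in /dev/mapper/35000c*; do echo "== $d"; sg_persist -n -i -k -d "$d"; done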

ewwhite commented 4 years ago

Can you post log snippets?

ACiDGRiM commented 4 years ago

I've isolated it to an issue with SCSI reservations and write access to the array. If I import the zpool with readonly=on, it mounts and there are no disk failures; otherwise I get the kernel errors below:

[ 1436.487415] WARNING: MMP writes to pool 'zfs_storage-array01' have not succeeded in over 167981 ms; suspending pool. Hrtime 1436487432373
[ 1436.487418] WARNING: Pool 'zfs_storage-array01' has encountered an uncorrectable I/O failure and has been suspended.

[ 1475.232670] INFO: task l2arc_feed:3466 blocked for more than 120 seconds.
[ 1475.232720]       Tainted: P          IOE    --------- -  - 4.18.0-193.6.3.el8_2.centos.plus.x86_64 #1
[ 1475.232721] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1475.232723] l2arc_feed      D    0  3466      2 0x80004000
[ 1475.232726] Call Trace:
[ 1475.232740]  ? __schedule+0x24f/0x650
[ 1475.232745]  schedule+0x2f/0xa0
[ 1475.232748]  schedule_preempt_disabled+0xa/0x10
[ 1475.232750]  __mutex_lock.isra.5+0x2d0/0x4a0
[ 1475.232768]  ? __cv_timedwait_common+0xec/0x160 [spl]
[ 1475.232885]  l2arc_feed_thread+0xdb/0x420 [zfs]
[ 1475.232957]  ? l2arc_evict+0x2a0/0x2a0 [zfs]
[ 1475.232965]  ? __thread_exit+0x20/0x20 [spl]
[ 1475.232975]  thread_generic_wrapper+0x6f/0x80 [spl]
[ 1475.232980]  kthread+0x112/0x130
[ 1475.232983]  ? kthread_flush_work_fn+0x10/0x10
[ 1475.232985]  ret_from_fork+0x35/0x40

Manually testing fence_scsi with only one path connected to the backplane:

fence_scsi -d /dev/mapper/35000c500b6f6b607,/dev/mapper/35000c500b6f71b8b,/dev/mapper/35000c500b6f71c27,/dev/mapper/35000c500b6f71c37,/dev/mapper/35000c500b6f7df7b,/dev/mapper/35000c500b6f80333,/dev/mapper/35000c500b6f8070f,/dev/mapper/35000c500b6f8072b,/dev/mapper/35000cca0131663d4,/dev/mapper/35000cca01317b5d8,/dev/mapper/35000cca01317bb94,/dev/mapper/35000cca01317beb8 -o on -k 3bcc0000 -v

/usr/bin/sg_persist -n -i -k -d /dev/mapper/35000c500b6f6b607
  PR generation=0x5, 1 registered reservation key follows:
    0x3bcc0000
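
The output above only shows registrations; to check whether an actual reservation is held (also a read-only PR IN query), something like this should work:

/usr/bin/sg_persist -n -i -r -d /dev/mapper/35000c500b6f6b607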

I've power-cycled the backplane and the reservation key is cleared. Without writing to the disks, do you know of a way to test whether this issue is caused by write reservations on the backplane or by a ZFS filesystem issue?

ewwhite commented 4 years ago

What hardware are you using here? What's connected to what?

ewwhite commented 4 years ago

I see MMP enabled. Disable that zpool option. It's likely causing the issue here.
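
Once the pool can be imported writable, it should be as simple as something along these lines (a sketch, using the pool name from your log):

zpool set multihost=off zfs_storage-array01
zpool get multihost zfs_storage-array01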

ACiDGRiM commented 4 years ago

I've had multihost enabled on the zpool since it was created and haven't had this issue before. However, I can't disable MMP because the ZFS module locks up, even when updating parameters in /sys; only if I import with readonly=on can I access the data.
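
For reference, this is roughly how I'm reaching the data, along with the MMP-related module parameters I've been looking at under /sys (a sketch; parameter names are the ZFS 0.8.x ones):

zpool import -o readonly=on zfs_storage-array01
cat /sys/module/zfs/parameters/zfs_multihost_interval
cat /sys/module/zfs/parameters/zfs_multihost_fail_intervals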

Right now there is just one host connected to the enclosure with a single path, and I have this issue on both hosts, even if one is shut off.

The full system is below:

2x Dell R610
    9211-8e HBA in IT mode
    Intel x520-2x 10G uplink
1x Dell MD1200
    8x Seagate ST10000NM0096 (storage)
    4x Hitachi HUSRL402 CLAR200 (ZIL and L2ARC)
CentOS 8.2 Plus kernel (for internal RAID compatibility)
    4.18.0-193.6.3.el8_2.centos.plus.x86_64
    ZFS 0.8.4

I verified all devices have the latest firmware/BIOS.

Host 1 is connected to port 1 on each enclosure controller, and host 2 is connected to port 2 on each enclosure controller. I've tried connecting in an X (host 1 connects to port 1 on controller 1 and port 2 on controller 2), but both hosts hang on boot when trying to initialize the disks.
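
When both ports are cabled, something like this should show how the paths actually come up on each host (just a sketch; it lists each multipath device with its active/failed paths):

multipath -ll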