openzfsonwindows / openzfs

Pool and hostid mismatch and wrong partition shown #91


Mirkic7 commented 2 years ago

I'm using OpenZFSOnWindows-debug-2.1.99-993-g5057d967c-dirty.exe on Windows 10 21H2.

After installing, I can create a pool using one partition for ZFS and another partition on an SSD for L2ARC. This works fine. However, upon reboot I get the following error:

C:\WINDOWS\system32>zpool status
  pool: tank
 state: ONLINE
status: Mismatch between pool hostid and system hostid on imported pool.
        This pool was previously imported into a system with a different hostid,
        and then was verbatim imported into this system.
action: Export this pool on all systems on which it is imported.
        Then import it to correct the mismatch.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-EY
config:

        NAME                   STATE     READ WRITE CKSUM
        tank                   ONLINE       0     0     0
          Harddisk1Partition0  ONLINE       0     0     0
        cache
          Harddisk0Partition0  ONLINE       0     0     0

errors: No known data errors

C:\WINDOWS\system32>

The pool can be imported (and is imported) by using 'zpool import -f tank'. This is on the local system, and I had the same experience when I ran the same test in a Windows 10 VM.

The second issue is that the L2ARC device is shown as 'Harddisk0Partition0', which is incorrect: it is actually partition 3 on disk 0. It was shown correctly at pool creation, but after a reboot it is displayed wrongly.

Here's the log of importing the pool:

C:\WINDOWS\system32>zpool import tank
path '\\?\scsi#disk&ven_st1000lm&prod_048-2e7172#4&1354137a&0&000100#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
 and '\\?\PhysicalDrive1'
read partitions ok 2
    gpt 0: type 8bbbd4a0 off 0x100000 len 0x6225000000
    gpt 1: type 8bbbd4a0 off 0x6225100000 len 0x86bbb00000
asking libefi to read label
EFI read OK, max partitions 128
    part 0:  offset 800:    len 31128000:    tag: 11    name: 'Basic data partition'
    part 1:  offset 31128800:    len 435dd800:    tag: 11    name: 'Basic data partition'
path '\\?\usbstor#disk&ven_asmt&prod_2105&rev_0#00000000000000000000&0#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
 and '\\?\PhysicalDrive2'
read partitions ok 4
    mbr 0: type 7 off 0x100000 len 0x7470900000
    mbr 1: type 0 off 0x0 len 0x0
    mbr 2: type 0 off 0x0 len 0x0
    mbr 3: type 0 off 0x0 len 0x0
asking libefi to read label
path '\\?\scsi#disk&ven_nvme&prod_intel_ssdpeknw51#5&9d49a0f&0&000000#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
 and '\\?\PhysicalDrive0'
read partitions ok 5
    gpt 0: type 8bbbd4a0 off 0x100000 len 0x30700000
    gpt 1: type 8bbbd4a0 off 0x30800000 len 0x1000000
    gpt 2: type 8bbbd4a0 off 0x31800000 len 0x676ec00000
    gpt 3: type 8bbbd4a0 off 0x67a0400000 len 0xf75d00000
    gpt 4: type 8bbbd4a0 off 0x7716200000 len 0x26000000
asking libefi to read label
EFI read OK, max partitions 128
    part 0:  offset 800:    len 183800:    tag: c    name: 'SYSTEM'
    part 1:  offset 184000:    len 8000:    tag: 10    name: 'Micr'
    part 2:  offset 18c000:    len 33b76000:    tag: 11    name: 'Basi'
    part 3:  offset 33d02000:    len 7bae800:    tag: 11    name: 'Basic data partition'
Processing volume '\\?\Volume{17a25502-73e7-4b13-9535-88862e144f70}'
Processing volume '\\?\Volume{73f3ee71-d9f5-47e4-b05f-cd254350d6a4}'
Processing volume '\\?\Volume{3e1f107f-f774-41b3-8dcf-e74be1575b1f}'
Processing volume '\\?\Volume{1819d777-9364-4d78-9dc3-66cdc3e76f77}'
Processing volume '\\?\Volume{f2b1d2f7-8007-4f07-822c-a9873ecd9006}'
Processing volume '\\?\Volume{fdf8a5f7-0000-0000-0000-100000000000}'
Processing volume '\\?\Volume{36f8e7ef-e55e-4a2b-b266-518859531ec2}'
Processing volume '\\?\Volume{88ef013c-c5f7-11ec-9463-34415dff1195}'
working on dev '#1048576#421527552000#\\?\scsi#disk&ven_st1000lm&prod_048-2e7172#4&1354137a&0&000100#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
setting path here '/dev/Harddisk1Partition0'
setting physpath here '#1048576#421527552000#\\?\scsi#disk&ven_st1000lm&prod_048-2e7172#4&1354137a&0&000100#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
working on dev '#445070180352#66401075200#\\?\scsi#disk&ven_nvme&prod_intel_ssdpeknw51#5&9d49a0f&0&000000#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
setting path here '/dev/Harddisk0Partition0'
setting physpath here '#445070180352#66401075200#\\?\scsi#disk&ven_nvme&prod_intel_ssdpeknw51#5&9d49a0f&0&000000#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'

C:\WINDOWS\system32>

And the actual partitions:

DiskNumber DriveLetter VolumeID
---------- ----------- --------
         0            \\?\Volume{36f8e7ef-e55e-4a2b-b266-518859531ec2}\
         0
         0           C \\?\Volume{17a25502-73e7-4b13-9535-88862e144f70}\
         0            \\?\Volume{73f3ee71-d9f5-47e4-b05f-cd254350d6a4}\
         0            \\?\Volume{3e1f107f-f774-41b3-8dcf-e74be1575b1f}\
         1            \\?\Volume{1819d777-9364-4d78-9dc3-66cdc3e76f77}\
         1           D \\?\Volume{f2b1d2f7-8007-4f07-822c-a9873ecd9006}\
         2           U \\?\Volume{fdf8a5f7-0000-0000-0000-100000000000}\
lundman commented 2 years ago

It is concerning that it is showing Harddisk0Partition0, as it could potentially overwrite whatever happens to be on there. Be careful. I do wonder what gets written to the label (usually it can be inspected with zdb -l) right after you create the pool and export it. If that is correct, then it is an import issue.

The hostid is surprising as well, but could be related to the wrong L2ARC disk. It is supposed to make up a hostid the very first time you run ZFS and store it in the Registry. After that, it uses that Registry value.
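
(For illustration, the first-run behavior described above amounts to something like the following user-mode sketch; the registry key and value names are assumptions, and the real driver does this with kernel registry APIs rather than Win32 calls.)

    /*
     * Hedged sketch: make up a hostid once, persist it in the Registry,
     * and reuse it on later boots. Key/value names are assumed. Writing
     * under HKLM requires administrator rights; link against advapi32.
     */
    #include <windows.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <time.h>

    #define ZFS_KEY L"SYSTEM\\CurrentControlSet\\Services\\OpenZFS"
    #define ZFS_VAL L"hostid"

    static uint32_t
    get_or_create_hostid(void)
    {
        HKEY key;
        DWORD hostid = 0, len = sizeof (hostid), type;

        if (RegCreateKeyExW(HKEY_LOCAL_MACHINE, ZFS_KEY, 0, NULL, 0,
            KEY_READ | KEY_WRITE, NULL, &key, NULL) != ERROR_SUCCESS)
            return (0);

        if (RegQueryValueExW(key, ZFS_VAL, NULL, &type,
            (LPBYTE)&hostid, &len) != ERROR_SUCCESS || type != REG_DWORD) {
            /* First run: invent a nonzero hostid and store it. */
            srand((unsigned int)time(NULL));
            do {
                hostid = ((uint32_t)rand() << 16) ^ (uint32_t)rand();
            } while (hostid == 0);
            RegSetValueExW(key, ZFS_VAL, 0, REG_DWORD,
                (const BYTE *)&hostid, sizeof (hostid));
        }
        RegCloseKey(key);
        return (hostid);
    }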

Mirkic7 commented 2 years ago

Yes, that concern has made me remove it from this computer for now, so I will run more tests in a VM, as it seems the hostid mismatch might be tied to the L2ARC (which survives reboots fine, so persistence works). On a good note, it did show the correct size for the L2ARC when I checked with 'zpool iostat -v 1'.

andrewc12 commented 2 years ago

I get that import problem as well.

PS C:\Users\andre> zpool import
path '\\?\scsi#disk&ven_samsung&prod_mz7ln512hchp-000#4&558e46b&0&000000#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
 and '\\?\PhysicalDrive0'
read partitions ok 6
    gpt 0: type b5b9d820 off 0x100000 len 0x21100000
    gpt 1: type b5b9d820 off 0x21200000 len 0x6400000
    gpt 2: type b5b9d820 off 0x27600000 len 0x1000000
    gpt 3: type b5b9d820 off 0x28600000 len 0x1ffff00000
    gpt 4: type b5b9d820 off 0x2028600000 len 0x4346500000
    gpt 5: type b5b9d820 off 0x636ec00000 len 0x13cd500000
asking libefi to read label
EFI read OK, max partitions 128
    part 1:  offset 109000:    len 32000:    tag: c    name: 'EFI system partition'
    part 2:  offset 13b000:    len 8000:    tag: 10    name: 'Microsoft reserved partition'
    part 3:  offset 143000:    len ffff800:    tag: 11    name: 'Basic data partition'
    part 4:  offset 10143000:    len 21a32800:    tag: 11    name: 'Basic data partition'
    part 5:  offset 31b76000:    len 9e6a800:    tag: 11    name: 'Basic data partition'
path '\\?\usbstor#disk&ven_generic&prod_storage_device&rev_0828#000000000828&0#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
 and '\\?\PhysicalDrive1'
read partitions ok 1
    mbr 0: type 0 off 0x0 len 0x0
asking libefi to read label
Processing volume '\\?\Volume{3d9670c7-482b-47ae-b690-f6217b40853b}'
Processing volume '\\?\Volume{a13fae2a-fee7-4eaf-ab81-d629f7340a72}'
Processing volume '\\?\Volume{37703b23-d9a9-440f-8210-354a429cac56}'
Processing volume '\\?\Volume{0408e664-fccb-4ede-a9a6-8d0aa9fdeaf0}'
Processing volume '\\?\Volume{0cd1886e-7fdf-48cf-970b-b9704df68c77}'
Processing volume '\\?\Volume{e991afdd-80bc-11ec-98c8-8c882b1000e2}'
Processing volume '\\?\Volume{d05066ee-4d35-11ec-98a7-806e6f6e6963}'
working on dev '\\?\Volume{a13fae2a-fee7-4eaf-ab81-d629f7340a72}'
setting path here '/dev/Harddisk0Partition4'
setting physpath here '\\?\Volume{a13fae2a-fee7-4eaf-ab81-d629f7340a72}'
   pool: tank5
     id: 7019695533760501658
  state: ONLINE
status: The pool was last accessed by another system.
 action: The pool can be imported using its name or numeric identifier and
        the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-EY
 config:

        tank5                  ONLINE
          Harddisk0Partition4  ONLINE
PS C:\Users\andre>
yurikoles commented 2 years ago

No offense, but according to GitHub netiquette there is no need to create additional noise for subscribers by posting "me too" posts. A better way is to just react to the post with an emoji: you automatically get a subscription, other subscribers aren't notified, and the maintainers or volunteers still get a sense of the affected user count, i.e. the importance of the problem.

andrewc12 commented 2 years ago

Yes, that is a great rule. Thankfully it doesn't apply here, because my post was demonstrating that the hostid mismatch is not related to the L2ARC disk. Quoting lundman above:

    The hostid is surprising as well, but could be related to the wrong L2ARC disk. It is supposed to make up a hostid the very first time you run ZFS and store it in the Registry. After that, it uses that Registry value.

Looking back, I didn't explicitly mention that in my post.

lundman commented 2 years ago

OK so hmm when I try, it sets the hostid ok:

SPL: created hostid 0x270d2b22

and after a reboot:

spl_check_assign_types: kstat 'hostid': 0x0 -> 0x270d2b22

and I confirmed the same in RegEdit. Import was no issue.

I wonder why it fails for you. Have you guys checked what the value is in the Registry?

lundman commented 2 years ago

So we open the device, and call:

    /* Ask the storage stack which disk/partition the handle refers to. */
    BOOL ret = DeviceIoControl(hDevice,
        IOCTL_STORAGE_GET_DEVICE_NUMBER, NULL, 0,
        (LPVOID)device_number, (DWORD)sizeof (*device_number),
        (LPDWORD)&returned, (LPOVERLAPPED)NULL);

The harddisk number is correct, but the partition number is always 0. I guess I am supposed to do something else to get the partition number...
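
(A hedged sketch of one possible alternative, not necessarily the project's eventual fix: IOCTL_STORAGE_GET_DEVICE_NUMBER reports PartitionNumber 0 for a handle on the raw disk, but a handle on the volume device can be asked for its partition info directly. The helper name below is illustrative.)

    #include <windows.h>
    #include <winioctl.h>

    /*
     * Illustrative helper: query the 1-based partition number for a
     * volume handle via IOCTL_DISK_GET_PARTITION_INFO_EX.
     */
    static BOOL
    query_partition_number(HANDLE hVolume, DWORD *partnum)
    {
        PARTITION_INFORMATION_EX pinfo;
        DWORD returned = 0;

        if (!DeviceIoControl(hVolume, IOCTL_DISK_GET_PARTITION_INFO_EX,
            NULL, 0, &pinfo, sizeof (pinfo), &returned, NULL))
            return (FALSE);

        *partnum = pinfo.PartitionNumber;
        return (TRUE);
    }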

demonfoo commented 2 years ago

I'm getting the same complaint with OpenZFSOnWindows-debug-2.1.99-1005-g5d39a0e3d-dirty.exe running on Windows 10 21H2 (build 19044.1766) in VirtualBox, but with a pool I'd just created on three 100 GB virtual disks.

C:\WINDOWS\system32>zpool.exe status
  pool: bucket
 state: ONLINE
status: Mismatch between pool hostid and system hostid on imported pool.
        This pool was previously imported into a system with a different hostid,
        and then was verbatim imported into this system.
action: Export this pool on all systems on which it is imported.
        Then import it to correct the mismatch.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-EY
config:

        NAME                STATE     READ WRITE CKSUM
        bucket              ONLINE       0     0     0
          raidz1-0          ONLINE       0     0     0
            physicaldrive1  ONLINE       0     0     0
            physicaldrive2  ONLINE       0     0     0
            physicaldrive3  ONLINE       0     0     0

errors: No known data errors

I did a zpool export and a zpool import, and the warning persists. It otherwise seems to function as expected. It hasn't been tested significantly, though; no data has yet been written to it.

Late addition: After the VM restarted automatically for updates, the pool did not auto-import, and when I tried to do a zpool import on it, I got:

C:\WINDOWS\system32>zpool import bucket
path '\\?\scsi#disk&ven_vbox&prod_harddisk#4&26cc4aa&0&000100#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
 and '\\?\PhysicalDrive1'
read partitions ok 2
    gpt 0: type ae8fdba0 off 0x100000 len 0x18ff600000
    gpt 1: type ae8fdba0 off 0x18ff700000 len 0x800000
asking libefi to read label
EFI read OK, max partitions 9
    part 0:  offset 800:    len c7fb000:    tag: 4    name: 'zfs-000018d3000011c9'
    part 8:  offset c7fb800:    len 4000:    tag: b    name: ''
path '\\?\scsi#disk&ven_vbox&prod_harddisk#4&1c07620b&0&000000#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
 and '\\?\PhysicalDrive3'
read partitions ok 4
    mbr 0: type 7 off 0x100000 len 0x22500000
    mbr 1: type 7 off 0x22600000 len 0x18bcb2d000
    mbr 2: type 27 off 0x18df200000 len 0x20c00000
    mbr 3: type 0 off 0x0 len 0x0
asking libefi to read label
path '\\?\scsi#disk&ven_vbox&prod_harddisk#4&26cc4aa&0&000000#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
 and '\\?\PhysicalDrive0'
read partitions ok 2
    gpt 0: type ae8fdba0 off 0x100000 len 0x18ff600000
    gpt 1: type ae8fdba0 off 0x18ff700000 len 0x800000
asking libefi to read label
EFI read OK, max partitions 9
    part 0:  offset 800:    len c7fb000:    tag: 4    name: 'zfs-000006e800003652'
    part 8:  offset c7fb800:    len 4000:    tag: b    name: ''
path '\\?\scsi#disk&ven_vbox&prod_harddisk#4&26cc4aa&0&000200#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
 and '\\?\PhysicalDrive2'
read partitions ok 2
    gpt 0: type ae8fdba0 off 0x100000 len 0x18ff600000
    gpt 1: type ae8fdba0 off 0x18ff700000 len 0x800000
asking libefi to read label
EFI read OK, max partitions 9
    part 0:  offset 800:    len c7fb000:    tag: 4    name: 'zfs-00002d6a00004f2a'
    part 8:  offset c7fb800:    len 4000:    tag: b    name: ''
Processing volume '\\?\Volume{06a92427-0000-0000-0000-100000000000}'
Processing volume '\\?\Volume{06a92427-0000-0000-0000-602200000000}'
Processing volume '\\?\Volume{06a92427-0000-0000-0000-20df18000000}'
Processing volume '\\?\Volume{a8b35cee-3098-11e9-8e87-806e6f6e6963}'
working on dev '#1048576#107363696640#\\?\scsi#disk&ven_vbox&prod_harddisk#4&26cc4aa&0&000000#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
setting path here '/dev/physicaldrive0'
setting physpath here '#1048576#107363696640#\\?\scsi#disk&ven_vbox&prod_harddisk#4&26cc4aa&0&000000#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
working on dev '#1048576#107363696640#\\?\scsi#disk&ven_vbox&prod_harddisk#4&26cc4aa&0&000100#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
setting path here '/dev/physicaldrive1'
setting physpath here '#1048576#107363696640#\\?\scsi#disk&ven_vbox&prod_harddisk#4&26cc4aa&0&000100#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
working on dev '#1048576#107363696640#\\?\scsi#disk&ven_vbox&prod_harddisk#4&26cc4aa&0&000200#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
setting path here '/dev/physicaldrive2'
setting physpath here '#1048576#107363696640#\\?\scsi#disk&ven_vbox&prod_harddisk#4&26cc4aa&0&000200#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
cannot import 'bucket': pool was previously in use from another system.
Last accessed by Windows (hostid=1) at Fri Jul  1 10:42:36 2022
The pool can be imported, use 'zpool import -f' to import the pool.

hostid=1? That seems... odd?

datacore-rm commented 2 years ago

After reboot, the 'spl_hostid' variable is assigned the default value 1. Is the later setting of the correct hostid value in windows_kstat_update() excluded from compilation (via #if 0) in commit https://github.com/openzfsonwindows/openzfs/commit/6b6d9ad00550a5c4dc6b243e8016897fc4a27605#diff-64b824b5581a686ef8512c31ab08f5f159633c6f47ee7c9fcf79c691f9cc806c ?

    windows_kstat_update(rw = KSTAT_WRITE)
    {
    #if 0
            ...
            spl_hostid = ks->win32_hw_hostid.value.ui32;
            ...
    #endif
    }

lundman commented 2 years ago

It is supposed to be set by the new tunables framework, and have a manual TUNABLE define for it... lemme check

lundman commented 2 years ago

Ah yeah, that is missing: https://github.com/openzfsonosx/openzfs/blob/4412af1dd1b3ebd2c1abade34c75b9ceeb4ce424/module/os/linux/spl/spl-generic.c#L58-L59 (linking the macOS repo, as that repo is searchable).
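
(For context, the upstream declaration at that link looks roughly like this; paraphrased from OpenZFS, so the exact lines may differ slightly.)

    /* module/os/linux/spl/spl-generic.c (upstream OpenZFS, paraphrased) */
    unsigned long spl_hostid = 0;
    EXPORT_SYMBOL(spl_hostid);

    /* CSTYLED */
    module_param(spl_hostid, ulong, 0644);
    MODULE_PARM_DESC(spl_hostid, "The system hostid.");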

We need to add that define, and a few more actually, similar to https://github.com/openzfsonosx/openzfs/blob/fb666a6e3870fadfbd8b3263c853854415a4411a/module/os/macos/zfs/sysctl_os.c#L840-L867

I'll get onto fixing that

datacore-rm commented 2 years ago

Thank you...

lundman commented 2 years ago

The hostid change has been done and cleaned up; I just need to change over the zfs_vdev_protection_filter and version settings as well, as good examples.

lundman commented 2 years ago

OK, that took a little longer than expected. I converted the three tunables we have to the new style, as examples of how to do them:

https://github.com/openzfsonwindows/openzfs/commit/f21d5c01870868840da47445d2334105762b6388#diff-898f0b1037d4d227b446c1552d4310ca858ebe06d2d890edfb7097bf72a2664cR74-R85

I've tested that when the filter is set in the Registry, it is seen in vdev_disk, so I think it should be OK.
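
(For anyone following along, the new-style tunable pattern generally looks something like this hedged sketch, modeled on upstream ZFS_MODULE_PARAM usage; the exact buffer size and description in the Windows commit may differ.)

    /*
     * Sketch of an OpenZFS-style tunable declaration. The resulting
     * parameter name is the concatenation zfs_vdev_ + protection_filter.
     * Buffer size and description text here are illustrative.
     */
    static char zfs_vdev_protection_filter[64] = "";

    ZFS_MODULE_PARAM(zfs_vdev, zfs_vdev_, protection_filter, STRING, ZMOD_RW,
        "Device name filter consulted by vdev_disk before opening a disk");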

datacore-rm commented 2 years ago

The hostid issue seems to be resolved. Thanks.

There is a mismatch in datatype between the 'zfs_vdev_protection_filter' definition and declaration. If I assign some other default value in the definition, it doesn't seem to update the correct value in the Registry.

lundman commented 2 years ago

Ah if you set it in the code, make the type STATIC?

lundman commented 2 years ago

Ah, it is static, but I left the L"\0" in there; it is supposed to be ASCII now.
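
(Illustrative sketch of that fix, assuming the declaration looks along these lines; the buffer size is a guess.)

    /*
     * The leftover bug was a wide-character literal initializing a
     * narrow char array:
     *
     *     static char zfs_vdev_protection_filter[64] = L"\0";
     *
     * The initializer is supposed to be a plain ASCII literal now:
     */
    static char zfs_vdev_protection_filter[64] = "\0";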