openzfsonwindows / openzfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

[2.2.6-rc1 bug] Existing datasets "Wrong Parameter" and "Permission Denied" error #397

Open thesn10 opened 1 week ago

thesn10 commented 1 week ago

I have existing datasets, created in 2.2.3-rc6. After upgrading to 2.2.6-rc1 I cannot access them anymore.

Double-clicking a dataset folder gives a permission error (which cannot be fixed by claiming permissions):

image

Translated: image

Trying to cd into dataset gives "Invalid parameter" error:

Z:\>cd dataset
Invalid Parameter.

In PowerShell, cd works, but dir then shows the error:

PS Z:\> cd dataset
PS Z:\dataset> dir
Get-ChildItem: Invalid Parameter. : 'Z:\dataset'

When using FileSpy on the Explorer double-click, there are multiple STATUS_INVALID_PARAMETER errors: image

When using FileSpy on cd, it seems to fail at the first STATUS_REPARSE: image

Also, zfs mount shows only one of my two datasets. I have a dataset named "dataset" and one named "userfiles". The latter does not appear at all, and the former is shown without the drive letter:

> zfs mount

tank                            Z:/
tank/dataset                    tank/dataset

zpool import tank output:

> zpool import tank
path '\\?\scsi#disk&ven_&prod_st16000nm000h-3k#5&1bc941f&0&000000#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
 and '\\?\PhysicalDrive0'
read partitions ok 2
    gpt 0: type c4ccfd510 off 0x100000 len 0xe8d7f600000
    gpt 1: type c4ccfd510 off 0xe8d7f700000 len 0x800000
asking libefi to read label
EFI read OK, max partitions 9
    part 0:  offset 800:    len 746bfb000:    tag: 4    name: 'zfs-00003c1f00003374'
    part 8:  offset 746bfb800:    len 4000:    tag: b    name: ''
working on dev '#1048576#16000890175488#\\?\scsi#disk&ven_&prod_st16000nm000h-3k#5&1bc941f&0&000000#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
setting path here '/dev/physicaldrive0'
setting physpath here '#1048576#16000890175488#\\?\scsi#disk&ven_&prod_st16000nm000h-3k#5&1bc941f&0&000000#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}'
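
For anyone reading the log above: the physpath string appears to encode the partition as '#offset#length#device path', with both numbers in bytes. That layout is only inferred from this log, not from the driver source, and the helper name is hypothetical. A minimal Python sketch that splits it and cross-checks it against the gpt and EFI lines, assuming 512-byte sectors:

```python
def parse_physpath(physpath):
    """Split '#<offset>#<length>#<device path>' into its three parts.

    The format is an assumption based on the zpool import log in this issue.
    The device path itself contains '#', so only the first three are split on.
    """
    _, offset, length, device = physpath.split("#", 3)
    return int(offset), int(length), device


def sectors_to_bytes(sectors, sector_size=512):
    # The EFI lines print hex sector counts; 512-byte sectors are assumed.
    return sectors * sector_size
```

With the values from the log, offset 1048576 equals 0x100000 and length 16000890175488 equals 0xe8d7f600000, which matches the "gpt 0" line, and EFI part 0 (offset 0x800 sectors, len 0x746bfb000 sectors) gives the same byte values.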

thesn10 commented 1 week ago

Also, unmount failed, saying "tank is busy".

When I shut down my PC, I got a BSOD saying PNP_DETECTED_FATAL_ERROR.

lundman commented 1 week ago

OK, let's sum up: you have a dataset volume mount reparse point, which gives Security problems when you try to access it. If you were to create a new pool (a temporary, empty file-backed pool is fine), does it still happen? That would tell us whether the problem is the Security stored on your pool, or whether we always generate the wrong Security.

  1. The STATUS_INVALID is for method 37 - something we haven't implemented, and is most likely not the issue. Can you confirm with mountvol that the OS has everything mounted that you expect?

  2. "Tank is busy" is a known issue; you can sometimes get around it by unmounting manually, one by one, starting from the deepest dataset. It seems to be an issue with the "collected everything mounted" code.
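
The manual workaround in point 2 could be scripted roughly like this. It is a minimal Python sketch, not part of OpenZFS: the function names are hypothetical, and it only builds the `zfs unmount` command lines in deepest-first order rather than running them.

```python
def unmount_order(datasets):
    """Sort dataset names so the deepest children come first.

    Depth is taken as the number of '/' separators in the dataset name;
    names at the same depth are ordered alphabetically.
    """
    return sorted(datasets, key=lambda name: (-name.count("/"), name))


def unmount_commands(datasets):
    # One 'zfs unmount <dataset>' command per dataset, children before parents.
    return [["zfs", "unmount", name] for name in unmount_order(datasets)]
```

For the pool in this issue, `unmount_commands(["tank", "tank/dataset", "tank/userfiles"])` lists the two child datasets before `tank` itself.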

thesn10 commented 1 week ago

I got some more info about the BSOD. The PNP_DETECTED_FATAL_ERROR has the following params:

Error code: 0x000000ca 
param1: 0x0000000000000004
param2: 0xffffe70a8f5a0e00
param3: 0x0000000000000000
param4: 0x0000000000000000

According to the error code documentation, param1 = 0x4 means that an enumerator returned a PDO which it had previously deleted using IoDeleteDevice. So somewhere the ZFS driver accesses a deleted PDO.
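
For reference, the param1 lookup can be written down as a small table. This is a Python sketch with the descriptions paraphrased from my reading of the Microsoft bug check 0xCA page, so treat the wording (and any entry other than 0x4, which is the one seen here) as something to verify against the documentation. The function name is hypothetical.

```python
PNP_DETECTED_FATAL_ERROR = 0xCA

# param1 values for bug check 0xCA (paraphrased; verify against the docs).
PNP_FATAL_ERROR_TYPES = {
    0x1: "Duplicate PDO",
    0x2: "Invalid PDO",
    0x3: "Invalid ID",
    0x4: "Invalid enumeration of deleted PDO",
    0x5: "PDO freed while linked in devnode tree",
}


def describe_pnp_bugcheck(param1):
    """Return a one-line description for a 0xCA bug check's first parameter."""
    kind = PNP_FATAL_ERROR_TYPES.get(param1, "unknown error type")
    return (
        f"PNP_DETECTED_FATAL_ERROR (0x{PNP_DETECTED_FATAL_ERROR:08x}), "
        f"type 0x{param1:x}: {kind}"
    )
```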

image

thesn10 commented 1 week ago

Unfortunately I cannot create a new pool to test, as I don't have a spare drive. But I can show you the mountvol output:

mountvol: (Z:\ is the zpool)

    \\?\Volume{b2ebfafb-4cef-47ea-a357-c5ec53d88fcf}\
        E:\

    \\?\Volume{30c96a1e-1837-4960-b213-b0dd68e8def3}\
        C:\

    \\?\Volume{0bbb1c3d-1502-4a81-a103-2c7235749d5f}\
        *** NO MOUNT POINTS ***

    \\?\Volume{7fa68025-d442-43c7-9600-c502e106723c}\
        *** NO MOUNT POINTS ***

    \\?\Volume{ff3047ab-8539-44bd-8256-69dc4192e079}\
        *** NO MOUNT POINTS ***

    \\?\Volume{190625e6-6c32-11ef-8dda-ec086b081ef1}\
        G:\

    \\?\Volume{9b1c0123-30c0-11ef-8d81-ec086b081ef1}\
        Z:\

    \\?\Volume{9b1c056b-30c0-11ef-8d81-ec086b081ef1}\
    *** CANNOT BE PROVISIONED UNTIL A VOLUME MOUNT POINT HAS BEEN CREATED ***

So the Z:\dataset mountpoint is missing, and mountvol shows an error instead.
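
To make this check repeatable, the mountvol listing can be parsed to find volumes that have no mount point. This is a minimal Python sketch with hypothetical function names. It assumes the layout shown above: a volume GUID path line followed by indented mount points or a `***` status line.

```python
import re

# Matches a volume GUID path line such as \\?\Volume{...}\
VOLUME_RE = re.compile(r"^\\\\\?\\Volume\{[0-9a-fA-F-]+\}\\$")


def parse_mountvol(text):
    """Map each volume GUID path to the list of mount points printed under it."""
    volumes = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if VOLUME_RE.match(line):
            current = line
            volumes[current] = []
        elif current is not None and not line.startswith("***"):
            # Lines like "*** NO MOUNT POINTS ***" are status notes, not paths.
            volumes[current].append(line)
    return volumes


def volumes_without_mount_points(volumes):
    # Volumes that mountvol listed with no drive letter or folder mount.
    return [vol for vol, mounts in volumes.items() if not mounts]
```

Feeding it the captured output of `mountvol` would list Volume{9b1c056b-...} here, next to the system partitions that legitimately have no mount points.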

dir shows completely different volume IDs:

PS Z:\> dir

    Directory: Z:\

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
l----          23.06.2024    00:45                dataset -> Volume{1c14792a-a278-3dbe-a5ab-9e78981503d2}
d----          23.06.2024    02:42                folder
l----          22.06.2024    22:58                userfiles -> Volume{5447f2a6-1f62-36a3-9dad-c64a15afa84e}

Get-WmiObject does not show a capacity or filesystem for the ZFS volumes:

Z:\> get-wmiobject Win32_Volume | ft @{Label = 'Mount'; Expression = {$_.Name -replace '^\\.+', ''}}, Capacity, FileSystem, Label, DeviceID

Mount      Capacity FileSystem Label        DeviceID
-----      -------- ---------- -----        --------
E:\   3790935683072 NTFS       4TB          \\?\Volume{b2ebfafb-4cef-47ea-a357-c5ec53d88fcf}\
C:\    499303575552 NTFS                    \\?\Volume{30c96a1e-1837-4960-b213-b0dd68e8def3}\
          628092928 NTFS                    \\?\Volume{0bbb1c3d-1502-4a81-a103-2c7235749d5f}\
          536854528 FAT32                   \\?\Volume{7fa68025-d442-43c7-9600-c502e106723c}\
          153092096 FAT32                   \\?\Volume{ff3047ab-8539-44bd-8256-69dc4192e079}\
G:\     16106127360 FAT32      Google Drive \\?\Volume{190625e6-6c32-11ef-8dda-ec086b081ef1}\
Z:\                                         \\?\Volume{9b1c0123-30c0-11ef-8d81-ec086b081ef1}\
                                            \\?\Volume{9b1c056b-30c0-11ef-8d81-ec086b081ef1}\

guenther-alka commented 1 week ago

Creating a test pool without physical disks

- Open Disk Management (Datenträgerverwaltung on a German Windows) and, from the Action (Aktion) menu, create a virtual hard disk (VHDX) with dynamic size (the real size depends on the content)
- This creates and attaches a virtual hard disk that you can use like a real disk (for a ZFS pool)
- If you use my napp-it cs ZFS web GUI, you can manage Windows (virtual hard disks, Storage Spaces and ZFS) via the web GUI

see also https://github.com/openzfsonwindows/openzfs/discussions

lundman commented 1 week ago

Yeah, so you can just create a VHDX disk; even a 1 GB dynamic one will get you a PHYSICALDRIVEx to create a pool on. This is the Windows way. But you can also make a sparse file:

# fsutil file createnew C:\poolfile.bin 200000000
# zpool.exe create tank \\?\C:\poolfile.bin

but honestly, just make a vhdx.

As for the PNP BSOD at unload, I have come across it myself, so I can dig into it here.

guenther-alka commented 1 week ago

The vhdx method (inherited from Hyper-V) is very flexible and very fast. It is a good idea to place all .vhdx files in the same folder, e.g. c:\vhdx, as you need to find them and attach them as disks via a PowerShell command after a reboot (this can be a cmd file or scheduled task, together with a pool import).
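
As a rough illustration of that post-reboot step, here is a Python sketch that only builds the command lines for every .vhdx file in one folder. It assumes `Mount-DiskImage -ImagePath` from the Windows Storage module is the attach command (`Mount-VHD` from the Hyper-V module would be an alternative). The function name, folder, and pool name are hypothetical.

```python
from pathlib import Path


def attach_commands(folder, pool):
    """Build PowerShell attach commands for each .vhdx, then a pool import.

    Nothing is executed here; the returned strings are meant to be written
    into a script or scheduled task.
    """
    images = sorted(Path(folder).glob("*.vhdx"))
    commands = [f'Mount-DiskImage -ImagePath "{image}"' for image in images]
    commands.append(f"zpool import {pool}")
    return commands
```

For example, `attach_commands(r"c:\vhdx", "tank")` returns one Mount-DiskImage line per image, followed by `zpool import tank`.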

If you SMB share the folder, you can even provide a .vhdx as a disk over LAN. I have added this as a ZFS cluster method in my web-gui for a network mirror or Raid-Z even for Windows 11. With Windows Server methods like SMB direct/RDMA and nics > 10G, such a disk or SMB share over LAN is nearly as fast as a local disk even when using NVMe.

If ZFSonWindows becomes a little more stable, I would say it has the power to be the most attractive ZFS platform, especially due to SMB (Direct) and ACL handling, which is a pain with Linux and Samba and is either not available at all or not as fast.

thesn10 commented 1 week ago

@lundman I have created a fresh pool and dataset and it works fine, no errors. But there seems to be a difference in how datasets are handled on migrated pools (from 2.2.3-rc6 to 2.2.6-rc1).

Freshly created pool mounts fine:

> mountvol

    \\?\Volume{80a3a563-6dcb-11ef-8ddc-ec086b081ef1}\
        D:\

    \\?\Volume{80a3a569-6dcb-11ef-8ddc-ec086b081ef1}\
        D:\dataset\

Migrated pool fails to mount dataset:

> mountvol

    \\?\Volume{9b1c0123-30c0-11ef-8d81-ec086b081ef1}\
        Z:\

    \\?\Volume{9b1c056b-30c0-11ef-8d81-ec086b081ef1}\
    *** CANNOT BE PROVISIONED UNTIL A VOLUME MOUNT POINT HAS BEEN CREATED ***

image

image image

lundman commented 1 week ago

Oh, that is interesting. The main difference there is probably the driveletter property: a migrated pool wouldn't have the property at all. The code is supposed to treat a missing driveletter the same as the default, but maybe that goes wrong.

lundman commented 1 week ago

OK that seems to confirm a problem:

NAS$ mkfile -n 1g /var/tmp/pool
NAS$ zpool create DOOM /var/tmp/pool
NAS$ zfs create DOOM/dataset
NAS$ zpool export DOOM
NAS$ tar -cSf pool.tar /var/tmp/pool
NAS$ scp pool.tar vm:

vm$ tar -xSf pool.tar
vm$ zpool import -d `pwd` DOOM
vm$ zfs mount DOOM
cannot mount 'DOOM': Unknown error

OK, I'll need some time to debug this

lundman commented 3 days ago

OK, so I straight up ignored the SecurityDescriptor passed along with mkdir, and didn't merge with the parent's when creating files. Please try rc4 and see if it has improved.

thesn10 commented 2 days ago

@lundman

Installing rc4:

While installing rc4, I got a BSOD saying SYSTEM_THREAD_EXCEPTION_NOT_HANDLED in OpenZFS.sys. The exception code is STATUS_ACCESS_VIOLATION (0xc0000005).

I retried the installation and it was successful.

Testing migrated pool on rc4:

It has improved a little bit on rc4:

But that's about it. The other errors have not improved:

> mountvol

    \\?\Volume{9b1c0123-30c0-11ef-8d81-ec086b081ef1}\
        Z:\

    \\?\Volume{9b1c056b-30c0-11ef-8d81-ec086b081ef1}\
    *** CANNOT BE PROVISIONED UNTIL A VOLUME MOUNT POINT HAS BEEN CREATED ***

Setting driveletter again

I had set the driveletter to Z: on 2.2.3-rc6, but it seems to have gotten lost in the migration:

> zfs get driveletter tank
bad property list: invalid property 'driveletter'

I tried setting the driveletter again and got this weird behavior:

> zfs set driveletter=Z tank
> zpool export tank
> zpool import tank
> zfs mount
tank                            Z:/
tank/dataset                    //./Volume{0d99a5ba-7291-11ef-8de6-ec086b081ef1}//tank/dataset
tank/userfiles                  //./Volume{0d99a5ba-7291-11ef-8de6-ec086b081ef1}//tank/userfiles

Now my datasets are completely missing in File Explorer.

lundman commented 1 day ago

OK interesting.

bad property list: invalid property 'driveletter'

Surprising, as if the property isn't defined in the code. I will check that.

Could you reboot and run "zpool import -N tank" to stop it from mounting, then mount one dataset at a time, starting with zfs mount tank? If it goes wrong, save cbuf for me. If it works, keep going...

//./Volume{0d99a5b ... just means mount failed.
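
Going by that rule of thumb, the failed mounts can be picked out of the zfs mount listing automatically. This is a minimal Python sketch with a hypothetical function name. It relies only on the two-column output shown earlier in this thread and on the assumption that a mountpoint starting with `//./Volume{` marks a failed mount.

```python
FAILED_PREFIX = "//./Volume{"


def failed_mounts(zfs_mount_output):
    """Return dataset names whose mountpoint column looks like a failed mount."""
    failed = []
    for line in zfs_mount_output.splitlines():
        parts = line.split(None, 1)  # dataset name, then the mountpoint column
        if len(parts) != 2:
            continue
        dataset, mountpoint = parts[0], parts[1].strip()
        if mountpoint.startswith(FAILED_PREFIX):
            failed.append(dataset)
    return failed
```

On the listing posted above this flags tank/dataset and tank/userfiles, while tank (mounted at Z:/) is left out.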

thesn10 commented 1 day ago

@lundman I don't know how to save cbuf for a running ZFS driver, as I am new to WinDbg. How do I attach to a running ZFS driver? I am able to save cbuf from a BSOD dump, but that's about it.

So, I rebooted and executed:

zpool import -N tank
zfs mount tank
zfs mount tank/dataset
zfs mount tank/userfiles
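
The same sequence could be driven from a small script that stops at the first command that fails, so the failing step is obvious. This is a Python sketch with hypothetical function names. It assumes the zpool and zfs binaries are on PATH, and the `run` parameter exists only so the logic can be exercised without them.

```python
import subprocess


def mount_order(datasets):
    # Parents before children: fewer '/' separators first, then alphabetical.
    return sorted(datasets, key=lambda name: (name.count("/"), name))


def mount_one_by_one(pool, datasets, run=subprocess.run):
    """Import without mounting, then mount each dataset in turn.

    Returns the first command whose exit code is non-zero, or None if
    every command succeeded.
    """
    commands = [["zpool", "import", "-N", pool]]
    commands += [["zfs", "mount", name] for name in mount_order(datasets)]
    for command in commands:
        result = run(command)
        if result.returncode != 0:
            return command
    return None
```

Calling `mount_one_by_one("tank", ["tank", "tank/dataset", "tank/userfiles"])` reproduces the four commands above in the same order.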

No error, the commands were successful. mountvol still shows invalid mounts.

I then executed zpool export tank and I got a BSOD.

Here is the BSOD info: info.txt cbuf.txt

This is the cbuf for the BSOD after the zpool export, but how do I save the cbuf for a running ZFS driver without a BSOD?