openzfsonwindows / openzfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

driveletter assignment doesn't work #135

Open inoperable opened 2 years ago

inoperable commented 2 years ago

Microsoft changed something (as usual) and I no longer see my previously perfectly working datasets... The previous build worked (I'm not 100% sure, but the one before it certainly did). Any ideas? I had zero issues in the last 12 months - and now this s**t :-(

Using the latest Windows Insider build I got this:

Microsoft Windows [Version 10.0.25193.1000]
(c) Microsoft Corporation. All rights reserved.

C:\Users\zero>zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
stuff  1.36T   963G   429G        -         -      -    69%  1.00x    ONLINE  -

C:\Users\zero>zfs list
NAME    USED  AVAIL     REFER  MOUNTPOINT
stuff   963G   386G      963G  /vol/stuff

C:\Users\zero>zfs mount -a
cannot mount 'stuff': Unknown error

C:\Users\zero>
inoperable commented 2 years ago
C:\Users\zero>zfs get driveletter stuff
NAME   PROPERTY     VALUE        SOURCE
stuff  driveletter  z            local

C:\Users\zero>z:
The system cannot find the drive specified.

C:\Users\zero>
inoperable commented 2 years ago

Thought I might be sneaky and tried this out:

C:\>zfs set driveletter= stuff

C:\>zfs set driveletter=x stuff

C:\>zfs mount

C:\>zfs mount stuff
cannot mount 'stuff': Unknown error

No-go either :-(

andrewc12 commented 2 years ago

Sorry to hear about this. To try and help out the dev, would I be able to collect some info about your system?

Do you know which version of openzfsonwindows you installed, and what is the build number of your Windows build?

inoperable commented 2 years ago

Sure - I'm always running the latest possible build, meaning the one linked on https://openzfsonosx.org/wiki/Windows_builds. @lundman is doing a pretty great job making Windows suck less - considering M$ does everything to make it suck more. The rolling releases solve my issues 9 times out of 10 and usually work fine with the preview releases I (have to) use. Guess M$ decided to go tits up as usual...

inoperable commented 2 years ago

C:\>zfs --version
zfs-2.0.0-2
zfs-kmod-(registry lookup failed)

C:\>zpool --version
zfs-2.0.0-2
zfs-kmod-(registry lookup failed)

C:\>ver

Microsoft Windows [Version 10.0.25193.1000]

C:\>
inoperable commented 2 years ago

🤔 Hmm... something odd is going on: after a few reboots, all of a sudden one of the imported datasets shows up with its drive letter properly assigned - as if nothing had ever been wrong.

O:\>zfs get driveletter
NAME            PROPERTY     VALUE        SOURCE
data            driveletter               local
data/backup     driveletter  r            local
data/home       driveletter  h            local
data/stash      driveletter  -            default
data/stash/git  driveletter  -            default
data/stash/src  driveletter  -            default
data/vault      driveletter  -            default
drop            driveletter  o            local
music           driveletter  m            local
stuff           driveletter  Z            local

Previously, all of the datasets with a letter assigned worked without any issues - now only the music dataset actually has a drive letter assigned, one I can change to and use like any other Windows drive.

lundman commented 2 years ago

There were some fixes here. Also, it's not instant - "zfs umount $dataset" followed by "zfs mount $dataset" is needed after the change.

Keep an eye on it, make sure it doesn't come back
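
For example, using the stuff dataset from above (just a sketch of the sequence, no new syntax):

C:\Users\zero>zfs set driveletter=x stuff
C:\Users\zero>zfs umount stuff
C:\Users\zero>zfs mount stuff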

inoperable commented 2 years ago

Hi @lundman, thanks for taking a look here. Could you please elaborate a bit on your comment? I'm not sure I understand it (it's been a while since I dug into zfs/win, and it worked most of the time).

Thanks in advance. Best

inoperable commented 2 years ago

@lundman: Installed the latest build, 2.1.99-1022-ga8162d01f-dirty. After I set the mountpoint property to a path I normally use on non-MS systems, as in zfs set mountpoint=/vol/music music, and then ran zfs mount -a - which would normally give me an error on Windows - it worked.

Assumption: IF a dataset's driveletter property is configured but its mountpoint property is set to legacy (or none - I still need to verify this), and mounting is handled by the OS instead of by zfs directly, then the driveletter won't be properly assigned (uhmm, mounted?) - OR this has simply been fixed in the latest build.

I'm 95% sure, though, that my attempts failed because the datasets' mountpoint property was set to legacy. Will reinstate it and verify.
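
For reference, the exact sequence that worked here (paths and dataset names as in my setup):

C:\Users\zero>zfs set mountpoint=/vol/music music
C:\Users\zero>zfs mount -a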

inoperable commented 2 years ago
C:\Users\zero>ver

Microsoft Windows [Version 10.0.25197.1000]

C:\Users\zero>zfs list
NAME    USED  AVAIL     REFER  MOUNTPOINT
drop    331G   118G      331G  /vol/drop
music  77.5G   147G     77.5G  /vol/music
stuff   966G   383G      966G  /vol/stuff

C:\Users\zero>zfs --version
zfs-2.0.0-2
zfs-kmod-(registry lookup failed)

C:\Users\zero>zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
drop    464G   331G   133G        -         -     0%    71%  1.00x    ONLINE  -
music   232G  77.5G   155G        -         -      -    33%  1.00x    ONLINE  -
stuff  1.36T   966G   426G        -         -      -    69%  1.00x    ONLINE  -

C:\Users\zero>zpool --version
zfs-2.0.0-2
zfs-kmod-(registry lookup failed)

C:\Users\zero>zfs get driveletter music
NAME   PROPERTY     VALUE        SOURCE
music  driveletter  m            local

C:\Users\zero>m:

M:\>ls -alh
total 192K
drwxr-xr-x 18 SYSTEM SYSTEM 0 Sep  9 21:32  .
drwxr-xr-x  1 SYSTEM SYSTEM 0 Sep  9 07:15  ..
drwxr-xr-x  2 SYSTEM SYSTEM 0 Sep  9 21:32 'System Volume Information'
drwxr-xr-x 22 SYSTEM SYSTEM 0 Oct 14  2020  [redacted]
drwxr-xr-x  4 SYSTEM SYSTEM 0 Oct 14  2020  [redacted]
drwxr-xr-x 17 SYSTEM SYSTEM 0 Oct 14  2020  [redacted]
drwxr-xr-x 32 SYSTEM SYSTEM 0 Apr 30 11:37  [redacted]
drwxr-xr-x 13 SYSTEM SYSTEM 0 Oct 14  2020  [redacted]
drwxr-xr-x 15 SYSTEM SYSTEM 0 Jul 20  2020  [redacted]
drwxr-xr-x 29 SYSTEM SYSTEM 0 Oct 14  2020  [redacted]
drwxr-xr-x 13 SYSTEM SYSTEM 0 Oct 14  2020  [redacted]
drwxr-xr-x 21 SYSTEM SYSTEM 0 Mar  6  2021  [redacted]
drwxr-xr-x  9 SYSTEM SYSTEM 0 Oct 14  2020  [redacted]
drwxr-xr-x 39 SYSTEM SYSTEM 0 Oct 14  2020  [redacted]
drwxr-xr-x 12 SYSTEM SYSTEM 0 Mar  6  2021  [redacted]
drwxr-xr-x  8 SYSTEM SYSTEM 0 Oct 14  2020  [redacted]
drwxr-xr-x  8 SYSTEM SYSTEM 0 May  1 05:03  webrips
drwxr-xr-x  5 SYSTEM SYSTEM 0 Oct 14  2020  [redacted]

M:\>:-) PROFIT!
inoperable commented 2 years ago

UPDATE: booted into the exact same system today and the problem has resurfaced - I can't mount the datasets. I tried setting the mountpoint property back and forth, and the driveletter too - it didn't help much.

C:\Users\zero>zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
drop    464G   331G   133G        -         -     0%    71%  1.00x    ONLINE  -
music   232G  77.5G   155G        -         -    17%    33%  1.00x    ONLINE  -
stuff  1.36T  1000G   392G        -         -      -    71%  1.00x    ONLINE  -

C:\Users\zero>zfs list
NAME    USED  AVAIL     REFER  MOUNTPOINT
drop    331G   118G      331G  /vol/drop
music  77.5G   147G     77.5G  /vol/music
stuff  1000G   349G     1000G  /vol/stuff

C:\Users\zero>zfs mount music
cannot mount 'music': Unknown error

C:\Users\zero>zfs set mountpoint=legacy music

C:\Users\zero>zfs set mountpoint=/vol/music music
cannot mount 'music': Unknown error
property may be set but unable to remount filesystem

C:\Users\zero>zfs set mountpoint=/vol/music music

C:\Users\zero>m:
The system cannot find the drive specified.

C:\Users\zero>zfs set mountpoint=legacy music

C:\Users\zero>zpool export music
cannot export 'music': pool is busy

C:\Users\zero>zpool export -f music
cannot export 'music': pool is busy

C:\Users\zero>

In short, the assumption that zfs set mountpoint=... just has to be changed from legacy to a proper path in order to get the dataset mounted is simply wrong. Any ideas?

lundman commented 2 years ago

Could be worth checking whether the mount point (directory) already exists in the target dir
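
E.g. from cmd, assuming the /vol/... mountpoint would map under the system drive (an assumption - adjust to wherever the driver actually creates it):

C:\Users\zero>dir C:\vol\stuff
C:\Users\zero>rmdir C:\vol\stuff

If an empty leftover directory from a previous mount is sitting there, removing it before mounting again is worth a try.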

inoperable commented 2 years ago

Something went terribly wrong - I was fiddling with the Windows driver and ran zpool upgrade on a pool I don't really need, for experimentation. The upgrade went OK, and I still couldn't "mount" the dataset in Windows, but now, after rebooting into Linux, zpool panics on import :-(

I tried to boot into BSD and take a look from there, but the FreeBSD 13.1 live memstick image instantly crashes and reboots on import. I think I might have irreversibly damaged my not-so-important music dataset (Duhh...)

> # zpool import music
cannot import 'music': pool was previously in use from another system.
Last accessed by Windows (hostid=3696af32) at Sat Sep 10 12:01:08 2022
The pool can be imported, use 'zpool import -f' to import the pool.

OK, then let's try this again with the -N and -f flags:

root@rectifier /home/zero  [18:24:37]
> # zpool import -N -f music
internal error: cannot import 'music': Invalid exchange
[1]    17869 IOT instruction (core dumped)  zpool import -N -f music

Duhh...

[  717.333001] VERIFY3(0 == zap_lookup(spa->spa_meta_objset, spa->spa_feat_enabled_txg_obj, feature->fi_guid, sizeof (uint64_t), 1, res)) failed (0 == 52)
[  717.333005] PANIC at zfeature.c:293:feature_get_enabled_txg()
[  717.333007] Showing stack for process 22926
[  717.333008] CPU: 7 PID: 22926 Comm: zpool Tainted: P           OE     5.19.8-1-MANJARO #1 982dc6401bc44ace2821efa3ed570326bc5c6ce2
[  717.333011] Call Trace:
[  717.333013]  <TASK>
[  717.333014]  dump_stack_lvl+0x48/0x60
[  717.333018]  spl_panic+0xf4/0x10c [spl 33bdcd88547a4a910b30f8ab73d9b9d9213e153e]
[  717.333026]  ? zap_lookup+0x56/0x100 [zfs 9915128fb089bf68267582b39a5f9a82546bc96a]
[  717.333097]  spa_feature_enabled_txg+0x11f/0x130 [zfs 9915128fb089bf68267582b39a5f9a82546bc96a]
[  717.333161]  traverse_impl+0x39c/0x4a0 [zfs 9915128fb089bf68267582b39a5f9a82546bc96a]
[  717.333218]  ? spa_vdev_err+0x40/0x40 [zfs 9915128fb089bf68267582b39a5f9a82546bc96a]
[  717.333281]  ? tsd_hash_dtor+0x73/0x90 [spl 33bdcd88547a4a910b30f8ab73d9b9d9213e153e]
[  717.333287]  traverse_pool+0x68/0x1e0 [zfs 9915128fb089bf68267582b39a5f9a82546bc96a]
[  717.333343]  ? spa_vdev_err+0x40/0x40 [zfs 9915128fb089bf68267582b39a5f9a82546bc96a]
[  717.333402]  ? spa_vdev_err+0x40/0x40 [zfs 9915128fb089bf68267582b39a5f9a82546bc96a]
[  717.333460]  ? zio_root+0x33/0x40 [zfs 9915128fb089bf68267582b39a5f9a82546bc96a]
[  717.333525]  spa_load+0x15a3/0x1820 [zfs 9915128fb089bf68267582b39a5f9a82546bc96a]
[  717.333587]  spa_load_best+0x54/0x2c0 [zfs 9915128fb089bf68267582b39a5f9a82546bc96a]
[  717.333646]  spa_import+0x23d/0x870 [zfs 9915128fb089bf68267582b39a5f9a82546bc96a]
[  717.333711]  zfs_ioc_pool_import+0x15b/0x180 [zfs 9915128fb089bf68267582b39a5f9a82546bc96a]
[  717.333777]  zfsdev_ioctl_common+0x8d6/0xa00 [zfs 9915128fb089bf68267582b39a5f9a82546bc96a]
[  717.333840]  zfsdev_ioctl+0x53/0xe0 [zfs 9915128fb089bf68267582b39a5f9a82546bc96a]
[  717.333901]  __x64_sys_ioctl+0x91/0xd0
[  717.333904]  do_syscall_64+0x5c/0x90
[  717.333906]  ? exit_to_user_mode_prepare+0x145/0x1d0
[  717.333910]  ? syscall_exit_to_user_mode+0x1b/0x40
[  717.333911]  ? do_syscall_64+0x6b/0x90
[  717.333913]  ? syscall_exit_to_user_mode+0x1b/0x40
[  717.333914]  ? do_syscall_64+0x6b/0x90
[  717.333916]  ? ksys_read+0xd8/0xf0
[  717.333917]  ? syscall_exit_to_user_mode+0x1b/0x40
[  717.333919]  ? do_syscall_64+0x6b/0x90
[  717.333920]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  717.333923] RIP: 0033:0x7fe4058fb9ef
[  717.333924] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[  717.333925] RSP: 002b:00007ffeeb5ac930 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  717.333927] RAX: ffffffffffffffda RBX: 0000556c88d8d4e0 RCX: 00007fe4058fb9ef
[  717.333928] RDX: 00007ffeeb5ac9f0 RSI: 0000000000005a02 RDI: 0000000000000003
[  717.333929] RBP: 00007ffeeb5b08e0 R08: 000000000000b490 R09: 0000556c88d93fd0
[  717.333930] R10: 00007fe4059d7c10 R11: 0000000000000246 R12: 00007ffeeb5ac9f0
[  717.333931] R13: 0000556c88dc35f0 R14: 0000556c88dc4e60 R15: 0000000000000000
[  717.333932]  </TASK>

Could it be that spa->spa_feat_enabled_txg_obj is making ZFS crash? I knew enabling feature flags is generally a bad idea, but you know how it is. If there is a red button...

NOTE: the traceback above is not from zpool import but from zdb

Running zpool import -o readonly=on -F -m -N -s -f music makes the system freeze in place, and any further zfs/zpool calls hang as well.

inoperable commented 2 years ago

Could be worth checking whether the mount point (directory) already exists in the target dir

There is no mountpoint directory created on %SYSTEMDRIVE% - i.e. C: in my case. The driver used to create the mountpoint paths on the C:\ drive at some point in a previous version, but that was a long time ago (12+ months), and I haven't seen any of those directories created on import since.

inoperable commented 2 years ago

Below is the coredump debug output from Linux while trying to import the pool with the -f flag:

           PID: 3834 (zpool)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 6 (ABRT)
     Timestamp: Sat 2022-09-10 21:22:17 CEST (14s ago)
  Command Line: zpool import -f music
    Executable: /usr/local/sbin/zpool
 Control Group: /user.slice/user-0.slice/session-5.scope
          Unit: session-5.scope
         Slice: user-0.slice
       Session: 5
     Owner UID: 0 (root)
       Boot ID: 0aff0b77d8ca4a05b3e8140da7297fe8
    Machine ID: f51146b12eac425b9583b5601a060e27
      Hostname: rectifier
       Storage: /var/lib/systemd/coredump/core.zpool.0.0aff0b77d8ca4a05b3e8140da7297fe8.3834.1662837737000000.zst (present)
     Disk Size: 242.6K
       Message: Process 3834 (zpool) of user 0 dumped core.

                Module linux-vdso.so.1 with build-id 430d2b2a716c9d0c0f48b05d81933bf2a2372edd
                Module libresolv.so.2 with build-id 6e92fdf0a76152939e57a562fd539c86076b5728
                Module libkeyutils.so.1 with build-id ac405ddd17be10ce538da3211415ee50c8f8df79
                Module libkrb5support.so.0 with build-id 15f223925ef59dee4379ebbc0fcd14eda9ba81a2
                Module libcom_err.so.2 with build-id 3360a28740ffbbd5a5c0c21d09072445908707e5
                Module libk5crypto.so.3 with build-id cc77a742cb62447a53d98285b41558b8acd92866
                Module libkrb5.so.3 with build-id 371cc767dacb17cb42c9c44b88eebbed5ee9a756
                Module libgcc_s.so.1 with build-id 85db482c4585a328d95ec41124337a967bb24d8f
                Module libgssapi_krb5.so.2 with build-id 292f1ce32161c0ecc4a287bc8494d5d7c420a03f
                Module ld-linux-x86-64.so.2 with build-id da64753d57bf3801827448f53d911b041568e727
                Module libc.so.6 with build-id 9c28cfc869012ebbd43cdb0f1eebcd14e1b8bdd8
                Module libuuid.so.1 with build-id 9057a530e6b3b0e71f24602a0039c490c9a0b5a1
                Module libblkid.so.1 with build-id fb2c5d3c17aac74758a3eb80a2bc1c16bcf183b1
                Module libm.so.6 with build-id 0b8d43ea2dae21a1c5e44c3f0a9dc2fb292d27c0
                Module libudev.so.1 with build-id 1c5c93f27f6b1fcbf2af1d56079e945779b8c097
                Module libz.so.1 with build-id fefe3219a96d682ec98fcfb78866b8594298b5a2
                Module libtirpc.so.3 with build-id 8f5e329f75d897df033d761ca1f742f90619c1b6
                Module libnvpair.so.3 with build-id f5647212bfd3d180c0aab0d406a5f233207ba974
                Module libcrypto.so.1.1 with build-id 7981ea3d69f3c28e46ee312a815af96eab93775c
                Module libuutil.so.3 with build-id cc4029b37010c2076ec9f29a98f7ae00566a261d
                Module libzfs_core.so.3 with build-id 34408921c2ec9f56ffed14498b5c95095ce5339c
                Module libzfs.so.4 with build-id b4f436d72df350ec120e8a40f1403b84968891df
                Module zpool with build-id 12065bb3916f9412f628bd4cc3133eb2931ca803
                Stack trace of thread 3834:
                #0  0x00007ffb649c04dc n/a (libc.so.6 + 0x884dc)
                #1  0x00007ffb64970998 raise (libc.so.6 + 0x38998)
                #2  0x00007ffb6495a53d abort (libc.so.6 + 0x2253d)
                #3  0x00007ffb6506a25a zfs_verror (libzfs.so.4 + 0x4025a)
                #4  0x00007ffb6506adfe zpool_standard_error_fmt (libzfs.so.4 + 0x40dfe)
                #5  0x00007ffb6505b902 zpool_import_props (libzfs.so.4 + 0x31902)
                #6  0x000055e7542c8bf9 do_import (zpool + 0xebf9)
                #7  0x000055e7542d26c9 import_pools (zpool + 0x186c9)
                #8  0x000055e7542d7589 zpool_do_import (zpool + 0x1d589)
                #9  0x000055e7542c387d main (zpool + 0x987d)
                #10 0x00007ffb6495b2d0 n/a (libc.so.6 + 0x232d0)
                #11 0x00007ffb6495b38a __libc_start_main (libc.so.6 + 0x2338a)
                #12 0x000055e7542c3a35 _start (zpool + 0x9a35)

                Stack trace of thread 3842:
                #0  0x00007ffb649bb346 n/a (libc.so.6 + 0x83346)
                #1  0x00007ffb649bde64 pthread_cond_timedwait (libc.so.6 + 0x85e64)
                #2  0x00007ffb649c7c37 n/a (libc.so.6 + 0x8fc37)
                #3  0x00007ffb649be78d n/a (libc.so.6 + 0x8678d)
                #4  0x00007ffb64a3f8e4 __clone (libc.so.6 + 0x1078e4)

                Stack trace of thread 3843:
                #0  0x00007ffb649bb346 n/a (libc.so.6 + 0x83346)
                #1  0x00007ffb649bde64 pthread_cond_timedwait (libc.so.6 + 0x85e64)
                #2  0x00007ffb649c7c37 n/a (libc.so.6 + 0x8fc37)
                #3  0x00007ffb649be78d n/a (libc.so.6 + 0x8678d)
                #4  0x00007ffb64a3f8e4 __clone (libc.so.6 + 0x1078e4)

                Stack trace of thread 3845:
                #0  0x00007ffb649bb346 n/a (libc.so.6 + 0x83346)
                #1  0x00007ffb649bde64 pthread_cond_timedwait (libc.so.6 + 0x85e64)
                #2  0x00007ffb649c7c37 n/a (libc.so.6 + 0x8fc37)
                #3  0x00007ffb649be78d n/a (libc.so.6 + 0x8678d)
                #4  0x00007ffb64a3f8e4 __clone (libc.so.6 + 0x1078e4)

                Stack trace of thread 3846:
                #0  0x00007ffb649bb346 n/a (libc.so.6 + 0x83346)
                #1  0x00007ffb649bde64 pthread_cond_timedwait (libc.so.6 + 0x85e64)
                #2  0x00007ffb649c7c37 n/a (libc.so.6 + 0x8fc37)
                #3  0x00007ffb649be78d n/a (libc.so.6 + 0x8678d)
                #4  0x00007ffb64a3f8e4 __clone (libc.so.6 + 0x1078e4)

                Stack trace of thread 3841:
                #0  0x00007ffb649bb346 n/a (libc.so.6 + 0x83346)
                #1  0x00007ffb649bde64 pthread_cond_timedwait (libc.so.6 + 0x85e64)
                #2  0x00007ffb649c7c37 n/a (libc.so.6 + 0x8fc37)
                #3  0x00007ffb649be78d n/a (libc.so.6 + 0x8678d)
                #4  0x00007ffb64a3f8e4 __clone (libc.so.6 + 0x1078e4)

                Stack trace of thread 3844:
                #0  0x00007ffb649bb346 n/a (libc.so.6 + 0x83346)
                #1  0x00007ffb649bde64 pthread_cond_timedwait (libc.so.6 + 0x85e64)
                #2  0x00007ffb649c7c37 n/a (libc.so.6 + 0x8fc37)
                #3  0x00007ffb649be78d n/a (libc.so.6 + 0x8678d)
                #4  0x00007ffb64a3f8e4 __clone (libc.so.6 + 0x1078e4)
                ELF object binary architecture: AMD x86-64

GNU gdb (GDB) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/local/sbin/zpool...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `zpool import -f music'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007ffb649c04dc in ?? () from /usr/lib/libc.so.6
[Current thread is 1 (Thread 0x7ffb64783800 (LWP 3834))]
(gdb) where
#0  0x00007ffb649c04dc in ?? () from /usr/lib/libc.so.6
#1  0x00007ffb64970998 in raise () from /usr/lib/libc.so.6
#2  0x00007ffb6495a53d in abort () from /usr/lib/libc.so.6
#3  0x00007ffb6506a25a in zfs_verror (hdl=hdl@entry=0x55e755ec84e0, error=error@entry=2092, fmt=fmt@entry=0x7ffb6509255c "%s", ap=ap@entry=0x7ffd5bd21680) at lib/libzfs/libzfs_util.c:344
#4  0x00007ffb6506adfe in zpool_standard_error_fmt (hdl=hdl@entry=0x55e755ec84e0, error=error@entry=52, fmt=fmt@entry=0x7ffb6509255c "%s") at lib/libzfs/libzfs_util.c:749
#5  0x00007ffb6506b161 in zpool_standard_error (hdl=hdl@entry=0x55e755ec84e0, error=error@entry=52, msg=msg@entry=0x7ffd5bd25290 "cannot import 'music'") at lib/libzfs/libzfs_util.c:619
#6  0x00007ffb6505b902 in zpool_import_props (hdl=0x55e755ec84e0, config=config@entry=0x55e755ed38b0, newname=newname@entry=0x0, props=<optimized out>, props@entry=0x0, flags=flags@entry=2)
    at lib/libzfs/libzfs_pool.c:2193
#7  0x000055e7542c8bf9 in do_import (config=config@entry=0x55e755ed38b0, newname=newname@entry=0x0, mntopts=mntopts@entry=0x0, props=props@entry=0x0, flags=flags@entry=2) at cmd/zpool/zpool_main.c:3189
#8  0x000055e7542d26c9 in import_pools (pools=pools@entry=0x55e755ed0640, props=0x0, mntopts=mntopts@entry=0x0, flags=flags@entry=2, orig_name=0x55e755ecf9c0 "music", new_name=new_name@entry=0x0, 
    do_destroyed=<optimized out>, pool_specified=<optimized out>, do_all=<optimized out>, import=<optimized out>) at cmd/zpool/zpool_main.c:3317
#9  0x000055e7542d7589 in zpool_do_import (argc=<optimized out>, argv=<optimized out>) at cmd/zpool/zpool_main.c:3803
#10 0x000055e7542c387d in main (argc=4, argv=0x7ffd5bd290a8) at cmd/zpool/zpool_main.c:10911
(gdb) exit
lundman commented 2 years ago

Could be worth setting zfs_flags=1 to see if it is a soft problem. Does music use crypto? What system did you run zpool upgrade on - Windows?

On the mountpoint thing: we create a reparse-point directory when you mount, and it is supposed to be cleaned up when unmounted - so check whether it is still there the next time.

inoperable commented 2 years ago

@lundman no encryption, pretty much standard pool, one partition, entire drive, linux exclusive feats disabled (project, quota), single device, no cache or anything extra.

I ran the zpool upgrade on Windows with the latest driver installed.

lundman commented 2 years ago

OK, we should probably open a ticket about zpool upgrade if that is the suspect.

inoperable commented 2 years ago

@lundman Yeah, I crippled the dataset with the upgrade; now I can only partially recover data with UFSExplorer. Do you need any additional info from me for the ticket? I won't be able to describe the problem properly - my C fluency is mediocre at best, and my knowledge of ZFS internals is a lot less than that. Sorry about that.

Any tips on how to restore/recover? It would be a good exercise for the future, since I've never had a crippled dataset before :-) UFSExplorer seems to dig out the data properly and could (probably) restore some percentage of the files, but I only have the demo (it's a bit too expensive for this). I'm a bit surprised by the results: it found all the files, but many are missing names/metadata, even though I haven't written any data to the pool since I ran zpool upgrade on it. Mehh...

inoperable commented 2 years ago

Could be worth setting zfs_flags=1 to see if it is a soft problem.

I booted Linux with zfs_flags=1 as a kernel flag, and I also added it to zfs.conf in /etc/modprobe.d, since I think it's actually a module option, isn't it? Anyway - I didn't notice anything different with the flag configured - if that's how it's supposed to be configured, that is.
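
What I put there, for reference (a sketch - assuming zfs_flags is the standard OpenZFS module parameter):

# /etc/modprobe.d/zfs.conf
options zfs zfs_flags=1

The same knob can also be flipped at runtime through /sys/module/zfs/parameters/zfs_flags.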

inoperable commented 2 years ago

[UPDATE]

Z:\>zfs list
NAME    USED  AVAIL     REFER  MOUNTPOINT
drop    331G   118G      331G  legacy
stuff   998G   351G      998G  /vol/stuff

Z:\>o:
The system cannot find the drive specified.

Z:\>zfs set mountpoint=/vol/drop drop

Z:\>zfs mount
stuff                           Z:\
drop                            O:\

Z:\>o:

setting the mountpoint definitely influences the drive letter assignment: with mountpoint set to legacy, the driver won't assign the dataset a drive letter. To make things less confusing:

Can Windows actually mount datasets under the path read from the dataset's mountpoint property? I mean, does the driver automatically translate normal paths into those dumb, confusing backslash Windows paths? If so, the root mountpoint would then be the SystemDrive - in most cases C:/, uhmm C:\, right?

[UPDATE II] Still no luck on restoring data from my music dataset. Need to RTFM :-)

[UPDATE III] The dataset has been restored; sadly I lost about 30% of the contents. Lesson learned.

lundman commented 2 years ago

I suspect you didn't really lose any data on the upgraded pool, and you can fish things out of it with zdb. The panic when trying to read it afterwards is something we should also fix. Usually "import -N -o readonly" tends to work well enough to copy data off it.
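
Something along these lines (a sketch; music is the pool in question, and -e/-d are standard zdb options for looking at an exported pool):

# import without mounting, read-only
zpool import -N -f -o readonly=on music

# or inspect the exported pool's datasets without importing it at all
zdb -e -d music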

As for the drive letter: by default, driveletter is set to "poolonly", so that the pool dataset (sometimes mounted at "/") behaves as driveletter=on, i.e. it doesn't try to live in C:. Give it its own drive letter, let's say E:. Then any lower dataset (say root/lower, or /lower) ends up at E:\lower. That's the default anyway. You've been playing with driveletter, so it sounds like it has some issues, and since it's a new feature that isn't too surprising. The pool driveletter should also be settable to "off" and live in C: - but that does come with issues. With driveletter not set, a dataset is supposed to inherit the mountpoint as usual and end up under its parent. And by default, driveletter is not set for anything lower than the pool.
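
In other words, something like this (a sketch - using the data pool from your earlier listing as the example):

C:\>zfs set driveletter=e data
C:\>rem after a remount, the pool root shows up as E:\ and a child such as data/stash lands at E:\stash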

inoperable commented 2 years ago

@lundman so, after testing and fiddling for some hours, I can definitely confirm that datasets won't be assigned (mounted from) a drive letter if the dataset's mountpoint is set to none or legacy - running zfs set mountpoint=/some/path dataset ensures the dataset gets a drive letter assigned.

Data restoring: trying zdb and zpool with all kinds of flags - going so far as zpool import -fFX music - didn't yield any result. I have to use the -f flag, otherwise zpool just tells me the pool was imported by some other OS and that I need -f. Sooo...

music is the name of my borked pool, so: zpool import -f -F music simply crashes and I get a new coredump entry from zpool. zpool import -f -F -o readonly=on music freezes - and opening new terminals/ttys and starting a new command freezes as well. Any zfs-related process just stops doing anything, no matter whether I use zpool, zfs, or zdb - once I've tried to import with -o readonly=on, any other zfs-related command becomes impossible to execute (it just freezes/hangs in place).

zdb behaves exactly the same way: forcing the import with -X and using the proper block device path yields the same result - it either crashes or freezes.

At this point I do not know how to use zdb to restore the data - I can't execute any commands before import, if I force a pool import, zpool crashes, and if I force the import with -o readonly=on, it freezes (and I can't do s**t after that).

Suggestions?

inoperable commented 2 years ago

@lundman ping

lundman commented 2 years ago

Apologies, been crazy getting ready for OpenZFS summit.

Usually, the first thing to try is to set zfs_recover=1 then import it using zpool import -N -o readonly=on poolname. The -N stops it from mounting.

If that works, you can zfs send things out. Or you could mount one dataset at a time (but I think you have just the one). If the commands freeze, it's most likely holding a lock while waiting for something. I'd be curious to know what, but it's complicated to get a stack trace of a hung process on Windows (unless Task Manager can?)
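
Roughly, on the Linux side (a sketch - zfs_recover is the standard module parameter; the snapshot name is only an example, and zfs send needs an existing snapshot to send):

# let the import attempt to recover from otherwise fatal errors
echo 1 > /sys/module/zfs/parameters/zfs_recover

# import read-only without mounting anything
zpool import -N -f -o readonly=on music

# then mount it read-only to copy files off, or replicate an existing snapshot
zfs mount music
zfs send music@snap > /backup/music.zfs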

inoperable commented 2 years ago

No apologies necessary! I hope the OpenZFS summit will be great :-)

I'll try to fiddle with the pool on Linux with the latest zfs-dkms module and see what happens. I'll also try to get the stack of a hung process on Windows with the latest Debug build - which freezes when forcing the import, btw. No BSOD, just a freeze.

o.O

Thanks again for your help

inoperable commented 2 years ago

Added zfs_recover=1 and zfs_recovery=1 to the kernel command-line arguments, just to be sure, and tried to import with the same args: > # zpool import -N -o readonly=on music

cannot import 'music': pool was previously in use from another system.
Last accessed by Windows (hostid=3696af32) at Sat Sep 10 12:01:08 2022
The pool can be imported, use zpool import -f to import the pool.

Same command with the -f flag added: > # zpool import -f -N -o readonly=on music

[  779.807298] Call Trace:
[  779.807301]  <TASK>
[  779.807304]  dump_stack_lvl+0x48/0x60
[  779.807309]  spl_panic+0xf4/0x10c [spl 9d02d408ea8c7c2f2e351b26e375f5a6976dae35]
[  779.807319]  ? zap_lookup+0x56/0x100 [zfs 12bc41acd1ed199cf14ce5057af5a3c3328f4437]
[  779.807392]  spa_feature_enabled_txg+0x11f/0x130 [zfs 12bc41acd1ed199cf14ce5057af5a3c3328f4437]
[  779.807456]  traverse_impl+0x3c1/0x4c0 [zfs 12bc41acd1ed199cf14ce5057af5a3c3328f4437]
[  779.807513]  ? spa_vdev_err+0x40/0x40 [zfs 12bc41acd1ed199cf14ce5057af5a3c3328f4437]
[  779.807576]  traverse_pool+0x68/0x1e0 [zfs 12bc41acd1ed199cf14ce5057af5a3c3328f4437]
[  779.807632]  ? spa_vdev_err+0x40/0x40 [zfs 12bc41acd1ed199cf14ce5057af5a3c3328f4437]
[  779.807691]  ? spa_vdev_err+0x40/0x40 [zfs 12bc41acd1ed199cf14ce5057af5a3c3328f4437]
[  779.807749]  ? zio_root+0x33/0x40 [zfs 12bc41acd1ed199cf14ce5057af5a3c3328f4437]
[  779.807814]  spa_load+0xda9/0x17c0 [zfs 12bc41acd1ed199cf14ce5057af5a3c3328f4437]
[  779.807877]  spa_load_best+0x54/0x2c0 [zfs 12bc41acd1ed199cf14ce5057af5a3c3328f4437]
[  779.807936]  spa_import+0x234/0x6b0 [zfs 12bc41acd1ed199cf14ce5057af5a3c3328f4437]
[  779.807994]  zfs_ioc_pool_import+0x15b/0x180 [zfs 12bc41acd1ed199cf14ce5057af5a3c3328f4437]
[  779.808059]  zfsdev_ioctl_common+0x8d9/0xa00 [zfs 12bc41acd1ed199cf14ce5057af5a3c3328f4437]
[  779.808123]  zfsdev_ioctl+0x53/0xe0 [zfs 12bc41acd1ed199cf14ce5057af5a3c3328f4437]
[  779.808185]  __x64_sys_ioctl+0x94/0xd0
[  779.808188]  do_syscall_64+0x5f/0x90
[  779.808190]  ? exit_to_user_mode_prepare+0x145/0x1d0
[  779.808193]  ? syscall_exit_to_user_mode+0x1b/0x40
[  779.808195]  ? do_syscall_64+0x6b/0x90
[  779.808197]  ? ksys_read+0xd8/0xf0
[  779.808198]  ? syscall_exit_to_user_mode+0x1b/0x40
[  779.808200]  ? do_syscall_64+0x6b/0x90
[  779.808202]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  779.808204] RIP: 0033:0x7f74f8045c0f
[  779.808206] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[  779.808208] RSP: 002b:00007ffc9a726eb0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  779.808210] RAX: ffffffffffffffda RBX: 00005652ec699540 RCX: 00007f74f8045c0f
[  779.808211] RDX: 00007ffc9a727880 RSI: 0000000000005a02 RDI: 0000000000000003
[  779.808212] RBP: 00007ffc9a72ae70 R08: 000000000000b4e0 R09: 00007f74f81222c0
[  779.808213] R10: 00007f74f81222c0 R11: 0000000000000246 R12: 00007ffc9a727880
[  779.808213] R13: 00005652ec6a02f0 R14: 0000000000000000 R15: 00005652ec6d9650
[  779.808215]  </TASK>

Adding the -f flag freezes every zfs-related process and nothing else can be executed - I can't zpool export or zfs umount, it just locks up.

If you need more info from me - I am open to suggestions :-)

sskras commented 5 months ago

@inoperable commented on Sep 17, 2022:

@lundman so, after testing and fiddling for some hours, I can definitely confirm that datasets won't be assigned (mounted from) a drive letter if the dataset's mountpoint is set to none or legacy - running zfs set mountpoint=/some/path dataset ensures the dataset gets a drive letter assigned.

Hello @inoperable. Have you checked that on some recent builds? I have a slightly different issue and was searching for matches, and found this one.

sskras commented 5 months ago

Ah, found an update. @inoperable commented on Apr 14, 2023 in #220:

This issue is solved, actually. What you need to do is import -> umount -> mount pools/datasets with timeouts, to make sure Windows is capable of keeping up (...)

Below is the Task Scheduler task I use plus my heavily commented .cmd file you can modify; I added a small and an extensive variant. Windows tends to enable administrative shares on drive letters, so I added a net share delete loop into that batch - just remove the whole thing if you don't want it. Hope you don't get a headache while going through the goto hell. zfs_cmds.zip

I initially started writing that insanity in PowerShell, but I quickly found out that I prefer .bat files to figuring out the insane process handling of PowerShell pipes.

A comment by @colemickens followed on May 28, 2023:

Respectfully, and I do appreciate it, but that doesn't seem "solved" to me?

So the overall verdict is not clear enough :)

inoperable commented 5 months ago

@inoperable commented on Sep 17, 2022:

@lundman so, after testing and fiddling for some hours, I can definitely confirm that datasets won't be assigned (mounted from) a drive letter if the dataset's mountpoint is set to none or legacy - running zfs set mountpoint=/some/path dataset ensures the dataset gets a drive letter assigned.

Hello @inoperable. Have you checked that on some recent builds? I have a slightly different issue and was searching for matches, and found this one.

@sskras I did not, but not because of the driver: Windows can't handle hard/symbolic links in the usable way you'd expect them to work, and that can land you in a scenario where things go bad. TL;DR: keep the non-native (ZFS) parts of your data separate from the NTFS/ReFS structure; interweaving them into an existing NTFS/ReFS tree is bad for you - Windows can't and won't handle it, and a lot of things get confused when lower/upper-case files pop up in the same dir (among other issues), etc.

inoperable commented 5 months ago

@sskras in short: drive letter assignment is the way to get ZFS working. I think sharesmb could also be the way to go - it works pretty well (with the non-Windows drivers) - but I don't know whether it's possible to enable/use it under Windows. @lundman, any thoughts?
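
On Linux/FreeBSD that part is a one-liner (assuming the usual SMB/usershare plumbing is set up - whether the Windows driver honours the property is exactly the open question):

zfs set sharesmb=on music
zfs get sharesmb music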

inoperable commented 5 months ago

@sskras: drive letters plus junction links (not the symbolic links mklink creates by default, but the /J-type junctions) - this works well, and I use it on a daily basis without issues.
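
For example (a sketch - the folder name is just a placeholder, X: being whatever letter the dataset got):

C:\>mklink /J C:\Users\zero\music-link X:\

The /J junction resolves locally, so Explorer and most apps treat it like a normal folder inside the NTFS tree.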

sskras commented 5 months ago

Does this mean the original issue is fixed?

inoperable commented 5 months ago

Does this mean the original issue is fixed?

Sorry for the confusion. If you mean mountpoints - I don't know; I don't mount datasets inside NTFS/ReFS folders, I stick with drive letters and use junctions to link to them from NTFS/ReFS. Using zfs set driveletter=x works without problems. If you have many datasets/pools (5+, I'd guess), things can happen too fast for Windows to keep up, so zpool import -a -f can leave you with some pools mounted and some not (at least that was the case last time I checked). I wrote a batch script that iterates over the pools and imports them one by one (and also removes the administrative shares Windows assigns to them by default, because why not...).

UPDATE: @sskras I read through 380; I think you're facing the same "half-mount" issue. Those happen regardless of memory - I have 128 GB+ and it still happened to me when mounting too fast (@lundman, kudos for the "half-mount" term lol, I was thinking "paramount" instead lol) - so you need to pace the mounting and it should work fine. You can stick with zfs set mountpoint=/your/normal/unix/path yourpool/dataset plus zfs set driveletter=x yourpool/dataset - driveletter takes precedence and ZFS won't mount under the mountpoint=... path (which is a life-saver in my case, since I use Windows and random uni*x-ish OSes with the same pools/datasets) - and to interweave the path into NTFS/ReFS without Windows crapping out, use mklink /j some\windows\dir x:\ where x: is the pool's drive letter. Hope my convolutions are less convoluted.
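
A rough sketch of what the batch does (pool names, drive letters, and delays are only examples; the real zfs_cmds.zip version is longer and full of goto):

@echo off
rem import each pool one at a time, pausing so Windows can keep up with the mounts
for %%P in (stuff drop music) do (
    zpool import -f %%P
    timeout /t 10 /nobreak >nul
)
rem Windows auto-creates administrative shares (Z$, O$, ...) on the new drive letters - drop them
for %%L in (Z O M) do (
    net share %%L$ /delete >nul 2>&1
)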