Open andrewc12 opened 10 months ago
Why is it trying to write to a zvol? In both cases zvol_os_attach was called just before the end-of-buffer marker, then it goes through a write path that should only happen on the first write to the zvol (comment: "Open a ZIL if this is the first time we have written to this zvol.").
Unfortunately I'm not sure if this pool had zvols on it.
@andrewc12 could you try importing that pool again with zil_replay_disable set?
I suspect that in zvol_os_create_minor, zil_replay failed, which caused zil_close to not be called, but zv->zv_zilog was still set to NULL. Now on the first write, it sees that zv_zilog is NULL and wants to zil_open, but the ZIL is already open.
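To make that suspicion concrete, here is a minimal toy model of the sequence. It is only a sketch of the hypothesis, not OpenZFS code: every `toy_*` name and both structs are made up for illustration, and `toy_zil_open` returning false stands in for the place where the real `zil_open` would trip its assertion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real OpenZFS types. */
typedef struct toy_zilog {
    bool is_open;              /* models "this ZIL has already been opened" */
} toy_zilog_t;

typedef struct toy_zvol {
    toy_zilog_t log;           /* the ZIL that belongs to the dataset */
    toy_zilog_t *zv_zilog;     /* the zvol's cached pointer, NULL means "not opened yet" */
} toy_zvol_t;

/* Returns false where the real code would hit its "not already open" assertion. */
static bool toy_zil_open(toy_zilog_t *zl)
{
    if (zl->is_open)
        return false;
    zl->is_open = true;
    return true;
}

static void toy_zil_close(toy_zilog_t *zl)
{
    zl->is_open = false;
}

/* Suspected create-minor flow: open, replay, close only if the replay succeeded. */
static void toy_create_minor(toy_zvol_t *zv, bool replay_ok)
{
    zv->log.is_open = false;
    (void) toy_zil_open(&zv->log);
    zv->zv_zilog = &zv->log;
    if (replay_ok)
        toy_zil_close(zv->zv_zilog);
    zv->zv_zilog = NULL;       /* cleared in both cases: this is the suspected mismatch */
}

/* First-write path: a NULL zv_zilog is taken to mean the ZIL still has to be opened. */
static bool toy_first_write(toy_zvol_t *zv)
{
    if (zv->zv_zilog == NULL) {
        if (!toy_zil_open(&zv->log))
            return false;      /* double open */
        zv->zv_zilog = &zv->log;
    }
    return true;
}
```

In this model, `toy_create_minor(&zv, true)` followed by `toy_first_write(&zv)` succeeds, while `toy_create_minor(&zv, false)` leaves the log open with a NULL pointer, so the first write attempts a second open and fails.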
@EchterAgo I was being very silly, this pool pretty much only has a zvol on it: a sparse 2 TB NTFS zvol.
Set Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\OpenZFS\zfs_zil zil_replay_disable to 1
waited a minute
zpool import -N -d C:\zfs tank_ntfs

At next boot, checked that Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\OpenZFS\zfs_zil zil_replay_disable is set to 1
zpool import -N -d C:\zfs tank_ntfs
Let's see, in the very first crash we die in
zilog_t *zilog = dmu_objset_zil(os);
"most likely" because os is NULL, which is the same as the zv->zv_objset passed to zil_open being NULL. As you are exporting it, we are probably freeing zv->zv_objset and zv, and another write came in at the perfect time.
Because the zil appeared not open (already closed), it attempts to open it (again) with a NULL.
So similar to unmount, we need to have a think about the teardown steps here.
I am unsure why the import "info.txt" indicates the exact same location, that is weird.
just noticed that was not related to this
OpenZFS!zil_open(struct objset * os = 0xffffde86`217af680, <function> * get_data = 0xfffff801`7e9acab0, struct zil_sums * zil_sums = 0x00000000`00000000)+0x8d [D:\a\openzfs\openzfs\module\zfs\zil.c @ 3800]
ok so not that
Why does it say "breakpoint" tho, it isn't the ASSERTs.
zil.c:3800 is ASSERT3P(zilog->zl_get_data, ==, NULL); isn't it?
I still think zil_open was called on an already open zil.
I classified it in my head as a similar problem to the unmount problem: it needs to be a bit better protected. We go and close the zil and dmu, which are set to NULL, then get another read/write, so it attempts to open the zil again, and dmu is now NULL. So the fix would be in a similar style, using something like zfs_enter() in the call-in to read/write. It got pushed down the priorities while we are dealing with the other thing.
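As a rough illustration of that idea, the sketch below models an enter/exit check on the read/write call-in. It is a single-threaded toy, not the actual fix: all `toy_*` names, the struct fields, and the error codes chosen are assumptions, and real code would need a lock (or the existing zfs_enter()-style machinery) around the same checks to be safe against concurrent teardown.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical zvol state; field names only loosely follow the thread. */
typedef struct toy_zvol {
    void *zv_objset;      /* set to NULL once teardown has released it */
    bool  zil_open;       /* whether the toy ZIL is currently open */
    bool  tearing_down;   /* set as soon as export/teardown starts */
    int   active_io;      /* number of I/O calls currently inside the guard */
} toy_zvol_t;

/* Entry guard: refuse new I/O once teardown has started or the objset is gone. */
static int toy_enter(toy_zvol_t *zv)
{
    if (zv->tearing_down || zv->zv_objset == NULL)
        return EIO;
    zv->active_io++;
    return 0;
}

static void toy_exit(toy_zvol_t *zv)
{
    zv->active_io--;
}

/* Write call-in: the ZIL is only opened while the guard is held. */
static int toy_write(toy_zvol_t *zv)
{
    int err = toy_enter(zv);
    if (err != 0)
        return err;           /* never reaches the open path with a NULL objset */
    if (!zv->zil_open)
        zv->zil_open = true;  /* first write opens the ZIL */
    toy_exit(zv);
    return 0;
}

/* Teardown: block new I/O first, then release state once nothing is in flight. */
static int toy_teardown(toy_zvol_t *zv)
{
    zv->tearing_down = true;
    if (zv->active_io > 0)
        return EBUSY;         /* a real implementation would wait here instead */
    zv->zil_open = false;
    zv->zv_objset = NULL;
    return 0;
}
```

With this ordering, a write that arrives after `toy_teardown` gets `EIO` back instead of trying to reopen the ZIL against a NULL objset, which is the failure described above.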
bc14612 only addresses the immediate crash.
Describe the problem you're observing
SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (7e) when trying to import pool
Describe how to reproduce the problem
previously
... ... zpool add tank_ntfs //./C:/zfs/ntfs_03.zfs
install zfs again
zpool import -N -d C:\zfs tank_ntfs
zpool remove tank_ntfs //./C:/zfs/ntfs_03.zfs
zpool remove -w tank_ntfs //./C:/zfs/ntfs_03.zfs
zpool export tank_ntfs
crash 1
info.txt cbuf.txt
zpool import -N -d C:\zfs tank_ntfs
crash 2
info.txt cbuf.txt