openzfsonosx / zfs

OpenZFS on OS X
https://openzfsonosx.org/

vdev_iokit branch #178

Closed lundman closed 4 years ago

lundman commented 10 years ago

Thanks for trying out the iokit replacement. It does indeed appear to function well, and receives an almost identical benchmark score to master.

https://github.com/openzfsonosx/zfs/blob/vdev-iokit/module/zfs/vdev_iokit.c#L88

Better watch out with those comments; you just took out the clearing on line 93. But since it is the free function, it doesn't matter ;)

https://github.com/openzfsonosx/zfs/blob/vdev-iokit/module/zfs/vdev_iokit.c#L340
https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/fs/zfs/vdev_disk.c#L416

Is the plan to extend it to attempt to open_by_guid as well? I see you implemented something in vdev_iokit_util.cpp, is it part of your roadmap to add that ability? If it can handle the device /dev/diskX moving, then it would make it quite attractive.

:)

evansus commented 10 years ago

@lundman open_by_guid and find_by_guid are now implemented - called in vdev_iokit_open after attempting the usual vdev_path and vdev_physpath. Also added vdev_iokit_find_pool that can check all disks for a pool with matching name.

In both, if multiple vdevs are found with a matching guid or pool name, the one with the best txg number is used. by_path simply checks for a matching BSD name and fails if it can't be found or opened.
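The best-txg selection rule described above can be sketched as follows (the types and names here are illustrative stand-ins, not the actual vdev_iokit_util.cpp code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical candidate record: one entry per disk whose vdev label
 * matched the requested guid or pool name. */
typedef struct candidate {
    const char *bsd_name;   /* e.g. "disk2s1" */
    uint64_t    txg;        /* txg recorded in the vdev label */
} candidate_t;

/* Pick the candidate whose label carries the highest (most recent) txg. */
static const candidate_t *
best_candidate(const candidate_t *c, size_t n)
{
    const candidate_t *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (best == NULL || c[i].txg > best->txg)
            best = &c[i];
    }
    return (best);
}
```

Preferring the highest txg means that if stale copies of a renamed or re-cloned device are still visible, the most recently written label wins.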

ilovezfs commented 10 years ago

@evansus Are these secondary/tertiary/etc. code paths followed if the primary path for opening does not work? If so, some questions:

1) What is the primary path?
2) Have you "forced" the code to exercise the secondary/tertiary/etc. code paths by artificially causing the primary path to fail? If so, how?
3) If you have not done (2), how have you tested this code?
4) Have you tested any "failure" (e.g., the doggy unplugs a USB cable) scenarios yet?
5) Where does this code originate? Is FreeBSD the ultimate upstream for it, or was their code based on something in Illumos?
6) Does our immediate upstream ZFS on Linux have analogous code, or are they solely relying on udev to provide the /dev/disk/by-* paths?

evansus commented 10 years ago

1) The primary path is as close to Illumos vdev_disk.c as possible: for the whole-disk case, it tries to open by path+s0, then by path. If the vdev is not labeled as a whole disk, it tries to open by path, then physpath, and last by guid. That way the Illumos-compliant code path is followed, and failing that, we try to find the disk by its guid. FreeBSD's implementation is very similar, following vdev_disk.c and only using the guid search after all known routes have been explored.
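The open path above is, in effect, an ordered list of openers tried until one succeeds. A minimal generic sketch (the function-pointer shape and names are illustrative, not the actual vdev_iokit_open code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Each opener returns true on success. */
typedef bool (*opener_fn)(void *ctx);

/* Walk an ordered list of openers, stopping at the first that succeeds.
 * Returns the index of the successful opener, or -1 if all failed. */
static int
try_openers(opener_fn *openers, size_t n, void *ctx)
{
    for (size_t i = 0; i < n; i++) {
        if (openers[i] != NULL && openers[i](ctx))
            return ((int)i);
    }
    return (-1);
}
```

For a whole-disk vdev the list would be { open_by_path_s0, open_by_path, open_by_guid }; otherwise { open_by_path, open_by_physpath, open_by_guid }, so the guid scan runs only after every known route has failed.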

2 and 3) I have tested by commenting out the find by path and by physpath routes, and verified that find by guid succeeds. Also, some of the tests within a VM showed that find by guid was being used at times - for example performing a hard reset of the VMWare fusion VM can sometimes cause disk0 and disk1 to swap places/names. Using IOLog to debug and watch the progress, I could see find by path match on one attempt, and find by guid match on another attempt (e.g. because find by path checked disk0s2, but it was not the same disk as last reboot).

4) Haven't tested failures and unexpected disconnects as yet. I verified that mirrored and raidZ were functional at one point, but haven't retested with some of the more recent changes. Although one thing I have tested, and have not run into issues with, is sleep/wake. Internal and USB disks have resumed operations without a problem.

5) Most of the vdev_iokit.c code is following Illumos vdev_disk.c. I started with a proof-of-concept and then reworked this file to follow upstream as much as possible. vdev_iokit_util.cpp is more loosely based on bits from vdev_geom.c from FreeBSD. Again it follows the standard route, and only attempts to find by guid when necessary. The files vdev_iokit_util.cpp and vdev_iokit_context.cpp provide the backend for functions that do not exist on OS X - replacements for vdev_disk_ldi_physio etc.

6) ZoL is similar to upstream vdev_disk.c, but then diverges in vdev_disk_open: https://github.com/zfsonlinux/zfs/blob/master/module/zfs/vdev_disk.c#L269-295

Instead it uses vdev_bdev_open (actually a #define macro) and vdev_disk_rrpart to detect and open disks. And yes, it relies on udev by-id paths or custom udev rules to locate changed disks:

... you can provide your own udev rule to flexibly map the drives as you see fit.
It is not advised that you use the /dev/[hd]d devices which may be reordered
due to probing order.

I'm running this on my retina MacBook Pro, and haven't had any issues. Running 10.9.3 and using a GPT partition on the internal SSD as a single-disk pool. For testing I use external USB 2 and USB 3 hard drives and flash drives, as well as VMWare fusion VMs.

Please let me know if you have additional questions! Thanks, Evan

lundman commented 10 years ago

Yes, the code similarities between vdev_iokit, geom, and Solaris are comforting, and it is good you have exercised the paths. I have tested this branch as well, and have not experienced any issues. (I don't have the path renumbering issue, though.)

At some point, though, I am hoping you will remove much of the commented-out code and work on indenting so we can give it a proper review :) I am of course guilty of such things too...

evansus commented 10 years ago

True. I recently browsed these on github.com and saw that the whitespace etc. is all over the place; I have been working from Xcode mostly. I'd be happy to do some cleanups.

lundman commented 10 years ago

I used to have emacs set to uboot's coding standard, which is quite unusual (spaces, not tabs). I only recently changed emacs to use tabs again (last week), as ZFS uses tabs.

We could consider following ZoL's practice; I think they have a script to check code style. I don't know if we have to be so strict, but it could be something we should aim toward:

# ./scripts/cstyle.pl module/zfs/*.c | wc -l
    5771

heh all me :)

evansus commented 10 years ago

About testing changed device paths- I may be restating the obvious, but in case it helps:

Simplest test is creating two disk images and opening one after the other, then reverse the order.

You can also simulate device renumbering by connecting/reconnecting USB devices in different orders, or by making any change to the disk list between connects: opening or creating a disk image, ramdisk, zvol, etc.

lundman commented 10 years ago

Ahh so simple, and so beautiful. Yes, that should have been obvious :)

evansus commented 10 years ago

Tested mirrored and raidz with latest vdev-iokit branch. With either:

Haven't tested device failure while in use, or path renumbering.

I'll look into resolving the missing vdev issue. Meanwhile I committed several whitespace cleanups - tabs instead of spaces - and other changes to conform to cstyle.pl, with a few exceptions.

evansus commented 10 years ago

Fixed; it was due to an addition to vdev_get_size.

lundman commented 10 years ago

This is much better, thanks. Also appreciated you brought it up in line with master. This can probably be merged into master soon - do people feel ready for it?

ilovezfs commented 10 years ago

I count seven added #if 0's. Any chance we can clean out the dead code before merging?

rottegift commented 10 years ago

I've been beating vdev-iokit head (+ f0a31c6b7d5f4c4533b37ae1cbb05b95240af206) quite a bit and it seems pretty solid.

evansus commented 10 years ago

@rottegift That sounds like unreleased snapshot holds, most likely from an interrupted zfs send/recv?

I use this one-liner to check all snapshots for holds recursively:

zfs list -Hrt snapshot -o name | while read snap; do zfs holds -H "$snap"; done | more

rottegift commented 10 years ago

Yeah I realised it was unreleased snapshots after I left the comment, so deleted the comment, but probably not fast enough to stop you from seeing an email copy.

(The unreleased snapshots go away across an export/import or reboot, thus the confusion).

c.f. https://github.com/openzfsonosx/zfs/issues/173

P.S.: This will be faster

zfs list -H -r -t snap -o name,userrefs | grep -v '[^0-9]0$'  | awk '{ print $1 }' | xargs -n 10 zfs holds -H

(Didn't really examine closely, but zfs holds is unhappy with more than about ten args).

evansus commented 10 years ago

@rottegift, @lundman, and @ilovezfs (and the whole OpenZFS on OS X community), Thanks for testing the vdev-iokit branch, glad to hear it's working well for others as well!

I just committed some enhancements and cleanups. This addresses a few issues, mostly minor but some fairly important. ace3df2

I believe there are a few areas to review, as well as future enhancement areas.

1) Flush write cache

DKIOCFLUSHWRITECACHE is issued as an asynchronous ioctl on Illumos, ZoL, and FreeBSD. I used IOMedia::synchronizeCache, which is synchronous.

When called async, we can return ZIO_PIPELINE_STOP, then when the op completes, call zio_interrupt. Instead, we wait for the sync to complete, call zio_interrupt, then return pipeline_stop just after.

This is probably OK and should have the same end result, but I'm not sure if there are any negative implications. It may be as simple as adding a 'cacheflush' taskq to perform the sync and callback; I'm open to suggestions.
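The taskq idea can be sketched in userland, with a pthread standing in for the taskq worker and placeholder functions for IOMedia::synchronizeCache and zio_interrupt (all names here are illustrative, not the actual vdev_iokit code):

```c
#include <assert.h>
#include <pthread.h>

/* Placeholder for the synchronous IOMedia::synchronizeCache call:
 * blocks until the device reports the flush complete. */
static void
sync_cache(void)
{
}

/* One queued flush request: the zio to resume and its completion,
 * which would be zio_interrupt() in the real pipeline. */
typedef struct flush_req {
    void *zio;
    void (*done)(void *zio);
} flush_req_t;

/* Worker body (the 'cacheflush' taskq): perform the blocking flush off
 * the issuing thread, then fire the completion, mirroring the async
 * DKIOCFLUSHWRITECACHE model used by Illumos, ZoL, and FreeBSD. */
static void *
flush_worker(void *arg)
{
    flush_req_t *req = arg;

    sync_cache();
    req->done(req->zio);
    return (NULL);
}

/* io_start side: queue the flush and return immediately; the caller
 * would return ZIO_PIPELINE_STOP here instead of waiting. */
static pthread_t
issue_flush_async(flush_req_t *req)
{
    pthread_t t;

    pthread_create(&t, NULL, flush_worker, req);
    return (t);
}
```

The key property is that the issuing thread never blocks on the device; the pipeline resumes from the worker's completion callback.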

2) ashift

ashift is determined the same way as in vdev_disk.c from other distributions, but is not assigned to vdev->vdev_ashift at this time. I left it commented out for now.

I haven't experimented with setting ashift recently, but in the past setting it would cause pool import to fail (it sees 'corrupt' vdevs). I'd like to review this and set it correctly.

vdev.c assigns ashift according to what is in the pool's configuration, and checks if child vdevs have their own ashift property.

This is working fine with every pool I've tried - including from vdev-iokit, master, and other distributions. Also I tested zpool create with no ashift specified (uses 9), -o ashift=9, and -o ashift=12, and verified using zdb that all worked as expected.

I'm not sure what would happen with a more complex pool layout - for example a zpool created with ashift=12 but with one vdev that was inadvertently added with the default ashift of 9, and vice versa.

However for the common use-cases - pools with default ashift, ashift=9, and ashift=12, this has been working fine with every pool I've tried. I've imported and used pools created on FreeBSD, and created on the master branch of OpenZFS on OS X.

3) Simple block devices

The only bug that I can think of - and probably a non-issue anyway - is the case of block devices that are not published in IOKit. I don't know of any software doing this, except possibly some MacPorts/Homebrew ports of Linux nbd and/or iSCSI tools. I doubt this would even come up, but if anyone is using or aware of software that would use this, please let me know.

As a potential workaround, I know that userland uses the vdev_file_ops for both files and block devices, so I'm sure we could find another way to interface with standard block devices if needed.

lundman commented 10 years ago

_2) ashift Also, ashift is determined the same way as vdev_disk.c from other distributions, but is not assigned to vdev->vdev_ashift at this time. I left it commented out for now._

There is indeed a difference here from other distributions. It took us quite a while to work out why: we need to use the vdev_ashift value on the block number when translating offset requests, whereas upstream always uses 512. We eventually drilled down to the fact that Darwin is actually large-block aware in the lowest layer, while upstream sticks with 512 for block-to-offset translation. Some upstream distributions use a byte offset to avoid this; Darwin still uses a block number.

As in, the code:
bp->b_lblkno = lbtodb(offset); // IllumOS

would always go to 512-byte blocks (lbtodb), whereas we use buf_setblkno(bp, zio->io_offset >> dvd->vd_ashift); because the underlying code in Darwin knows the device block size and translates it back up to a byte offset. (sigh)

But I see in the iokit code the equivalent call is:

https://github.com/openzfsonosx/zfs/blob/vdev-iokit/module/zfs/vdev_iokit_util.cpp#L1348

result = iokit_hl->IOMedia::read(zfs_hl, offset, buffer, 0, &actualByteCount);

which takes an offset in bytes and should need no vdev_ashift logic at all. We can avoid that whole thing entirely.
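The two translations can be put side by side in a small sketch (lbtodb here is the 512-byte shift it performs on Illumos, and ashift=12 models a 4K-sector device):

```c
#include <assert.h>
#include <stdint.h>

/* Illumos: lbtodb() always converts a byte offset to 512-byte blocks. */
#define SPA_MINBLOCKSHIFT 9
#define lbtodb(bytes)     ((uint64_t)(bytes) >> SPA_MINBLOCKSHIFT)

/* Darwin buf path: the block number is in device-native blocks, so the
 * offset must be shifted by the vdev's ashift instead. */
static uint64_t
darwin_blkno(uint64_t io_offset, uint64_t ashift)
{
    return (io_offset >> ashift);
}
```

IOMedia::read/write take the byte offset directly, so neither shift is needed on the iokit path.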

I've been out of commission for a few days, but hope to get back into code review and testing this branch.

lundman commented 10 years ago

Each call to vdev_iokit_strategy() will allocate a context and free it when we are done. Since strategy is called frequently, have we explored the idea of embedding the context struct into struct zio? My quick test of putting an "iokit_context" char[32] into zio, to avoid the alloc and free, didn't show any immediate improvement - but the 2-minute benchmarks are probably too small to show that.
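The embedding experiment can be sketched like this (field names and sizes are illustrative; the real zio_t and context live in the ZFS headers):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for vdev_iokit_context_t: roughly a buffer pointer plus an
 * IOStorageCompletion (target, action, parameter). */
typedef struct io_context {
    void *buffer;                    /* IOMemoryDescriptor * */
    void *target;                    /* completion target */
    void (*action)(void *parameter); /* completion callback */
    void *parameter;
} io_context_t;

/* Illustrative zio with an embedded scratch area, so strategy can use
 * the reserved bytes instead of allocating/freeing a context per I/O
 * (and never has to handle allocation failure in the I/O path). */
typedef struct zio {
    uint64_t io_offset;
    char     io_iokit_context[32];   /* opaque to common code */
} zio_t;

/* The context must fit in the reserved area. */
_Static_assert(sizeof (io_context_t) <= 32,
    "iokit context must fit in the zio scratch space");

/* Generic-pointer approach: common headers see only a char[], and the
 * iokit side casts, avoiding an iokit include in the zio header. */
static io_context_t *
zio_iokit_context(zio_t *zio)
{
    return ((io_context_t *)(void *)zio->io_iokit_context);
}
```

This is the "generic ptrs and casting" option: the header dependency problem goes away because zio.h never needs to know the context's layout.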

rottegift commented 10 years ago

It's been good with https://github.com/openzfsonosx/zfs/commit/f0a31c6b7d5f4c4533b37ae1cbb05b95240af206 over the past couple of days and with https://github.com/openzfsonosx/zfs/commit/a97f499443c432a3571cbca7bdde716286eb516c for the past ca. 90 min of roughhousing (a couple of reboots and playing around with having to roll back ~80GiB of interrupted zfs recv into a dedup=sha256 dataset (just for fun and stress testing)).

evansus commented 10 years ago

@rottegift Glad to hear that!

@lundman yes I agree.

At least the vdev_iokit_context_t is currently a struct of just a few pointers:

typedef struct vdev_iokit_context {
    IOMemoryDescriptor * buffer;
    IOStorageCompletion completion;
} vdev_iokit_context_t;

The completion struct is defined in IOStorage.h:

struct IOStorageCompletion
{
    void *                    target;
    IOStorageCompletionAction action;
    void *                    parameter;
};

and IOStorageCompletionAction is also just a pointer to the callback function (from IOStorage.h):

typedef void (*IOStorageCompletionAction)(void *   target,
                                          void *   parameter,
                                          IOReturn status,
                                          UInt64   actualByteCount);

But yes, it still is an alloc per IO in this rev. I don't know if you noticed the IOCommandPools I tested in the previous rev - a bit of an experiment: ccde144b9375b683de00eb07f5e7fa28744b8e12

Along with the allocation, the bigger issue is IOBufferMemoryDescriptor::withAddress and ::prepare being called from vdev_iokit_strategy (which is called from vdev_iokit_io_start). At least afterwards it can issue the read() and write() calls asynchronously.

The best way to optimize this would be changing vdev_iokit_strategy so that all of its actual work is performed asynchronously, correct? (Perhaps using taskq?)

Other Async/sync function calls:

Currently this vdev-iokit branch (partially) issues async reads and writes, with completion callback.

However it would also be good to issue the flush cache requests 'async' with a completion callback, since it's async on all upstream repos.

Down the line, possibly Unmap, too - if/when we have an acceptable upstream ZFS trim patch to merge. :)

Looking at master, it seems vdev_disk.c has a synchronous vdev_disk_io_start, blocking for the duration of reads and writes.

Aside from that are a few minor issues with vdev_disk which I addressed in this branch: https://github.com/openzfsonosx/zfs/tree/vdev-disk-fixes Have not tested it though.

Another example is doAsyncReadWrite in zvolIO.cpp.

Would taskq's be an appropriate way to handle these function calls asynchronously? (By calling C -> C++ extern/wrapper functions like we're already doing?)

lundman commented 10 years ago

Yes, I estimated your vdev_iokit_context_t to be about 16 bytes in size, which is why I made a char [32] area in zio_t and used that to hold the context. It worked well enough as a proof of concept, and possibly shaved a second off, but it's hard to tell with my small test case. Either way, since we have APPLE-only entries in znode_t and zfsvfs_t, it isn't too odd to have them in zio_t. The biggest hassle is figuring out the header dependency (weighing the size impact of including iokit against using generic pointers and casting).

As for IOBufferMemoryDescriptor::withAddress - I was under the impression this call takes an existing address (and buffer) and maps it into iokit space, i.e., no actual allocations happen. Not that a map operation is free, but it is not as heavy as an actual allocation.

Similarly, prepare is used to page in memory, if required. But I believe zio uses wired memory, so it should never be paged out.

Third: I was under the impression that calling IOMedia::write and handing over a completion callback is already async. You mention flush-cache requests as a possible place we can enhance it, but I don't know if that is really worth doing. Not to discourage anyone from trying it and finding out, though. :)

But I am new to iokit, so expect confusion :)

evansus commented 10 years ago

Yes, the IOMedia::write is async, however that is called after the IOBufferMemoryDescriptor is allocated and prepared.

In vdev_iokit_io_start, I'd like to issue an async call to vdev_iokit_strategy

The io_context and BufferMD allocation/prepare are already being done in vdev_iokit_strategy - but only the IOMedia::read/write is actually async.

About the cache flush: first I noticed that other platforms issue this as an async call, so I was thinking it would be best to handle it the same way as other platforms. But I guess my question should be: is it OK to call IOMedia::synchronizeCache (synchronously) and then return ZIO_PIPELINE_STOP,

rather than an async flush, and cleanup in vdev_disk_ioctl_done.

Perhaps this would be a good question to post to the OpenZFS mailing list.

I have noticed that on a working system (not experiencing obvious issues), spindump shows some pretty long stack traces, where a whole stack is blocked waiting on other calls. This includes vdev_iokit_io_start, vdev_iokit_strategy, vdev_iokit_sync, etc.

From some off-cpu flame graphs, it appears vdev_iokit_io_start and vdev_iokit_sync calls are intermittently spending a while off-cpu. Some are 10-20 _micro_seconds, but intermittently vdev_iokit_sync takes up to 80 _milli_seconds

https://gist.github.com/evansus/e0e34b60ba6dd993b4be (the dtrace output appears to be sorted automatically from shortest to longest). One caveat: this machine is also running other experimental patches, including zvol-unmap, which I believe has a similar async/sync issue and may be contributing to the long times.

I'm new to the flame graphs as well though, so I could be using a poor dtrace script.

I wouldn't be surprised if my dtrace script is poor; I started with a basic script from Brendan Gregg's blog and changed it to profile some vdev-iokit functions: https://gist.github.com/evansus/5dbb9082c4f1f336e47f

How does that look?

Also, from the IOBufferMemoryDescriptor documentation, I was under the impression that a bufferMD does allocate its own memory:

IOBufferMemoryDescriptor

Overview
Provides a simple memory descriptor that allocates its own buffer memory.

I had hoped to use IOMemoryDescriptor instead, but MD::withAddress hasn't successfully read or written data into the zio->io_data buffer, while bufferMD::withAddress works fine.

Edit: Nevermind, read the number wrong. 80 milliseconds, not 800.

lundman commented 10 years ago

The flame graphs are neat, but I'm not sure I can help there; it's a bit like black magic. I'm sure you saw my wiki entry on them already, which is pretty much the culmination of my knowledge. Once I started to grep out just the ZFS calls it became more useful, though.

I didn't check the iokit sources, but the description is:

Create an IOMemoryDescriptor to describe one virtual range of the kernel task.

Which implies to me that we create a new descriptor, using the given address, but do not allocate more memory. We will need to peek in the sources to know for sure.

Anyway, explore anything that catches your fancy. I was hoping you would move the context into zio, since it seems undesirable to allocate a context in strategy and, worse, to deal with failure of said allocation. :)

rottegift commented 10 years ago

@evansus : when cache vdevs are missing at import, vdev_iokit and master don't automatically deal with the vdevs appearing subsequently, even if the devices match what zpool status -v expects. In master, if the devices have not been renamed, a zpool online pool dev / zpool clear pool dev seems to work. In vdev-iokit I had to zpool remove dev... the cache vdevs and zpool add cache dev... them with their new names.

Maybe you could add a fast usb3 thumb drive to one of your test setups to use as a cache vdev as you carry on with the iokit work - they work well for l2arc in front of spinny disks.

evansus commented 10 years ago

@rottegift Please try the latest commits to the vdev-iokit branch, cache and log devices survive renumbering after the latest changes (also updated from the current master branch).

It seems to function on my end with a few tests. I tried both manual import and import from cachefile after opening disk images in different sequence, etc.

Pool import is successful in all cases I've tried, with one minor issue to resolve: Importing from cachefile does not update the displayed diskNsN names (manual import does). The devices work normally, though - it's just showing the previous pathname in zpool status.

The solution to this will be updating vdev_path whenever it is necessary to import by physpath or guid, which I'm looking into.

evansus commented 10 years ago

@rottegift Ah, I re-read your comment and see what you mean. I just tested importing a pool with missing cache devices, then attached the disk image at a new device node - and I ran into the scenario you described.

zpool online, zpool replace, and even zpool replace -f couldn't resolve the pathname issue.

zpool remove tank disk3s1 then zpool add tank cache disk2s1 was necessary. So with USB devices that are attached after pool import, this could still be an issue.

Renumbered disks that are all available at import time should be fine, though. @ilovezfs this does address some of the issues we discussed recently, but it broke zpool split, which I'm working on resolving.

evansus commented 10 years ago

zpool split is resolved; tested split both with and without -R altroot (import the new pool, or leave it exported).

evansus commented 10 years ago

@rottegift Actually, after testing the behavior on master, missing cache devices behave the same there. When the device is in the UNAVAIL state, it doesn't offline in the same way that data and log vdevs do. zpool replace is also the same:

bash-3.2# zpool replace test2 disk2s1 disk3s1
invalid vdev specification
use '-f' to override the following errors:
/dev/disk3s1 is part of unknown pool 'test2'
bash-3.2# zpool replace -f test2 disk2s1 disk3s1
cannot replace disk2s1 with disk3s1: device is in use as a cache

But I would like to find a solution for one or both branches.

With vdev-iokit, this only occurs when the device is not available at import time. Renumbered devices are found and used, but zpool status displays the old name.

On master, it also happens if the device has been renumbered and it is auto-importing from cachefile (e.g. during startup, if a USB drive appears after the zfs.kext loaded).

rottegift commented 10 years ago

@evansus Neat, I'll try to test this today.

I have three questions arising from this:

Importing from cachefile does not update the displayed diskNsN names (manual import does). The devices work normally, though - it's just showing the previous pathname in zpool status.

Is there any way to extract the real names (zdb?) with the code in the present state?

If the wrong names are showing in zpool status and at some point one of the physical devices posts errors or needs administrator attention in some way (e.g. zpool {clear,remove,offline,online} ) or outright replacing, is it even possible to do so correctly?

Finally, have you tested hot spares at all?

ilovezfs commented 10 years ago

@rottegift I don't understand the point of the question "If the wrong names are showing in zpool status and at some point one of the physical devices posts errors or needs administrator attention in some way (e.g. zpool {clear,remove,offline,online} ) or outright replacing, is it even possible to do so correctly?"

Clearly this bug must be fixed such that the names are correct at all times.

rottegift commented 10 years ago

@ilovezfs Well obviously it needs fixing. The point of the question is whether there is a temporary workaround.

ilovezfs commented 10 years ago

@rottegift I hope not, lest it impede the motivation for fixing it.

evansus commented 10 years ago

@rottegift the issue has been resolved if you manually zpool export and zpool import the pool. The cache vdevs are now displayed properly, and updated in the pool config. No workaround is required.

@ilovezfs understood, but I believe the only remaining problem is an edge case that's not unique to this branch.

With the commits 94a6868ebb6fd9b15111a5ae151dcd6b3cebf680 and ed1463a6dc4b860714206880e8e1af71af69ba2b, data, log, and cache vdevs update their paths correctly when doing zpool import. The only issue is with zpool import -c /etc/zfs/zpool.cache.tmp when a cache device is missing.

The same issue happens on master - but as opposed to this branch, manually exporting and importing a pool does not solve it.

Besides export/import, there is at least one other way to resolve, as @rottegift pointed out. zpool remove tank disk3 followed by zpool add tank cache disk2 (adding back -o ashift if needed).

zpool online tank disk3 doesn't work, of course, as it's still looking for the device named 'disk3'. zpool replace tank disk3 disk2 reports that disk2 seems to be part of a pool named 'tank'. zpool replace -f tank disk3 disk2 seems to run into a bug on both the master and vdev-iokit branches: it reports the disk is already in use as a cache device. zpool labelclear, even with -f, fails as well.

Take a look at https://gist.github.com/evansus/fabf38f0279bd524e184 for some examples.

Also there is some interesting discussion about a separate (but related) issue on ZoL at https://github.com/zfsonlinux/zfs/issues/1530 including a recent post in the last few days. Related to that, @ryao added an option zpool status -g https://github.com/zfsonlinux/zfs/pull/2012.

However that ZoL discussion is a tangent - I haven't had any problem doing zpool remove tank disk3 whether disk3 was present or unavailable.

Since ZFS doesn't offline a degraded cache vdev in the same way as log and data vdevs, it is possible to have an imported pool with missing cache devices. This isn't specific to OS X; I believe it happens on all platforms. We just have more frequent device renaming.

@rottegift I haven't tested hot spares yet but I'll look into that as well.

rottegift commented 10 years ago

It was up and running for a bit, then blam:

Anonymous UUID:       EA3E4DC2-8F4D-9BF6-7D16-4BB6CA19A914

Sun Jun 15 09:42:54 2014
panic(cpu 6 caller 0xffffff800a2dbf5e): Kernel trap at 0xffffff7f8a92aa14, type 13=general protection, registers:
CR0: 0x000000008001003b, CR2: 0x0000000094b6b3e0, CR3: 0x000000000c7d5000, CR4: 0x00000000001606e0
RAX: 0x694d2d73746e655d, RBX: 0xffffff81f94ff3b8, RCX: 0xffffff824cc99a20, RDX: 0x694d2d73746e655d
RSP: 0xffffff81fbb0bda0, RBP: 0xffffff81fbb0bdd0, RSI: 0xffffff829e0a0c08, RDI: 0xffffff824cc99b20
R8:  0x0000000000000000, R9:  0xffffff800a801910, R10: 0x00000000000003ff, R11: 0xffffffffffffffff
R12: 0xffffff81f94ff3f8, R13: 0x0000000000018bf9, R14: 0xffffff81f94ff3d8, R15: 0xffffff81ff794028
RFL: 0x0000000000010202, RIP: 0xffffff7f8a92aa14, CS:  0x0000000000000008, SS:  0x0000000000000000
Fault CR2: 0x0000000094b6b3e0, Error code: 0x0000000000000000, Fault CPU: 0x6

Backtrace (CPU 6), Frame : Return Address
0xffffff81ea7cddf0 : 0xffffff800a222fa9 mach_kernel : _panic + 0xc9
0xffffff81ea7cde70 : 0xffffff800a2dbf5e mach_kernel : _kernel_trap + 0x7fe
0xffffff81ea7ce040 : 0xffffff800a2f3456 mach_kernel : _return_from_trap + 0xe6
0xffffff81ea7ce060 : 0xffffff7f8a92aa14 net.lundman.zfs : _zio_walk_parents + 0x94
0xffffff81fbb0bdd0 : 0xffffff7f8a931693 net.lundman.zfs : _zio_done + 0x1073
0xffffff81fbb0bef0 : 0xffffff7f8a92cbaa net.lundman.zfs : ___zio_execute + 0x12a
0xffffff81fbb0bf30 : 0xffffff7f8a92ca75 net.lundman.zfs : _zio_execute + 0x15
0xffffff81fbb0bf50 : 0xffffff7f8a81e257 net.lundman.spl : _taskq_thread + 0xc7
0xffffff81fbb0bfb0 : 0xffffff800a2d7127 mach_kernel : _call_continuation + 0x17
      Kernel Extensions in backtrace:
         net.lundman.spl(1.0)[38C86468-875F-383E-942A-456BE7165109]@0xffffff7f8a81a000->0xffffff7f8a82afff
         net.lundman.zfs(1.0)[F735715F-8766-303D-B38C-A78A8D7EDD23]@0xffffff7f8a82b000->0xffffff7f8aa3cfff
            dependency: com.apple.iokit.IOStorageFamily(1.9)[9B09B065-7F11-3241-B194-B72E5C23548B]@0xffffff7f8a7ec000
            dependency: net.lundman.spl(1.0.0)[38C86468-875F-383E-942A-456BE7165109]@0xffffff7f8a81a000

BSD process name corresponding to current thread: kernel_task
Boot args: -v keepsyms=y darkwake=0 -s

Mac OS version:
13D65

Kernel version:
Darwin Kernel Version 13.2.0: Thu Apr 17 23:03:13 PDT 2014; root:xnu-2422.100.13~1/RELEASE_X86_64
Kernel UUID: ADD73AE6-88B0-32FB-A8BB-4F7C8BE4092E
Kernel slide:     0x000000000a000000
Kernel text base: 0xffffff800a200000
System model name: Macmini6,2 (Mac-F65AE981FFA204ED)

System uptime in nanoseconds: 1443109003767

rottegift commented 10 years ago

And another

Anonymous UUID:       EA3E4DC2-8F4D-9BF6-7D16-4BB6CA19A914

Sun Jun 15 09:53:41 2014
panic(cpu 0 caller 0xffffff8022adbf5e): Kernel trap at 0xffffff7fa301ca11, type 14=page fault, registers:
CR0: 0x0000000080010033, CR2: 0x0000000000098000, CR3: 0x00000000187df12d, CR4: 0x00000000001606e0
RAX: 0x0000000000098000, RBX: 0xffffff8229c1bf10, RCX: 0x0000000000000000, RDX: 0xffffff8212396818
RSP: 0xffffff8229c1b920, RBP: 0xffffff8229c1b920, RSI: 0xffffff8212396818, RDI: 0xffffff8212396818
R8:  0x0000000000540000, R9:  0xffffff821221cb30, R10: 0xffffff8229c1b940, R11: 0xffffffffffffff00
R12: 0x0000000000100003, R13: 0xffffff805282ed20, R14: 0xffffff8229c1be70, R15: 0x0000000000000048
RFL: 0x0000000000010282, RIP: 0xffffff7fa301ca11, CS:  0x0000000000000008, SS:  0x0000000000000010
Fault CR2: 0x0000000000098000, Error code: 0x0000000000000002, Fault CPU: 0x0

Backtrace (CPU 0), Frame : Return Address
0xffffff8229c1b5b0 : 0xffffff8022a22fa9 mach_kernel : _panic + 0xc9
0xffffff8229c1b630 : 0xffffff8022adbf5e mach_kernel : _kernel_trap + 0x7fe
0xffffff8229c1b800 : 0xffffff8022af3456 mach_kernel : _return_from_trap + 0xe6
0xffffff8229c1b820 : 0xffffff7fa301ca11 net.lundman.spl : _list_insert_tail + 0x21
0xffffff8229c1b920 : 0xffffff7fa30759f7 net.lundman.zfs : _dsl_dir_tempreserve_space + 0xd7
0xffffff8229c1b9a0 : 0xffffff7fa305c017 net.lundman.zfs : _dmu_tx_try_assign + 0x447
0xffffff8229c1ba50 : 0xffffff7fa305bb5b net.lundman.zfs : _dmu_tx_assign + 0x3b
0xffffff8229c1ba80 : 0xffffff7fa31079fb net.lundman.zfs : _zfs_write + 0x9eb
0xffffff8229c1bd00 : 0xffffff7fa3116c6e net.lundman.zfs : _zfs_vnop_write + 0xae
0xffffff8229c1bd60 : 0xffffff8022bfde51 mach_kernel : _VNOP_WRITE + 0xe1
0xffffff8229c1bde0 : 0xffffff8022bf3be3 mach_kernel : _utf8_normalizestr + 0x703
0xffffff8229c1be50 : 0xffffff8022df22f1 mach_kernel : _write_nocancel + 0x1b1
0xffffff8229c1bef0 : 0xffffff8022df21dd mach_kernel : _write_nocancel + 0x9d
0xffffff8229c1bf50 : 0xffffff8022e40653 mach_kernel : _unix_syscall64 + 0x1f3
0xffffff8229c1bfb0 : 0xffffff8022af3c56 mach_kernel : _hndl_unix_scall64 + 0x16
      Kernel Extensions in backtrace:
         net.lundman.spl(1.0)[38C86468-875F-383E-942A-456BE7165109]@0xffffff7fa301a000->0xffffff7fa302afff
         net.lundman.zfs(1.0)[F735715F-8766-303D-B38C-A78A8D7EDD23]@0xffffff7fa302b000->0xffffff7fa323cfff
            dependency: com.apple.iokit.IOStorageFamily(1.9)[9B09B065-7F11-3241-B194-B72E5C23548B]@0xffffff7fa2fec000
            dependency: net.lundman.spl(1.0.0)[38C86468-875F-383E-942A-456BE7165109]@0xffffff7fa301a000
rottegift commented 10 years ago

Also, if /etc/zfs/zpool.cache exists, these three pools do not finish importing before I give up; with ssdpool as the only cached pool, it alone takes several minutes. A manual import of ssdpool takes only a second or so if there is no /etc/zfs/zpool.cache at boot, and manual imports of the others are also fast (several seconds of chattering spinny disks, rather than silence from the spinny disks and no completion within many minutes).

  pool: Donkey
 state: ONLINE
  scan: resilvered 92K in 0h0m with 0 errors on Fri May 23 10:06:09 2014
config:

    NAME          STATE     READ WRITE CKSUM
    Donkey        ONLINE       0     0     0
      mirror-0    ONLINE       0     0     0
        disk19s2  ONLINE       0     0     0
        disk20s2  ONLINE       0     0     0
        disk24s2  ONLINE       0     0     0
    logs
      mirror-1    ONLINE       0     0     0
        disk9s1   ONLINE       0     0     0
        disk16s1  ONLINE       0     0     0
    cache
      disk33s3    ONLINE       0     0     0
      disk32s3    ONLINE       0     0     0
      disk34      ONLINE       0     0     0

errors: No known data errors

  pool: Trinity
 state: ONLINE
  scan: scrub repaired 0 in 63h53m with 0 errors on Thu May 22 21:52:40 2014
config:

    NAME          STATE     READ WRITE CKSUM
    Trinity       ONLINE       0     0     0
      mirror-0    ONLINE       0     0     0
        disk31    ONLINE       0     0     0
        disk23s2  ONLINE       0     0     0
      mirror-1    ONLINE       0     0     0
        disk25s2  ONLINE       0     0     0
        disk28s2  ONLINE       0     0     0
      mirror-2    ONLINE       0     0     0
        disk27s2  ONLINE       0     0     0
        disk26s2  ONLINE       0     0     0
      mirror-3    ONLINE       0     0     0
        disk30s2  ONLINE       0     0     0
        disk29s2  ONLINE       0     0     0
    logs
      mirror-4    ONLINE       0     0     0
        disk17s1  ONLINE       0     0     0
        disk13s1  ONLINE       0     0     0
    cache
      disk33s4    ONLINE       0     0     0
      disk32s4    ONLINE       0     0     0

errors: No known data errors

  pool: ssdpool
 state: ONLINE
  scan: scrub repaired 0 in 0h59m with 0 errors on Wed Jun 11 17:37:54 2014
config:

    NAME        STATE     READ WRITE CKSUM
    ssdpool     ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        disk10  ONLINE       0     0     0
        disk8   ONLINE       0     0     0
    logs
      mirror-1  ONLINE       0     0     0
        disk12  ONLINE       0     0     0
        disk7   ONLINE       0     0     0
    cache
      disk32s5  ONLINE       0     0     0
      disk33s5  ONLINE       0     0     0

errors: No known data errors
rottegift commented 10 years ago

@evansus "No workaround is required" -- here I agree with ilovezfs in that while this allows for testing, there should never be a time when the zfs subsystem is giving incorrect answers. Ever.

evansus commented 10 years ago

It’s not giving incorrect answers anymore

evansus commented 10 years ago

Regarding 'workaround': 94a6868 updates the path of renamed devices, so there should no longer be any misreported names.

When did the panics occur exactly? After a successful manual import?

Both panic logs seem at first glance to be unrelated to vdev-iokit changes, but I'm still looking into it.

_dmu_tx_try_assign -> _dsl_dir_tempreserve_space -> _list_insert_tail: this appears to be in the ZFSVFS -> DMU layers.

_zio_done -> _zio_walk_parents: individual zios are used by vdevs, but I'm not sure why or how zio_walk_parents would be affected.

rottegift commented 10 years ago

@evansus Ok, well, I'm running the tip of the vdev-iokit tree, so I have that.

Certainly it's possible the panics are unrelated; I dropped them here only because I'm running this branch.

Yes, this was after successful manual imports of all three pools, and beating them up a bit with real use.

If panics become frequent I'll switch to master and start opening new tickets for each panic.

rottegift commented 10 years ago

Here's a cool one:

Anonymous UUID:       EA3E4DC2-8F4D-9BF6-7D16-4BB6CA19A914

Sun Jun 15 11:24:35 2014
panic(cpu 0 caller 0xffffff802fadbf5e): Kernel trap at 0xffffff7fb013dfe0, type 14=page fault, registers:
CR0: 0x000000008001003b, CR2: 0x0000000000000000, CR3: 0x0000000026608068, CR4: 0x00000000001606e0
RAX: 0x0000000000000010, RBX: 0xffffff7fb0004820, RCX: 0x000000000003eb90, RDX: 0xffffff821b05b0e8
RSP: 0xffffff821b05b0b0, RBP: 0xffffff821b05b0e0, RSI: 0x0000000000000000, RDI: 0xffffff821f4bb908
R8:  0x0000000000000001, R9:  0xffffff82427a6194, R10: 0x000000000000008b, R11: 0x0000000000000020
R12: 0xffffff8050f34800, R13: 0xffffff8050f34800, R14: 0xffffff8234346c28, R15: 0xffffff805a00b400
RFL: 0x0000000000010282, RIP: 0xffffff7fb013dfe0, CS:  0x0000000000000008, SS:  0x0000000000000010
Fault CR2: 0x0000000000000000, Error code: 0x0000000000000000, Fault CPU: 0x0

Backtrace (CPU 0), Frame : Return Address
0xffffff821b05ad40 : 0xffffff802fa22fa9 mach_kernel : _panic + 0xc9
0xffffff821b05adc0 : 0xffffff802fadbf5e mach_kernel : _kernel_trap + 0x7fe
0xffffff821b05af90 : 0xffffff802faf3456 mach_kernel : _return_from_trap + 0xe6
0xffffff821b05afb0 : 0xffffff7fb013dfe0 net.lundman.zfs : _nv_mem_zalloc + 0x20
0xffffff821b05b0e0 : 0xffffff7fb0142474 net.lundman.zfs : _nvp_buf_alloc + 0x44
0xffffff821b05b130 : 0xffffff7fb0142298 net.lundman.zfs : _nvs_decode_pairs + 0x78
0xffffff821b05b170 : 0xffffff7fb0142106 net.lundman.zfs : _nvs_operation + 0xc6
0xffffff821b05b1b0 : 0xffffff7fb0143ccb net.lundman.zfs : _nvs_embedded + 0xcb
0xffffff821b05b1f0 : 0xffffff7fb014318d net.lundman.zfs : _nvs_xdr_nvp_op + 0x1fd
0xffffff821b05b270 : 0xffffff7fb01422ca net.lundman.zfs : _nvs_decode_pairs + 0xaa
0xffffff821b05b2b0 : 0xffffff7fb0142106 net.lundman.zfs : _nvs_operation + 0xc6
0xffffff821b05b2f0 : 0xffffff7fb0141f15 net.lundman.zfs : _nvs_xdr + 0x85
0xffffff821b05b360 : 0xffffff7fb0141a9b net.lundman.zfs : _nvlist_common + 0x1eb
0xffffff821b05b3f0 : 0xffffff7fb0141db9 net.lundman.zfs : _nvlist_xunpack + 0x79
0xffffff821b05b440 : 0xffffff7fb0141d2d net.lundman.zfs : _nvlist_unpack + 0x4d
0xffffff821b05b490 : 0xffffff7fb00bffed net.lundman.zfs : _vdev_iokit_read_label + 0x1ed
0xffffff821b05b510 : 0xffffff7fb00c09d7 net.lundman.zfs : _vdev_iokit_find_by_guid + 0x107
0xffffff821b05b580 : 0xffffff7fb00c1134 net.lundman.zfs : _vdev_iokit_open_by_guid + 0x24
0xffffff821b05b5b0 : 0xffffff7fb00bf84a net.lundman.zfs : _vdev_iokit_open + 0x36a
0xffffff821b05b630 : 0xffffff7fb00b8ddb net.lundman.zfs : _vdev_open + 0x12b
0xffffff821b05b6b0 : 0xffffff7fb00b8b7b net.lundman.zfs : _vdev_open_children + 0x5b
0xffffff821b05b6e0 : 0xffffff7fb00c522f net.lundman.zfs : _vdev_mirror_open + 0x5f
0xffffff821b05b750 : 0xffffff7fb00b8ddb net.lundman.zfs : _vdev_open + 0x12b
0xffffff821b05b7d0 : 0xffffff7fb00b8b7b net.lundman.zfs : _vdev_open_children + 0x5b
0xffffff821b05b800 : 0xffffff7fb00cd1bf net.lundman.zfs : _vdev_root_open + 0x5f
0xffffff821b05b850 : 0xffffff7fb00b8ddb net.lundman.zfs : _vdev_open + 0x12b
0xffffff821b05b8d0 : 0xffffff7fb00a5beb net.lundman.zfs : _spa_load_impl + 0x1ab
0xffffff821b05ba10 : 0xffffff7fb00a016b net.lundman.zfs : _spa_load + 0x1fb
0xffffff821b05ba80 : 0xffffff7fb009f838 net.lundman.zfs : _spa_load_best + 0x98
0xffffff821b05baf0 : 0xffffff7fb009bc6a net.lundman.zfs : _spa_open_common + 0x19a
0xffffff821b05bb80 : 0xffffff7fb009c074 net.lundman.zfs : _spa_get_stats + 0x64
0xffffff821b05bc00 : 0xffffff7fb00f2457 net.lundman.zfs : _zfs_ioc_pool_stats + 0x37
0xffffff821b05bc30 : 0xffffff7fb00ef163 net.lundman.zfs : _zfsdev_ioctl + 0x643
0xffffff821b05bcf0 : 0xffffff802fc0d63f mach_kernel : _spec_ioctl + 0x11f
0xffffff821b05bd40 : 0xffffff802fbfe000 mach_kernel : _VNOP_IOCTL + 0x150
0xffffff821b05bdc0 : 0xffffff802fbf3e51 mach_kernel : _utf8_normalizestr + 0x971
0xffffff821b05be10 : 0xffffff802fdc1303 mach_kernel : _fo_ioctl + 0x43
0xffffff821b05be40 : 0xffffff802fdf2c66 mach_kernel : _ioctl + 0x466
0xffffff821b05bf50 : 0xffffff802fe40653 mach_kernel : _unix_syscall64 + 0x1f3
0xffffff821b05bfb0 : 0xffffff802faf3c56 mach_kernel : _hndl_unix_scall64 + 0x16
      Kernel Extensions in backtrace:
         net.lundman.zfs(1.0)[F735715F-8766-303D-B38C-A78A8D7EDD23]@0xffffff7fb002b000->0xffffff7fb023cfff
            dependency: com.apple.iokit.IOStorageFamily(1.9)[9B09B065-7F11-3241-B194-B72E5C23548B]@0xffffff7faffec000
            dependency: net.lundman.spl(1.0.0)[38C86468-875F-383E-942A-456BE7165109]@0xffffff7fb001a000
rottegift commented 10 years ago

When I boot with this branch and there is an /etc/zfs/zpool.cache I have problems, although I'll try to narrow them down. The most obvious one is that nothing happens until I manually issue a "zpool list" (I have deliberately done launchctl unload -wF /Library/LaunchDaemons/org.openzfs.zpool-autoimport.plist). At that point some zed activity kicks off:

15/06/2014 11:05:58.205 zed[1273]: eid=1 class=statechange 
15/06/2014 11:05:58.271 zed[1275]: eid=2 class=statechange 
15/06/2014 11:06:02.278 zed[1300]: eid=3 class=statechange 
15/06/2014 11:06:02.791 zed[1302]: eid=4 class=statechange 
15/06/2014 11:06:03.114 zed[1308]: eid=5 class=statechange 
15/06/2014 11:06:03.621 zed[1310]: eid=6 class=statechange 
15/06/2014 11:06:03.702 zed[1312]: eid=7 class=statechange 
15/06/2014 11:06:03.984 zed[1314]: eid=8 class=statechange 
15/06/2014 11:06:04.361 zed[1320]: eid=9 class=statechange 
15/06/2014 11:06:04.761 zed[1322]: eid=10 class=statechange 
15/06/2014 11:06:05.489 zed[1331]: eid=11 class=vdev.open_failed pool=Donkey
15/06/2014 11:06:05.802 zed[1333]: eid=12 class=vdev.open_failed pool=Donkey
15/06/2014 11:06:06.178 zed[1339]: eid=13 class=vdev.open_failed pool=Donkey
15/06/2014 11:06:09.454 zed[1362]: eid=14 class=zvol.create pool=Donkey
15/06/2014 11:06:09.472 zed[1371]: eid=14 class=zvol.create pool=Donkey/TM symlinked disk35
15/06/2014 11:06:09.522 zed[1378]: eid=15 class=zvol.create pool=Donkey
15/06/2014 11:06:09.535 zed[1386]: eid=15 class=zvol.create pool=Donkey/Caching symlinked disk36
15/06/2014 11:06:09.864 zed[1409]: eid=16 class=zvol.create pool=Donkey
15/06/2014 11:06:09.877 zed[1417]: eid=16 class=zvol.create pool=Donkey/TMMIS symlinked disk37

but "zpool list" never finishes. The two zvols are mounted fine, and I can do operations within them, but I cannot unmount them. No zfs filesystems in the same pool are mounted, and I cannot mount them manually.

I am guessing the problem is the presence of the zvols. I'll try starting up with a zpool.cache that does not have that pool (the others don't have zvols).

rottegift commented 10 years ago

With ssdpool (which has no zvols) as the only imported pool, a restart results in nothing happening until I issue zpool list, which ultimately imports ssdpool. The cache vdevs have new numbers upon reboot, and this was not caught at startup:

  pool: ssdpool
 state: ONLINE
status: One or more devices could not be opened.  Sufficient replicas exist for
    the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://zfsonlinux.org/msg/ZFS-8000-2Q
  scan: scrub in progress since Sun Jun 15 10:57:06 2014
    42.2G scanned out of 173G at 26.4M/s, 1h24m to go
    0 repaired, 24.46% done
config:

    NAME        STATE     READ WRITE CKSUM
    ssdpool     ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        disk16  ONLINE       0     0     0
        disk9   ONLINE       0     0     0
    logs
      mirror-1  ONLINE       0     0     0
        disk6   ONLINE       0     0     0
        disk10  ONLINE       0     0     0
    cache
      disk32s5  UNAVAIL      0     0     0  cannot open
      disk33s5  UNAVAIL      0     0     0  cannot open

errors: No known data errors

A manual import of Trinity does the right thing:

zpool status -v Trinity
  pool: Trinity
 state: ONLINE
  scan: scrub repaired 0 in 63h53m with 0 errors on Thu May 22 21:52:40 2014
config:

    NAME          STATE     READ WRITE CKSUM
    Trinity       ONLINE       0     0     0
      mirror-0    ONLINE       0     0     0
        disk30    ONLINE       0     0     0
        disk31s2  ONLINE       0     0     0
      mirror-1    ONLINE       0     0     0
        disk34s2  ONLINE       0     0     0
        disk32s2  ONLINE       0     0     0
      mirror-2    ONLINE       0     0     0
        disk29s2  ONLINE       0     0     0
        disk33s2  ONLINE       0     0     0
      mirror-3    ONLINE       0     0     0
        disk27s2  ONLINE       0     0     0
        disk28s2  ONLINE       0     0     0
    logs
      mirror-4    ONLINE       0     0     0
        disk14s1  ONLINE       0     0     0
        disk7s1   ONLINE       0     0     0
    cache
      disk23s4    ONLINE       0     0     0
      disk22s4    ONLINE       0     0     0

errors: No known data errors

Note that the status shows the cache vdevs of ssdpool (imported automatically) overlap the storage vdevs of Trinity (imported manually), which is wrong and dangerous.

The cache vdevs were there and numbered as disk23sX and disk24sX from early in startup, i.e., before ZFS was loaded.

rottegift commented 10 years ago

A restart with both ssdpool and Trinity in the cache leads to an automatic import with the cache vdevs having the wrong numbers.

None of the zfs datasets mounted.

"zpool export ssdpool" has been running for about six minutes, which is a surprisingly long time. Perhaps it's stuck waiting on disk arbitration (i.e., things are stuck behind the canmount=on datasets that have not yet mounted)?

  pool: Trinity
 state: ONLINE
status: One or more devices could not be opened.  Sufficient replicas exist for
    the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://zfsonlinux.org/msg/ZFS-8000-2Q
  scan: scrub repaired 0 in 63h53m with 0 errors on Thu May 22 21:52:40 2014
config:

    NAME          STATE     READ WRITE CKSUM
    Trinity       ONLINE       0     0     0
      mirror-0    ONLINE       0     0     0
        disk21    ONLINE       0     0     0
        disk22s2  ONLINE       0     0     0
      mirror-1    ONLINE       0     0     0
        disk33s2  ONLINE       0     0     0
        disk32s2  ONLINE       0     0     0
      mirror-2    ONLINE       0     0     0
        disk26s2  ONLINE       0     0     0
        disk27s2  ONLINE       0     0     0
      mirror-3    ONLINE       0     0     0
        disk34s2  ONLINE       0     0     0
        disk31s2  ONLINE       0     0     0
    logs
      mirror-4    ONLINE       0     0     0
        disk13s1  ONLINE       0     0     0
        disk8s1   ONLINE       0     0     0
    cache
      disk23s4    UNAVAIL      0     0     0  cannot open
      disk22s4    UNAVAIL      0     0     0  cannot open

errors: No known data errors

  pool: ssdpool
 state: ONLINE
status: One or more devices could not be opened.  Sufficient replicas exist for
    the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://zfsonlinux.org/msg/ZFS-8000-2Q
  scan: scrub canceled on Sun Jun 15 12:02:41 2014
config:

    NAME        STATE     READ WRITE CKSUM
    ssdpool     ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        disk11  ONLINE       0     0     0
        disk6   ONLINE       0     0     0
    logs
      mirror-1  ONLINE       0     0     0
        disk17  ONLINE       0     0     0
        disk5   ONLINE       0     0     0
    cache
      disk32s5  UNAVAIL      0     0     0  cannot open
      disk33s5  UNAVAIL      0     0     0  cannot open

errors: No known data errors
evansus commented 10 years ago

Hm. If it is still trying to export, do you mind running sudo spindump and posting a gist of the output?

rottegift commented 10 years ago

Ha, just as I was typing sure, it exited:

$ zpool export ssdpool
cannot export 'ssdpool': pool is busy

rottegift commented 10 years ago

Since I believe I can reproduce this (and now think the long running time of "zpool list" et al. is not related to the presence of zvols but rather to the automounting of datasets) I'll try it again momentarily. I have an idea about what caused the panic that I want to test first.

rottegift commented 10 years ago

The panic happens when "zfs send -R -I"-ing hits a particular dataset. I think I can get a reduced test case shortly.

The system is back up and in the absence of the zpool-autoimport launchd job nothing is happening. (The first thing the script does that wakes things up is the zpool list). A "zpool status -v" is hanging. I'll dwic about the spin dump.

rottegift commented 10 years ago

Oops, it returned really quickly this time. Weird. The cache vdevs are wrong again, and the zfs datasets have yet to mount.

rottegift commented 10 years ago

Lonnnnng wait on "zpool export ssdpool". No mounted filesystems.

https://gist.github.com/rottegift/00a1ae8fb4491b711e04

a couple minutes later

https://gist.github.com/rottegift/ff93ec481246461a791a