openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs
Other
10.55k stars 1.74k forks source link

looping in zfs_unlinked_drain? #2240

Closed prometheanfire closed 10 years ago

prometheanfire commented 10 years ago

here are all backtraces (and a couple of sysrq-w) thown in.

The system is still up, so if you want any live debugging, tell me now :D

forgot the link... https://gist.github.com/prometheanfire/9988183 and the patches I'm using... https://gist.github.com/9999082

prometheanfire commented 10 years ago

could it be related to this or no? https://github.com/zfsonlinux/spl/pull/340

behlendorf commented 10 years ago

@prometheanfire I don't think this is related to the spl issue. I assume the only problem this is causing in the rcu stall? It would be useful to get the perf top output for the system if you're seeing it take a large amount of cpu time.

Beyond that if you're rebooting the server it would be very useful just to run with the latest master code. It contains @ryao's zfs_zget() change but not the additional writepage work. If you're able to cause problems we'll want to consider reverting those changes.

ryao commented 10 years ago

@behlendorf I asked @prometheanfire to run perf record -F 997 -p $PID -g -- sleep 10 on the hung thread to determine if it had deadlocked or it was looping infinitely. The resulting flame graph strongly suggests that the thread is looping infinitely:

http://dev.gentoo.org/~ryao/zfs-issue-2240.svg

The pre-processed data from perf script | ./stackcollapse-perf.pl is available at a pastebin:

http://bpaste.net/show/199721/

Interestingly, the infinite loop appears to be in zfs_unlinked_drain(), rather than zfs_zget() as I had thought.

avg-I commented 10 years ago

This is just a guess. Does anybody know if it is safe to modify a ZAP object while iterating over it with zap cursor (within the iteration loop) ?

Let me explain the question. I see the following in ZoL version of zfs_unlinked_drain, which is not present in other OpenZFS variants:

        /*
         * Iterate over the contents of the unlinked set.
         */
        for (zap_cursor_init(&zc, zsb->z_os, zsb->z_unlinkedobj);
            zap_cursor_retrieve(&zc, &zap) == 0;
            zap_cursor_advance(&zc)) {
...
                /*
                 * If this is an attribute directory, purge its contents.
                 */
                if (S_ISDIR(ZTOI(zp)->i_mode) && (zp->z_pflags & ZFS_XATTR)) {
                        /*
                         * We don't need to check the return value of
                         * zfs_purgedir here, because zfs_rmnode will just
                         * return this xattr directory to the unlinked set
                         * until all of its xattrs are gone.
                         */
                        (void) zfs_purgedir(zp);
                }
...
        }
        zap_cursor_fini(&zc);

So, the code iterates over object IDs in z_unlinkedobj and it may call zfs_purgedir which ends up adding stuff to z_unlinkedobj.

avg-I commented 10 years ago

I can reproduce this almost every time I crash a system or forcefully reboot it (e.g. power cycle).

ryao commented 10 years ago

@avg-I How many files do you have in the filesystem? Is the forced crash/reboot under any particular workload?

behlendorf commented 10 years ago

@avg-I Could you try the patches in #2573, they're designed to address this problem. They've passed all the automated testing and held up perfectly but I'd like to make sure they fix your test case.

Specifically these changes revert the code back so it largely matches the OpenZFS implementation. However, a few iput()'s were left asynchronous to avoid certain deadlocks which can occur under Linux but not the other platforms.

Does anybody know if it is safe to modify a ZAP object while iterating over it with zap cursor (within the iteration loop) ?

This is safe in the sense that it won't deadlock. But unless you lock the entire zap for the traversal you might entries due to concurrent adds/removes. So it's a good thing to avoid if possible.