The stack trace appears to be bogus following `kstat_seq_show()`. The perl program must have been reading stuff from `/proc/spl/zfs`; do you know which items it might have been reading?
Ah. I was running `./arcstat.pl -f hit%,miss%,read,l2hit%,l2miss%,l2read,ph%,pm%,mm% 1` at the time, so that could be it.
I gather it was only reading `/proc/spl/kstat/zfs/arcstats` from the looks of it. Has this ever happened before? If it happens again, try to get a stack trace of all processes on the system. I just tried a trivial reproducer and couldn't get it to cause any oopses.
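For what it's worth, a minimal reproducer along these lines would just re-read the arcstats kstat in a tight loop. This is a sketch of my own (an assumption about what a "trivial reproducer" might look like, not the exact test run above):

```c
/*
 * Hypothetical reproducer sketch: repeatedly read the arcstats kstat the
 * way arcstat.pl does, to exercise concurrent readers of the kstat file.
 * Build with: cc -o kstat_read kstat_read.c
 */
#include <stdio.h>

int
main(void)
{
	char buf[8192];

	for (;;) {
		FILE *fp = fopen("/proc/spl/kstat/zfs/arcstats", "r");

		if (fp == NULL) {
			perror("fopen");
			return (1);
		}

		/* Drain the whole file, as a seq_file consumer would. */
		while (fread(buf, 1, sizeof (buf), fp) > 0)
			;

		fclose(fp);
	}

	return (0);
}
```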
@dweeezil "The stack trace appears to be bogus following the kstat_seq_show()."
Because you can't see how to get from kstat_seq_show()
to dbuf_stats_hash_table_data()
?
I think the missing piece of the puzzle is:

```c
dbuf_stats_hash_table_init()
{
	...
	kstat_set_raw_ops(ksp, dbuf_stats_hash_table_headers,
	    dbuf_stats_hash_table_data, dbuf_stats_hash_table_addr);
	...
}
```

I.e. `ksp->ks_raw_ops.data() == dbuf_stats_hash_table_data()`, giving us:

```c
kstat_seq_show()
{
	...
	rc = ksp->ks_raw_ops.data(
	    ksp->ks_raw_buf, ksp->ks_raw_bufsize, p);
	...
}
```

=>

```c
kstat_seq_show()
{
	...
	rc = dbuf_stats_hash_table_data(
	    ksp->ks_raw_buf, ksp->ks_raw_bufsize, p);
	...
}
```

...which explains the call trace.
@chrisrd Yes, I guess I didn't look very closely. Are you running with `spl_kmem_cache_slab_limit=16384` (or set to anything > 0)?
@chrisrd The only module settings I've made are to `zfs_arc_min`/`zfs_arc_max` (listed above).
@pipitone OK, some people would put the spl settings in `spl.conf`. It looks like the dnode's `db_dnode_handle` is NULL. I gather the script was reading `/proc/spl/kstat/zfs/dbufs`. Were you also running `dbufstat.py`?

In any case, I've got to wonder if simply skipping those dbufs would be a sufficient fix. Presumably it had been `dbuf_clear()`'d.
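To illustrate why a cleared handle is fatal here, this is a simplified, self-contained sketch (not the actual ZFS source; the struct layouts and function names below are illustrative only) of the kind of dereference the dbufs kstat formatter performs, and how skipping the dbuf avoids the crash:

```c
/*
 * Simplified model of the failure mode under discussion: the formatter
 * reaches the dnode through db->db_dnode_handle, so a dbuf whose handle
 * was already cleared during eviction would dereference NULL.
 */
#include <stdio.h>
#include <stddef.h>

typedef struct dnode { unsigned long dn_object; } dnode_t;
typedef struct dnode_handle { dnode_t *dnh_dnode; } dnode_handle_t;

typedef struct dmu_buf_impl {
	dnode_handle_t *db_dnode_handle;	/* cleared when the dbuf is torn down */
} dmu_buf_impl_t;

/* Rough analogue of formatting one row of /proc/spl/kstat/zfs/dbufs. */
static int
format_dbuf_row(char *buf, size_t size, dmu_buf_impl_t *db)
{
	/* Without this check, a mid-eviction dbuf is a NULL dereference. */
	if (db->db_dnode_handle == NULL)
		return (snprintf(buf, size, "(evicting dbuf, skipped)\n"));

	return (snprintf(buf, size, "object %lu\n",
	    db->db_dnode_handle->dnh_dnode->dn_object));
}

int
main(void)
{
	char line[64];
	dnode_t dn = { .dn_object = 42 };
	dnode_handle_t dnh = { .dnh_dnode = &dn };
	dmu_buf_impl_t live = { .db_dnode_handle = &dnh };
	dmu_buf_impl_t evicting = { .db_dnode_handle = NULL };

	format_dbuf_row(line, sizeof (line), &live);
	fputs(line, stdout);
	format_dbuf_row(line, sizeof (line), &evicting);
	fputs(line, stdout);
	return (0);
}
```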
@dweeezil I wasn't running `dbufstat.py`, no. Just `arcstat.pl`.

So, does any of this explain why ZFS/NFS stopped responding?
@dweeezil I agree: for this to have happened, the `db->db_dnode_handle` must be NULL, and something on the system must have been reading `/proc/spl/kstat/zfs/dbufs`. It also looks like `db->db_dnode_handle` can be set to NULL outside the protection of `db->db_mtx`; see `dbuf_do_evict()->dbuf_destroy()`. I think we should skip any dbuf in the `DB_EVICTING` state; this flag is set under the lock.
```diff
diff --git a/module/zfs/dbuf_stats.c b/module/zfs/dbuf_stats.c
index 0cad9ef..b95f254 100644
--- a/module/zfs/dbuf_stats.c
+++ b/module/zfs/dbuf_stats.c
@@ -147,6 +147,12 @@ dbuf_stats_hash_table_data(char *buf, size_t size, void *data)
 		}
 
 		mutex_enter(&db->db_mtx);
+
+		if (db->db_state == DB_EVICTING) {
+			mutex_exit(&db->db_mtx);
+			continue;
+		}
+
 		mutex_exit(DBUF_HASH_MUTEX(h, dsh->idx));
 
 		length = __dbuf_stats_hash_table_data(buf, size, db);
```
@pipitone Yes, basically you hit a kernel panic. For now, just make sure nothing on your system is accessing `/proc/spl/kstat/zfs/dbufs` (including monitoring software) and it shouldn't happen again. In the meantime we'll fix this race/bug for the next release.
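If it helps to verify that, here is a hypothetical little checker (not part of this issue or of ZFS; similar in spirit to `lsof`/`fuser`) that scans `/proc/*/fd` for anything currently holding the dbufs kstat open:

```c
/* Hypothetical helper: report processes with /proc/spl/kstat/zfs/dbufs open. */
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define	DBUFS_PATH	"/proc/spl/kstat/zfs/dbufs"

int
main(void)
{
	DIR *proc = opendir("/proc");
	struct dirent *pd;

	if (proc == NULL) {
		perror("opendir /proc");
		return (1);
	}

	while ((pd = readdir(proc)) != NULL) {
		char fddir[PATH_MAX];
		DIR *fds;
		struct dirent *fd;

		/* Only look at numeric (pid) directories. */
		if (strspn(pd->d_name, "0123456789") != strlen(pd->d_name))
			continue;

		snprintf(fddir, sizeof (fddir), "/proc/%s/fd", pd->d_name);
		if ((fds = opendir(fddir)) == NULL)
			continue;	/* no permission, or process exited */

		while ((fd = readdir(fds)) != NULL) {
			char link[PATH_MAX], target[PATH_MAX];
			ssize_t n;

			snprintf(link, sizeof (link), "%s/%s", fddir, fd->d_name);
			if ((n = readlink(link, target, sizeof (target) - 1)) < 0)
				continue;
			target[n] = '\0';
			if (strcmp(target, DBUFS_PATH) == 0)
				printf("pid %s has %s open\n", pd->d_name, DBUFS_PATH);
		}
		closedir(fds);
	}
	closedir(proc);
	return (0);
}
```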
System was running fine up until the oops. Users tell me there were a few copy and tar operations going on at the time. NFS suddenly stopped responding. On the server, `zfs list` would hang, but `zfs list tank` would return. Running Ubuntu 12.04.4 LTS with ECC RAM.

I'm running with two pools: