zfsrogue / zfs-crypto

ZFS On Linux with crypto patches

SPLError: 11346:0:(zio.c:792:zio_write()) #39

Open FransUrbo opened 10 years ago

FransUrbo commented 10 years ago

Can't say for certain whether this is a ZoL issue or a ZFS-Crypto one (I can't test without crypto, since I have the encryption feature enabled on my pool).

Trying to create a filesystem with DEBUG enabled on both spl and zfs gives:

Message from syslogd@Celia at Dec  6 13:26:19 ...
 kernel:[  225.960618] SPLError: 11346:0:(zio.c:792:zio_write()) ASSERTION(zp->zp_checksum >= ZIO_CHECKSUM_OFF && zp->zp_checksum < ZIO_CHECKSUM_FUNCTIONS && zp->zp_compress >= ZIO_COMPRESS_OFF && zp->zp_compress < ZIO_COMPRESS_FUNCTIONS && DMU_OT_IS_VALID(zp->zp_type) && zp->zp_crypt >= ZIO_CRYPT_OFF && zp->zp_crypt < ZIO_CRYPT_FUNCTIONS && zp->zp_type < DMU_OT_NUMTYPES && zp->zp_level < 32 && zp->zp_copies > 0 && zp->zp_copies <= spa_max_replication(spa)) failed

Message from syslogd@Celia at Dec  6 13:26:19 ...
 kernel:[  225.960943] SPLError: 11346:0:(zio.c:792:zio_write()) SPL PANIC

Doing a git blame on zio.c gives:

b128c09f zfs/lib/libzpool/zio.c (Brian Behlendorf  2008-12-03 12:09:06 -0800  782)      ASSERT(zp->zp_checksum >= ZIO_CHECKSUM_OFF &&
b128c09f zfs/lib/libzpool/zio.c (Brian Behlendorf  2008-12-03 12:09:06 -0800  783)          zp->zp_checksum < ZIO_CHECKSUM_FUNCTIONS &&
b128c09f zfs/lib/libzpool/zio.c (Brian Behlendorf  2008-12-03 12:09:06 -0800  784)          zp->zp_compress >= ZIO_COMPRESS_OFF &&
b128c09f zfs/lib/libzpool/zio.c (Brian Behlendorf  2008-12-03 12:09:06 -0800  785)          zp->zp_compress < ZIO_COMPRESS_FUNCTIONS &&
b4192bb9 module/zfs/zio.c       (Brian Behlendorf  2012-12-13 15:24:15 -0800  786)          DMU_OT_IS_VALID(zp->zp_type) &&
c5731b91 module/zfs/zio.c       (ZFS Rogue         2012-11-06 12:01:00 +0000  787)            zp->zp_crypt >= ZIO_CRYPT_OFF &&
c5731b91 module/zfs/zio.c       (ZFS Rogue         2012-11-06 12:01:00 +0000  788)            zp->zp_crypt < ZIO_CRYPT_FUNCTIONS &&
b128c09f zfs/lib/libzpool/zio.c (Brian Behlendorf  2008-12-03 12:09:06 -0800  789)          zp->zp_type < DMU_OT_NUMTYPES &&
b128c09f zfs/lib/libzpool/zio.c (Brian Behlendorf  2008-12-03 12:09:06 -0800  790)          zp->zp_level < 32 &&
428870ff module/zfs/zio.c       (Brian Behlendorf  2010-05-28 13:45:14 -0700  791)          zp->zp_copies > 0 &&
03c6040b module/zfs/zio.c       (George Wilson     2013-05-10 12:47:54 -0700  792)          zp->zp_copies <= spa_max_replication(spa));

Commits b128c09f, b4192bb9 and 428870ff are all old (Dec 3 2008, Dec 13 2012 and May 28 2010 respectively), 03c6040b is quite new (May 10 2013), and then there's the zfs-crypto one (c5731b91, Nov 6 2012)...

After the SPLError, zfs hangs. Creating a filesystem without DEBUG enabled works (or at least it did a few days ago when I was running without it; I have cherry-picked some commits since).

lundman commented 10 years ago

Looks like the assert is stale, relying on old style. You can comment it out, or figure out which of the tests fail.

FransUrbo commented 10 years ago

Adding some printk()'s, I get (numerous times - all identical):

ZIO_CHECKSUM_OFF=2, ZIO_CHECKSUM_FUNCTIONS=11, ZIO_COMPRESS_OFF=2, ZIO_COMPRESS_FUNCTIONS=16, ZIO_CRYPT_OFF=2, ZIO_CRYPT_FUNCTIONS=10, DMU_OT_NUMTYPES=55

And just before the SPL Panic:

zp->zp_checksum=7, zp->zp_compress=3, zp->zp_crypt=2, zp->zp_type=196, zp->zp_level=0, zp->zp_copies=3

Full log at http://bayour.com/misc/SPLError.txt.

FransUrbo commented 10 years ago

Changing the way I debug a couple of times, I get this:

zp->zp_type(196) < DMU_OT_NUMTYPES(55)

FransUrbo commented 10 years ago

So somehow, somewhere, zp_type is set to an invalid number. I'm going to take this up on the ZoL issue tracker as well, just in case...

Ignore for now.