openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

zfs rollback crash #431

Closed asnow0 closed 13 years ago

asnow0 commented 13 years ago

Running latest ZFS git on Ubuntu 10.04 (using kernel 3.0.0) and I'm seeing a crash mainly during ZFS rollbacks. The rollback hangs, and any processes that were accessing data on the ZFS volumes become blocked. Below is the call trace I get when this issue occurs:

[ 670.770635] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[ 670.770814] IP: [] sa_object_size+0x9/0x20 [zfs]
[ 670.770989] PGD 137afa067 PUD 125c39067 PMD 0
[ 670.771103] Oops: 0000 [#2] SMP
[ 670.771184] CPU 2
[ 670.771227] Modules linked in: sch_sfq cls_u32 sch_cbq vboxnetadp vboxnetflt vboxdrv zfs(P) zcommon(P) znvpair(P) zavl(P) zunicode(P) spl zlib_deflate hwmon_vid coretemp ipmi_msghandler ppdev i2c_i801 i915 lp drm_kms_helper serio_raw parport_pc parport raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov pata_jmicron ahci libahci raid6_pq async_tx raid1 raid0 multipath linear
[ 670.780014]
[ 670.780014] Pid: 14320, comm: php Tainted: P D 3.0.0-datto9 #23 Gigabyte Technology Co., Ltd. D525TUD/D525TUD
[ 670.780014] RIP: 0010:[] [] sa_object_size+0x9/0x20 [zfs]
[ 670.780014] RSP: 0018:ffff8801292e1e18 EFLAGS: 00010206
[ 670.780014] RAX: 0000000028c6acc8 RBX: ffff880137c681b8 RCX: 0000000000000009
[ 670.780014] RDX: ffff8801292e1f58 RSI: ffff8801292e1f50 RDI: 0000000000000000
[ 670.780014] RBP: ffff8801292e1e18 R08: ffffffff81148d65 R09: ffff880104e6fb81
[ 670.780014] R10: ffff88013f006400 R11: ffff8801292e1d48 R12: ffff8801292e1ef8
[ 670.780014] R13: ffff880137c68048 R14: ffff880129195000 R15: 0000000000000000
[ 670.830096] FS: 00007f64a2e54720(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000
[ 670.830096] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 670.830096] CR2: 0000000000000020 CR3: 00000001297c7000 CR4: 00000000000006e0
[ 670.830096] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 670.830096] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 670.830096] Process php (pid: 14320, threadinfo ffff8801292e0000, task ffff880137111440)
[ 670.830096] Stack:
[ 670.830096] ffff8801292e1e48 ffffffffa0273243 ffff880104e6fb40 ffff880123e8d400
[ 670.830096] ffff880137c681b8 ffff8801292e1ef8 ffff8801292e1e58 ffffffffa0288195
[ 670.830096] ffff8801292e1e98 ffffffff8114211e 0000000000000000 ffff88013b2edf00
[ 670.830096] Call Trace:
[ 670.830096] [] zfs_getattr_fast+0x73/0xb0 [zfs]
[ 670.881069] [] zpl_getattr+0x15/0x20 [zfs]
[ 670.881069] [] vfs_getattr+0x4e/0x80
[ 670.881069] [] vfs_fstatat+0x70/0x90
[ 670.881069] [] vfs_stat+0x1b/0x20
[ 670.881069] [] sys_newstat+0x24/0x50
[ 670.881069] [] system_call_fastpath+0x16/0x1b
[ 670.881069] Code: 48 89 f3 e8 3a 77 fc ff 48 c7 43 20 00 00 00 00 48 83 c4 08 5b c9 c3 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 66 66 66 66 90
[ 670.881069] 8b 7f 20 e8 3e ee fc ff c9 c3 66 66 66 2e 0f 1f 84 00 00 00
[ 670.881069] RIP [] sa_object_size+0x9/0x20 [zfs]
[ 670.881069] RSP
[ 670.881069] CR2: 0000000000000020
[ 670.882719] ---[ end trace e090e4ed91176baa ]---

If there's any more information that would be helpful, just ask and I'll try and provide it.

rohan-puri commented 13 years ago

Can you provide reproduction steps? That would be helpful.

asnow0 commented 13 years ago

Here's what seems to be causing the issue:

- A ZFS snapshot is taken on a particular ZFS filesystem.
- Data in that filesystem is modified.
- A zfs rollback is run on the ZFS filesystem to the snapshot that was previously created.
- The rollback hangs and the call trace I posted appears in dmesg.

This seems to only happen on certain ZFS filesystems, since we have several machines where everything's working fine but certain machines are seeing this same failure.

On 11/2/2011 4:54 AM, rohan-puri wrote:

> Can you provide, reproduction steps. It will be helpful

Alex Snow Developer/System Administrator Datto Inc (203)665-6423 asnow@dattobackup.com

rohan-puri commented 13 years ago

Thanks for the steps. I tried to reproduce it, but it is NOT reproducible on my end.

gunnarbeutner commented 13 years ago

Steps to reproduce:

  1. zfs snapshot tank@v1
  2. Terminal 1: while true; do ls /tank; done
  3. Terminal 2: zfs rollback tank@v1
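
The steps above can be sketched as a single script, which may be easier to rerun than juggling two terminals. This is only a sketch under assumptions: the helper name `repro_rollback` and the `ZFS` override variable are hypothetical (not part of ZFS), the dataset is assumed to be mounted at `/<dataset>` unless a second argument says otherwise, and it needs root on a scratch pool. On an affected system the rollback may hang instead of returning.

```shell
#!/bin/sh
# Hypothetical helper, not part of ZFS: runs the three steps in one shell.
# Usage: repro_rollback [dataset] [mountpoint]
# Set ZFS="echo zfs" to print the zfs commands instead of running them.
repro_rollback() {
    dataset="${1:-tank}"
    mountpoint="${2:-/$dataset}"
    snap="${dataset}@v1"
    zfs_cmd="${ZFS:-zfs}"

    # Step 1: take the snapshot.
    $zfs_cmd snapshot "$snap" || return 1

    # Step 2: keep listing the mountpoint in the background
    # (this stands in for "Terminal 1").
    ( while true; do ls "$mountpoint"; done ) >/dev/null 2>&1 &
    ls_pid=$!

    # Step 3: roll back while the listing loop is still running.
    status=0
    $zfs_cmd rollback "$snap" || status=$?

    # Stop the background loop before returning.
    kill "$ls_pid" 2>/dev/null || true
    wait "$ls_pid" 2>/dev/null || true
    return "$status"
}
```

Calling `repro_rollback tank` corresponds to the exact commands listed above; check dmesg afterwards for the sa_object_size trace.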

behlendorf commented 13 years ago

This looks pretty clear-cut. Thanks, Gunnar, I'll get your fix merged in.