gmzang / maczfs

Automatically exported from code.google.com/p/maczfs

Hang with Large Amounts of I/O (spotlight) #77

Closed: GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
While copying about 500 GB of data to a new ZFS pool (using ditto or MacPorts 
rsync - I tried both), I repeatedly encountered a filesystem hang. Any 
operation on the pool other than getting information about it would hang, and 
the copy itself stalled, leaving the copying processes idle and kernel_task at 
100% CPU (out of the possible 2400% that Activity Monitor can report on my 
machine; i.e., kernel_task was pegging a single CPU core).

Operations that would work during the hang: zpool status, zfs get
Operations that would not: zpool scrub, zpool export, Finder eject, restarting 
the machine (would quit all processes and then spin forever)

After encountering this a few times I decided to exclude the pool from 
Spotlight indexing - the only other thing on the system that would access the 
pool - while doing my massive file copy; after excluding the pool, the file 
copy completed successfully. Subsequent enabling of Spotlight successfully 
indexed the pool.

It is unclear to me what kind of debugging information would be useful to 
narrow down what I presume must be a livelock of some sort (given the CPU 
usage) in the filesystem code... but I'm happy to try (by copying another 
massive chunk of information to the pool) if somebody suggests what I should 
try to capture.

Original issue reported on code.google.com by dmz...@gmail.com on 22 Jan 2011 at 3:42

GoogleCodeExporter commented 8 years ago
As a note, the same hang occurred using both 74.0.1 and 75.0.10 (which I tried 
in the vain hope of maybe getting rid of the hang); I was sufficiently wary of 
kernel panics that I did not try with the 77 build.

Original comment by dmz...@gmail.com on 22 Jan 2011 at 3:52

GoogleCodeExporter commented 8 years ago
[deleted comment]

GoogleCodeExporter commented 8 years ago
Indeed, it only happens on a root pool; previously, when I had created a number 
of child filesystems and "ditto"d to those, there was no hang.

Original comment by dmz...@gmail.com on 24 Jan 2011 at 7:51

GoogleCodeExporter commented 8 years ago
This same hang also seems to happen with Spotlight on my pool when I am not 
copying huge amounts of data to it; today, I found both mds and kernel_task 
using 100% CPU (each), and most filesystem operations (such as copying files 
from a remote site) just hang on the ZFS pool (while still working on my HFS 
filesystems).

Original comment by dmz...@gmail.com on 29 Jan 2011 at 9:04

GoogleCodeExporter commented 8 years ago
Not certain if this is related, but I had a kernel panic today on this system 
that was clearly ZFS's fault. This is the panic report:

Interval Since Last Panic Report:  604957 sec
Panics Since Last Report:          1
Anonymous UUID:                    EF0DB43C-35E7-4E23-80E5-E7A4DA2F8F62

Tue Feb  1 19:56:54 2011
panic(cpu 16 caller 0xffffff7f813a47f6): "mutex_enter: locking against myself!"@/Users/alex/Projects/MacZFS/usr/src/maczfs/kernel/zfs_context.c:448
Backtrace (CPU 16), Frame : Return Address
0xffffff80b1d53a00 : 0xffffff8000204b99 
0xffffff80b1d53b00 : 0xffffff7f813a47f6 
0xffffff80b1d53b20 : 0xffffff7f8138e46d 
0xffffff80b1d53b60 : 0xffffff800023d35b 
0xffffff80b1d53b90 : 0xffffff80002379d7 
0xffffff80b1d53be0 : 0xffffff8000237b59 
0xffffff80b1d53c20 : 0xffffff8000237e54 
0xffffff80b1d53c40 : 0xffffff8000237e8c 
0xffffff80b1d53c60 : 0xffffff7f8139469e 
0xffffff80b1d53d30 : 0xffffff7f8138b856 
0xffffff80b1d53e90 : 0xffffff7f8138b9ef 
0xffffff80b1d53ed0 : 0xffffff80002fbe52 
0xffffff80b1d53f40 : 0xffffff80004e0f44 
0xffffff80b1d53fa0 : 0xffffff80002e2944 
      Kernel Extensions in backtrace (with dependencies):
         com.bandlem.mac.zfs.fs(75.0.10)@0xffffff7f81351000->0xffffff7f813b8fff

BSD process name corresponding to current thread: mds

Mac OS version:
10J567

Kernel version:
Darwin Kernel Version 10.6.0: Wed Nov 10 18:11:58 PST 2010; root:xnu-1504.9.26~3/RELEASE_X86_64
System model name: MacPro5,1 (Mac-F221BEC8)

System uptime in nanoseconds: 54019957869461
unloaded kexts:
com.apple.driver.AppleIntel8254XEthernet    2.1.1b7 (addr 0xffffff7f80fb5000, size 0x122880) - last unloaded 70894946509
loaded kexts:
com.bandlem.mac.zfs.fs  75.0.10
at.obdev.nke.LittleSnitch   2.2.05
com.apple.filesystems.afpfs 9.7 - last loaded 3632434385600
com.apple.nke.asp_tcp   5.0
com.apple.filesystems.autofs    2.1.0
com.apple.driver.AppleHWSensor  1.9.3d0
com.apple.driver.AppleUpstreamUserClient    3.4.5
com.apple.driver.AppleMCCSControl   1.0.17
com.apple.driver.AppleTyMCEDriver   1.0.2d2
com.apple.driver.AGPM   100.12.19
com.apple.driver.AppleMikeyHIDDriver    1.2.0
com.apple.kext.ATIFramebuffer   6.2.6
com.apple.driver.AppleHDA   1.9.9f12
com.apple.driver.AppleMikeyDriver   1.9.9f12
com.apple.ATIRadeonX3000    6.2.6
com.apple.driver.AudioAUUC  1.13
com.apple.Dont_Steal_Mac_OS_X   7.0.0
com.apple.driver.AudioIPCDriver 1.1.6
Model: MacPro5,1, BootROM MP51.007F.B03, 12 processors, 6-Core Intel Xeon, 2.66 GHz, 8 GB, SMC 1.39f11
Graphics: ATI Radeon HD 5870, ATI Radeon HD 5870, PCIe, 1024 MB
Memory Module: global_name
AirPort: spairport_wireless_card_type_airport_extreme (0x14E4, 0x8E), Broadcom BCM43xx 1.0 (5.10.131.36.1)
Bluetooth: Version 2.3.8f7, 2 service, 19 devices, 1 incoming serial ports
Network Service: Ethernet 1, Ethernet, en0
PCI Card: ATI Radeon HD 5870, Display, Slot-1
Serial ATA Device: HL-DT-ST DVD-RW GH61N
Serial ATA Device: INTEL SSDSA2M080G2GC, 74.53 GB
Serial ATA Device: WDC WD20EARS-00MVWB0, 1.82 TB
Serial ATA Device: WDC WD20EARS-00MVWB0, 1.82 TB
Serial ATA Device: WDC WD20EARS-00MVWB0, 1.82 TB
Serial ATA Device: WDC WD20EARS-00MVWB0, 1.82 TB
USB Device: BRCM2046 Hub, 0x0a5c  (Broadcom Corp.), 0x4500, 0x5a100000
USB Device: Bluetooth USB Host Controller, 0x05ac  (Apple Inc.), 0x8215, 0x5a110000
USB Device: USB Receiver, 0x046d  (Logitech Inc.), 0xc506, 0x1a200000
USB Device: UC-100KMA, 0x0557  (ATEN International Co. Ltd.), 0x0204, 0x3d100000
FireWire Device: built-in_hub, Up to 800 Mb/sec

Original comment by dmz...@gmail.com on 2 Feb 2011 at 5:03

GoogleCodeExporter commented 8 years ago
Results of "panic-decode" for this panic:

195654_babylon5.panic
0xffffff8000204b99 <panic+608>: mov    0x41e6b5(%rip),%esi        # 0xffffff8000623254 <panic_is_inited>
0xffffff7f813a47f6 <mutex_enter+48>:    add    %al,(%rax)
0xffffff7f8138e46d <zfs_vnop_reclaim+90>:   sbb    %rax,%rax
0xffffff800023d35b <VNOP_RECLAIM+44>:   leaveq 
0xffffff80002379d7 <mount_ref+933>: test   %eax,%eax
0xffffff8000237b59 <mount_ref+1319>:    movzwl 0x70(%r12),%eax
0xffffff8000237e54 <vnode_put_locked+161>:  mov    %rbx,%rdi
0xffffff8000237e8c <vnode_put+33>:  mov    %eax,%r12d
0xffffff7f8139469e <zfs_zget_internal+482>: in     $0x48,%eax
0xffffff7f8138b856 <zfs_vget_internal+100>: add    (%rax),%al
0xffffff7f8138b9ef <zfs_vfs_vget+118>:  push   %rbp
0xffffff80002fbe52 <munge_user64_stat+492>: mov    %eax,%ebx
0xffffff80004e0f44 <unix_syscall64+544>:    mov    %eax,%r12d
0xffffff80002e2944 <hndl_unix_scall64+20>:  mov    %r12,%rsp

Original comment by dmz...@gmail.com on 2 Feb 2011 at 5:07
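
Editorial note: read bottom-up, the decoded backtrace above appears to show zfs_zget_internal calling vnode_put, which leads through VNOP_RECLAIM into zfs_vnop_reclaim, where mutex_enter panics with "locking against myself!". That pattern is consistent with a thread trying to take a lock it already holds. Which lock is involved cannot be determined from the trace alone. The following is a minimal userspace sketch of that failure mode, not MacZFS code: it uses a POSIX error-checking mutex, and the names node_lock, get_node, and reclaim are hypothetical stand-ins chosen for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical lock standing in for whatever mutex the kernel code holds. */
static pthread_mutex_t node_lock;

/* Initialise node_lock as an error-checking mutex, so that a second lock
 * attempt by the owning thread returns EDEADLK instead of hanging. */
static int init_lock(void) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0) rc = pthread_mutex_init(&node_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Hypothetical "reclaim" path: takes the lock on its own.
 * Returns 0 on success, or the error from pthread_mutex_lock. */
static int reclaim(void) {
    int rc = pthread_mutex_lock(&node_lock);
    if (rc != 0) return rc;   /* EDEADLK if this thread already owns it */
    /* ... tear down per-node state here ... */
    pthread_mutex_unlock(&node_lock);
    return 0;
}

/* Hypothetical "get" path: holds the lock and then, through a release
 * callback, re-enters reclaim() on the same thread. */
static int get_node(void) {
    int rc = pthread_mutex_lock(&node_lock);
    assert(rc == 0);
    rc = reclaim();           /* re-entrant call while the lock is held */
    pthread_mutex_unlock(&node_lock);
    return rc;
}
```

After init_lock() succeeds, get_node() returns EDEADLK because reclaim() is entered with node_lock already held by the same thread, while a direct call to reclaim() succeeds. A kernel mutex with an ownership check panics at the equivalent point rather than returning an error.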

GoogleCodeExporter commented 8 years ago
I think Spotlight is causing thrashing, which is exposing a problem. If you have 
a child filesystem which Spotlight won't see, that may be less problematic.

I need to investigate the call path further at some point; the best I can 
suggest is to disable Spotlight on the root pool to mitigate problems, or 
disable it prior to a large file operation. You could turn off automatic 
indexing but then run mdimport periodically (say, nightly) to get a 
sort-of-up-to-date index.

Original comment by alex.ble...@gmail.com on 16 Feb 2011 at 10:21

GoogleCodeExporter commented 8 years ago

Original comment by alex.ble...@gmail.com on 16 Feb 2011 at 10:21

GoogleCodeExporter commented 8 years ago
MacZFS has been discontinued.  Please switch to https://openzfsonosx.org/

Original comment by googlelogin@bjoern-kahl.de on 28 Jul 2015 at 10:01