openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

L2ARC Capacity and Usage FUBAR - significant performance penalty apparently associated #3400

Closed sempervictus closed 9 years ago

sempervictus commented 9 years ago

While running #3189 I came across something very strange - the L2ARC has developed magical powers and turned a 64G SSD into a bottomless pit of data. This in turn has resulted in crushing performance degradation and apparent demonic possession of the SCST host running this export.

My assumption is that the following output is based on the number of times L2ARC has wrapped around:

zpool list -v
NAME                         SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
(omitted since this is a client system and the underlying VDEVs are named with serial numbers)
...
...
cache                           -      -      -         -      -      -
  dm-name-crypt_2718620040  59.6G   667G  16.0E         -     0%  1118%
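The numbers in the cache row can be sanity-checked with a little arithmetic. The sketch below is only an illustration, not the actual ZFS accounting code: it assumes (hypothetically) that FREE is computed as SIZE minus ALLOC in an unsigned 64-bit integer, and that CAP is simply ALLOC as a percentage of SIZE. The helper names `free_u64` and `capacity_percent` are made up for this example.

```python
# Illustration only: reproduces the odd zpool list figures under the
# ASSUMPTION that FREE = SIZE - ALLOC is stored as an unsigned 64-bit value.
GIB = 1 << 30
EIB = 1 << 60
U64_MASK = (1 << 64) - 1


def free_u64(size_bytes, alloc_bytes):
    """Subtract the way a uint64_t would: wrap around on underflow."""
    return (size_bytes - alloc_bytes) & U64_MASK


def capacity_percent(size_bytes, alloc_bytes):
    """Allocated space as a whole-number percentage of the device size."""
    return alloc_bytes * 100 // size_bytes


# Values taken from the cache row above (rounded as displayed).
size = int(59.6 * GIB)
alloc = 667 * GIB

free = free_u64(size, alloc)
print(f"FREE = {free / EIB:.1f}E")
print(f"CAP  = {capacity_percent(size, alloc)}%")
print(f"ALLOC / SIZE = {alloc / size:.1f}x")
```

Under these assumptions the wrapped subtraction lands just below 2^64 bytes, which displays as 16.0E, and the capacity comes out at roughly 11 times the device size, in line with the reported 1118% (the small difference comes from the rounded inputs). That would be consistent with ALLOC being over-counted rather than the SSD actually holding 667G.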

Reads from the pool are suffering badly; IOWait on a send operation is >50%. About to drop the cache device and hope it gets fixed, but figured I should put this up here for now.

behlendorf commented 9 years ago

Thanks guys for running this to ground. The fix in #3451 looks good to me, although I noticed that despite positive reviews for illumos and originally being written for FreeBSD, neither platform has merged it yet. Anybody happen to know what's holding things up? It would be nice to have an illumos issue number for this to make tracking it easier.

kernelOfTruth commented 9 years ago

@behlendorf Glad the l2arc works correctly now (even for everyday tasks it makes a remarkable difference for me - so it would be a shame to have to disable it, especially for bigger workloads https://github.com/zfsonlinux/zfs/issues/3259#issuecomment-92260770 )

From looking around the web I got the impression that the issue might not have gotten wide enough recognition for users to go looking for solutions yet, and/or it hasn't been triggered often enough to be "relevant" (perhaps they're not running workloads that stressful?)

Why there's no activity on FreeBSD or Illumos - that I can't say

sempervictus commented 9 years ago

Just to chime in, I've been using #3451 successfully since it hit, on multiple hosts, including contentious SCST boxes feeding pairs of 40Gbit links on which a cloudstack resides, all of which beat the hell out of ARC. @kernelOfTruth: you've been an immense help with the recent patch stacks for testing and hunting this down. Thank you very much.

avg-I commented 9 years ago

@behlendorf @kernelOfTruth regarding the illumos / FreeBSD (in-)activity: I made these changes for HybridCluster / ClusterHQ and we ran with them for many months while our product was based on FreeBSD. After submitting the changes to illumos and FreeBSD I switched to working on other things and didn't have time to follow through. Apparently no one else was as interested in the changes at that time.

Fortunately, now there are interested parties :) and I've got a little bit more time: https://reviews.freebsd.org/D2764 https://reviews.freebsd.org/D2789

One thing I noticed is that there were recent significant changes to ARC code in illumos:
illumos/illumos-gate@89c86e32293a30cdd7af530c38b2073fee01411c
illumos/illumos-gate@244781f10dcd82684fd8163c016540667842f203 <- this one is big
illumos/illumos-gate@2fd872a734cf486007a8dba532cec52bfb4d40e5
illumos/illumos-gate@a52fc310ba80fa3b2006110936198de7f828cd94

So I am looking forward to fun times of merging those changes with mine.

behlendorf commented 9 years ago

@avg-I thanks for following up with us! It would be great if you could refresh and finalize the patches to fix this issue, which clearly impacts everybody. We've just finished pulling in those significant ARC changes so it should be fairly easy for us to apply the fix.

avg-I commented 9 years ago

@behlendorf #3491 is equivalent to what I have

kernelOfTruth commented 9 years ago

@avg-I one of your patches is about to be merged ( https://github.com/zfsonlinux/zfs/pull/3521 )

Thanks again for your hard work =)

avg-I commented 9 years ago

@kernelOfTruth thank you too!