openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

apt-get/dpkg commands are really slow on a ZFS rootfs #3857

Closed. RJVB closed this issue 5 years ago.

RJVB commented 9 years ago

This came up on IRC and I've been asked to file an issue about it. It will mostly be a set of observations, since I evidently cannot easily compare the same distribution on the same computer across various filesystems.

I'm running Kubuntu 14.04 LTS off a ZFS root. I use an EXT3 /boot partition. I originally installed the OS to a FS supported by the installer that comes with the Kubuntu live images, and then rsynced the result onto my ZFS rootfs.

Since then I have tried to tune the pool by restricting copies=2 to the directories that really deserve it, and by moving directories with more or less "volatile" content onto datasets that have sync disabled, or even checksums disabled as well.

One class of operations that has always been much slower since moving to ZFS is installing, removing or upgrading packages with apt-get or dpkg. I think this is related to the database(s) involved; the step where it says "reading database" takes much longer than I'd expect. (Expectation here is based on overall performance differences with another, faster laptop running the same distribution off a btrfs partition.) For comparison, upgrading or installing a new kernel with accompanying headers is noticeably faster when I disable sync; progress messages appear in much faster succession as soon as sync is disabled. As a result I put /var/cache/apt on a dataset that has sync and compression disabled (but checksums enabled, and copies=2), which made database access somewhat faster, but still very slow.
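For the record, setting up such a dataset looks roughly like this (a sketch; the pool name rpool and the dataset name are just examples, not my actual layout):

# hypothetical dataset for /var/cache/apt: no sync, no compression, but checksums on and copies=2
> sudo zfs create -o sync=disabled -o compression=off -o copies=2 \
    -o mountpoint=/var/cache/apt rpool/var-cache-apt
# verify the effective properties
> zfs get sync,compression,checksum,copies rpool/var-cache-apt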

My hunch is that dpkg uses database algorithms or settings that are poorly suited to a db file stored on ZFS.

FransUrbo commented 9 years ago

It's not dpkg. It's the copies=2 that's the problem… It means that it saves every byte TWICE in two different places. That takes extra time….

Not much you can do, other than create a new filesystem with copies=1 (which kind'a defeats the purpose :).
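(For reference, a sketch of checking and changing the property; the dataset name is made up. Note that lowering copies only applies to data written afterwards; existing blocks keep both copies.)

> zfs get copies rpool/ROOT/kubuntu
> sudo zfs set copies=1 rpool/ROOT/kubuntu    # only affects newly written data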

RJVB commented 9 years ago

On Wednesday September 30 2015 07:15:13 Turbo Fredriksson wrote:

It's not dpkg. It's the copies=2 that's the problem… It means that it saves every byte TWICE in two different places. That takes extra time….

Yes, but that shouldn't affect the time required for reading the database, should it? (One could even imagine a really clever copies= implementation being faster for reads, by picking the closer of the two copies.)

R.

ryao commented 9 years ago

It could be that prefetch is hurting us in apt's database. #3765 might help if that is the case.

RJVB commented 9 years ago

On Wednesday September 30 2015 07:59:12 Richard Yao wrote:

It could be that prefetch is hurting us in apt's database. #3765 might help if that is the case.

How safe is that patch, applied over the current stable version for Ubuntu (0.6.5.1-1)?

RJVB commented 9 years ago

Another potentially relevant observation that I forgot to mention: it's especially the first database traversal (or whatever the "reading database ... XXX files currently installed" operation does) that is really slow. "First" meaning after not having (re)installed or upgraded anything for a while. Subsequent calls are much faster, possibly even reasonably fast given my hardware (hard to assess).

Makes the issue tricky to debug ...
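One way to make the cold-versus-warm difference visible without actually installing anything is to time a plain traversal of the dpkg database directory twice in a row (a sketch; the second run should be served from the cache):

> time find /var/lib/dpkg -type f | wc -l
# repeat immediately; if caching is the issue, this run completes much faster
> time find /var/lib/dpkg -type f | wc -l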

kernelOfTruth commented 9 years ago

@RJVB it's not ready yet, since the prerequisite (needed) additional changes aren't in ZOL's code, so:

not safe

(the buildbot statuses give a hint as to how "safe" pull requests mostly are; WIP == work in progress)

RJVB commented 9 years ago

On Wednesday September 30 2015 09:36:04 kernelOfTruth aka. kOT, Gentoo user wrote:

@RJVB it's not ready yet, since the prerequisite (needed) additional changes aren't in ZOL's code, so:

not safe

(the buildbot statuses give a hint as to how "safe" pull requests mostly are; WIP == work in progress)

Heh, I know what WIP stands for, and saw that the buildbots were running into issues ... but I also saw CentOS, and I tend to equate that with AncientOS ;)

R.

behlendorf commented 9 years ago

@RJVB I suspect this was caused by a recent performance regression which shipped in 0.6.5 and 0.6.5.1. It's been addressed in 0.6.5.2, which is now available. Can you install 0.6.5.2 and see if it addresses the issue?

https://github.com/zfsonlinux/zfs/releases/tag/zfs-0.6.5.2

RJVB commented 9 years ago

On Wednesday September 30 2015 11:39:22 Brian Behlendorf wrote:

@RJVB I suspect this was caused by a recent performance regression which shipped in 0.6.5 and 0.6.5.1. It's been addressed in 0.6.5.2, which is now available. Can you install 0.6.5.2 and see if it addresses the issue?

I didn't repeat it here, but I've always had the slowness in this application. It was one reason I installed my gf's machine on a btrfs root.

I'll install 0.6.5.2 as soon as it appears in apt, and update this ticket accordingly.

R.

FransUrbo commented 9 years ago

@RJVB Did 0.6.5.2 solve the problem? If so, could you close this please?

RJVB commented 9 years ago

On Sunday October 04 2015 04:53:52 Turbo Fredriksson wrote:

@RJVB Did 0.6.5.2 solve the problem? If so, could you close this please?

It's very hard to determine whether the issue is really resolved; as I said before, caching means that apt/dpkg commands that do not involve (un)installing packages complete (much) faster when repeating them after the initial run.

That said, subsequent sudo apt-get update runs complete about 2x faster on the machine running btrfs, which is in line with my experience that the CPU on that machine is about twice as fast as the one in the netbook with the ZFS root. Moving /var/lib/dpkg to a dataset with compression disabled seems to have decreased dpkg-related runtimes somewhat (so much for the idea that writing to disk is so slow that the gain from the reduced file size compensates for the CPU time spent on compression).

You wouldn't happen to know if the apt database is accessed with a fixed record size, would you?
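(If it were, one could in principle match the dataset's recordsize to it; a sketch, with a made-up dataset name. Changing recordsize only affects files written afterwards.)

> zfs get recordsize rpool/var-lib-dpkg
> sudo zfs set recordsize=16K rpool/var-lib-dpkg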

R.

dweeezil commented 9 years ago

I did a bit of testing recently with the apt/dpkg suite and there are two interesting things: first, all access to the database files (pkgcache.bin, etc.) is via mmap(), and second, there's a lot of fsync() activity. I did not pursue it any further because typical apt/dpkg commands on the particular test system ran at almost identical speeds regardless of whether /var/cache/apt was on ext4 or on zfs. I did not try moving /var/lib/apt to zfs.
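For anyone wanting to reproduce that observation, the mmap()/fsync() traffic is easy to see with strace; a sketch, using a harmless apt command:

# tally the mmap and sync-related syscalls apt makes while rebuilding its cache
> sudo strace -f -c -e trace=mmap,fsync,fdatasync,msync apt-get check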

RJVB commented 9 years ago

On Sunday October 04 2015 04:53:52 Turbo Fredriksson wrote:

@RJVB Did 0.6.5.2 solve the problem? If so, could you close this please?

Actually, I don't think it made a difference. I just did an apt-get install of a simple -dev package, and the "Reading database" step after downloading the archive advanced at about 5 seconds per 5% (and afterwards told me I have 473814 files and directories currently installed).

R.

yhaenggi commented 8 years ago

Just a heads-up that this is still an issue.

RJVB commented 8 years ago

On Thursday February 25 2016 00:38:43 K1773R wrote:

Just a heads-up that this is still an issue.

For me it is indeed. Were you asking, or were you saying that it affects you too?

R.

yhaenggi commented 8 years ago

yes, it affects me too.

dweeezil commented 8 years ago

If the slowness is during the "Reading package lists..." phase, the issue is the creation of the /var/cache/apt/pkgcache.bin file. We've deployed a lot of Ubuntu Trusty systems and the apt-get program definitely runs very slow while the temp file is mmapped and being populated:

# grep pkg /proc/21258/maps
7f6218cfc000-7f621a4fc000 rw-s 00000000 00:11 243124              /var/cache/apt/pkgcache.bin.X77Hr5

The slowness only seems to occur when running the stock 3.13 kernel. When we update these systems to use the "Wily enablement stack", which has a 4.2 kernel, the problem goes away.

I don't have any Debian-based testing systems on which to try to track down the problem so I've not pursued it any further.

RJVB commented 8 years ago

On Thursday February 25 2016 05:03:27 Tim Chase wrote:

The slowness only seems to occur when running the stock 3.13 kernel. When we update these systems to use the "Wily enablement stack", which has a 4.2 kernel, the problem goes away.

I'm running one of the latest 3.14 kernels, so it's not just the 3.13 kernel. I'm currently not able to update beyond 3.14 (i.e. not willing to give up certain software that doesn't support later kernels yet).

R.

aTanCS commented 6 years ago

I have the same issue inside an lxc container on a ZFS pool. Reading the dpkg database and unpacking packages take ages. iotop shows z_null_int at 99% load all the time, but dd writes are fast. It could be the same issue as #6171. container:

dd if=/dev/zero of=/root/test.txt bs=2048 count=1000000
1000000+0 records in
1000000+0 records out
2048000000 bytes (2.0 GB, 1.9 GiB) copied, 20.2756 s, 101 MB/s

host:

  pool: vmpool
 state: ONLINE
  scan: scrub repaired 12K in 0h5m with 0 errors on Sun Jan 14 00:29:31 2018
config:

    NAME                        STATE     READ WRITE CKSUM
    vmpool                      ONLINE       0     0     0
      mirror-0                  ONLINE       0     0     0
        wwn-0x5000c500a3534c19  ONLINE       0     0     0
        wwn-0x5000c500a3524678  ONLINE       0     0     0
      mirror-1                  ONLINE       0     0     0
        wwn-0x5000c500a3526027  ONLINE       0     0     0
        wwn-0x5000c500a35218bd  ONLINE       0     0     0

errors: No known data errors

host

dpkg -l | grep zfs
ii  libzfs2linux                         0.7.4-pve2~bpo9                amd64        OpenZFS filesystem library for Linux
ii  zfs-initramfs                        0.7.4-pve2~bpo9                all          OpenZFS root filesystem capabilities for Linux - initramfs
ii  zfsutils-linux                       0.7.4-pve2~bpo9                amd64        command-line tools to manage OpenZFS filesystems

container

iozone -n 128M -g 1G -r 16 -O -a C 1
    Iozone: Performance Test of File I/O
            Version $Revision: 3.429 $
        Compiled for 64 bit mode.
        Build: linux-AMD64 

    Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                 Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                 Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                 Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                 Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                 Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                 Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
                 Vangel Bojaxhi, Ben England, Vikentsi Lapa.

    Run began: Wed Jan 31 11:14:18 2018

    Using minimum file size of 131072 kilobytes.
    Using maximum file size of 1048576 kilobytes.
    Record Size 16 kB
    OPS Mode. Output is in operations per second.
    Auto Mode
    Command line used: iozone -n 128M -g 1G -r 16 -O -a C 1
    Time Resolution = 0.000001 seconds.
    Processor cache size set to 1024 kBytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.
                                                              random    random     bkwd    record    stride                                    
              kB  reclen    write  rewrite    read    reread    read     write     read   rewrite      read   fwrite frewrite    fread  freread
          131072      16    35504    43888    57747    64994    35141    38663    86044     50897     45671    40054    37652    51210    64318
          262144      16    35945    38430    86103    88087    77852    40302    59836     51114     16688    37669    36340    47860    65142
          524288      16    35390    35174    44695    61146    16254    31436    60745     52621     14921    35688    33951    47900    66149
         1048576      16    34166     6162    46198    62465    14998    25206    57312     54533     14722    35196    31557    48225    66803

iozone test complete.

RJVB commented 6 years ago

On Wednesday January 31 2018 10:35:41 aTanCS wrote:

It could be the same issue as #6171.

That'd be good news, because that issue is apparently fixed. However, I have been seeing this slowness since well before the regression mentioned in the fix commit message.

RJVB commented 5 years ago

This issue continues to plague 0.7.12 (and Linux 4.14). I have been getting a hunch that the /var/lib/dpkg/info directory is a, if not the, culprit. That dir contains a very large number of tiny files, 2 or 3 per installed package, and apparently those files aren't always removed when packages are uninstalled. I've seen other situations where a key/value database-in-a-flat-directory led to dismal performance on ZFS (and on FAT16 O:^)).
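The scale of that directory is easy to check; a sketch (comparing the two counts hints at leftovers from removed packages):

# number of control files dpkg keeps, vs. number of currently installed packages
> ls /var/lib/dpkg/info | wc -l
> dpkg -l | grep -c '^ii'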

There must be ways to improve this at the ZFS level (when you can add as many disks and/or as much RAM as you want), but when you're just (ab?)using ZFS as a safe filesystem on a kickabout Linux notebook those are simply not an option.

The best fix is also not really an option: getting the dpkg system to use a proper, fast key/value database (LMDB would probably do).

Here's a workaround that I'm testing, with initial encouraging results:

# create a sparse disk image file; /var/cache is on a dataset with copies=2 compression=lz4 but sync=disabled for speed (at a tiny risk; mostly benefits file deletion)
> sudo truncate -s 512M /var/cache/apt_dpkg.img
# create an XFS filesystem because it's good with small files; no need for checksums at this level (I think)
# ext4 probably works just as well
> sudo mkfs.xfs -b size=512 -m crc=0 -L dpkgStore /var/cache/apt_dpkg.img
# mount to a temp. place and populate the disk image:
> sudo mount -v -o loop /var/cache/apt_dpkg.img /mnt
> (cd /var/lib/dpkg && sudo rsync -aAXH . /mnt)
> sudo umount -v /mnt
# add a line like this to /etc/fstab:
# /var/cache/apt_dpkg.img      /var/lib/dpkg   xfs     rw,noatime,nodiratime,loop      0       0
# make backup, create new mountpoint, mount
> sudo mv /var/lib/dpkg{,.bak} ; sudo mkdir /var/lib/dpkg
> sudo mount -v /var/lib/dpkg

I'm not losing on disk space either:

> du -hcs /var/lib/dpkg.bak /var/cache/apt_dpkg.img /var/lib/dpkg; \
    du -hcs --apparent-size /var/lib/dpkg.bak /var/cache/apt_dpkg.img /var/lib/dpkg
384M    /var/lib/dpkg.bak
95M     /var/cache/apt_dpkg.img
122M    /var/lib/dpkg
599M    total
115M    /var/lib/dpkg.bak
512M    /var/cache/apt_dpkg.img
116M    /var/lib/dpkg
743M    total

We'll see how this evolves over time. For now I went with the easier and safer solution; the alternative would be to set up a sparse zvol with the same properties as the dataset holding the apt and dpkg directories, and create the XFS filesystem in there. I'm not used to working with zvols, hence "easier and safer" (because I'm not putting my pool at risk); it does mean I'll be making very regular syncs between /var/lib/dpkg and /var/lib/dpkg.bak. I also like the fact that it's trivial (provided you keep backups) to recreate the sparse disk image when the existing one starts to degrade.
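For completeness, the zvol variant I decided against would look something like this (an untested sketch; pool and volume names are made up):

# sparse 512M zvol instead of a loop file; redundancy comes from the pool itself
> sudo zfs create -s -V 512M -o compression=lz4 rpool/dpkgvol
> sudo mkfs.xfs -b size=512 -m crc=0 -L dpkgStore /dev/zvol/rpool/dpkgvol
> sudo mount -v -o noatime /dev/zvol/rpool/dpkgvol /var/lib/dpkg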

Edit, some results from the tcsh time built-in:

> time sudo fgrep BLABLA -R /var/lib/dpkg/info/
0.548 user_cpu 0.599 kernel_cpu 0:01.07 total_time 105.6%CPU {7368M 0F 2320R 220I 0O 4w 256c}
> time sudo fgrep BLABLA -R /var/lib/dpkg.bak/info/
0.749 user_cpu 0.856 kernel_cpu 0:01.52 total_time 104.6%CPU {7380M 0F 2329R 188654I 0O 39w 409c}

The big differences here are in the number of input operations (I) and voluntary and involuntary context switches (w and c) all of which are lower with the sparse disk image approach.

bunder2015 commented 5 years ago

Just throwing it out there, but why wouldn't Gentoo have the same problems? The portage tree is made up of 163498 files consuming roughly 1.5 GB of space.

belperite commented 5 years ago

Doesn't dpkg use fsync as well? Have you tried it with sync turned off in ZFS?
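(dpkg's fsync behaviour can also be relaxed on the dpkg side with its force-unsafe-io option, trading crash safety for speed; a sketch:)

# tell dpkg to skip its fsync-heavy "safe I/O" behaviour
> echo force-unsafe-io | sudo tee /etc/dpkg/dpkg.cfg.d/unsafe-io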


RJVB commented 5 years ago

why wouldn't Gentoo have the same problems? The portage tree is made up of 163498 files consuming roughly 1.5 GB of space.

No idea, but it must have something to do with how that tree is accessed. Maybe it does what MacPorts does with its port tree: index file(s).

Come to think of it, I sometimes use their port tree repo (github:macports/macports-ports) as a source for benchmarking filesystem operations.

Doesn't dpkg use fsync as well? Have you tried it with it turned off in ZFS?

The slowest part is where it says "Reading database", so I doubt it can be due to fsync'ing. But as mentioned above, the dataset used here already has sync=disabled. Is there another way to "turn fsync off in ZFS"?

belperite commented 5 years ago

sync=disabled should be OK. Does it take the same amount of time no matter how many times the directory is read (so the cache is warm, assuming you've got metadata caching turned on)?
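(Which caching applies is a per-dataset property; a quick check, with an example dataset name:)

# 'all' caches data and metadata in the ARC, 'metadata' caches only metadata
> zfs get primarycache rpool/var-lib-dpkg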


RJVB commented 5 years ago

sync=disabled should be OK.

It has been for years :)

Does it take the same amount of time no matter how many times the directory is read (so the cache is warm, assuming you've got metadata caching turned on)?

No, for me the command is really slow only when it hasn't been executed for a while (since the previous day, for instance).

And that's typical of things like dpkg: you don't use it very often most of the time (but when you do, there's a good chance the packages it installs flush the dpkg database itself out of the cache, if you see what I mean).

RJVB commented 5 years ago

Hmm, I think it's rather up to (a) ZFS dev(s) to show that ZFS is NOT the cause of this issue, despite the fact that it occurs just about only with ZFS?

Mailing lists aren't really suitable for discussions that drag on for years....