Byron / dua-cli

View disk space usage and delete unwanted data, fast.
https://lib.rs/crates/dua-cli
MIT License

Huge performance difference between dua v2.10.7 and v2.11.0 #79

Closed (c02y closed 3 years ago)

c02y commented 3 years ago

I installed v2.10.7 from the Manjaro repo, since it is the latest version in the Manjaro testing branch. v2.11.0 was just released, so I tested it and noticed that it takes much more time than v2.10.7:

$ time dua  /*
      0  B /dev
      0  B /lost+found
      0  B /mnt
      0  B /proc  <142 IO Errors>
      0  B /root
      0  B /srv
      0  B /sys
   4.10 KB /snap
   8.19 KB /rootfs-pkgs.txt
  24.58 KB /desktopfs-pkgs.txt
   1.99 MB /run
  17.85 MB /etc
 105.83 MB /boot
 178.76 MB /tmp
   1.12 GB /opt
   1.32 GB /sbin
   1.32 GB /bin
   6.80 GB /lib
   6.80 GB /lib64
   6.83 GB /var
  14.57 GB /usr
  18.25 GB /swapfile
 584.88 GB /home
 642.19 GB total  <142 IO Errors>

________________________________________________________
Executed in  872.79 millis    fish           external
   usr time    2.69 secs  444.00 micros    2.69 secs
   sys time    7.85 secs   60.00 micros    7.85 secs

$ time ./dua-v2.11.0-x86_64-unknown-linux-musl/dua  /*
      0  B /dev
      0  B /lost+found
      0  B /mnt
      0  B /proc  <67 IO Errors>
      0  B /root
      0  B /srv
      0  B /sys
   4.10 KB /snap
   8.19 KB /rootfs-pkgs.txt
  24.58 KB /desktopfs-pkgs.txt
   1.99 MB /run
  17.85 MB /etc
 105.83 MB /boot
 178.76 MB /tmp
   1.12 GB /opt
   1.32 GB /sbin
   1.32 GB /bin
   6.80 GB /lib
   6.80 GB /lib64
   6.83 GB /var
  14.57 GB /usr
  18.25 GB /swapfile
 584.88 GB /home
 642.19 GB total  <67 IO Errors>

________________________________________________________
Executed in    3.28 secs   fish           external
   usr time   25.38 secs  438.00 micros   25.38 secs
   sys time   16.32 secs   56.00 micros   16.32 secs

$ hyperfine "./dua-v2.11.0-x86_64-unknown-linux-musl/dua /*" "dua /*" -i
Benchmark #1: ./dua-v2.11.0-x86_64-unknown-linux-musl/dua /*
  Time (mean ± σ):      3.530 s ±  0.131 s    [User: 25.066 s, System: 18.070 s]
  Range (min … max):    3.364 s …  3.703 s    10 runs

  Warning: Ignoring non-zero exit code.

Benchmark #2: dua /*
  Time (mean ± σ):     977.0 ms ±  19.1 ms    [User: 2.933 s, System: 8.668 s]
  Range (min … max):   954.6 ms … 1008.0 ms    10 runs

  Warning: Ignoring non-zero exit code.

Summary
  'dua /*' ran
    3.61 ± 0.15 times faster than './dua-v2.11.0-x86_64-unknown-linux-musl/dua /*'

I also tested v2.11.0 and v2.10.10:

$ hyperfine "./dua-v2.11.0-x86_64-unknown-linux-musl/dua /*" "./dua-v2.10.10-x86_64-unknown-linux-musl/dua /*" -i
Benchmark #1: ./dua-v2.11.0-x86_64-unknown-linux-musl/dua /*
  Time (mean ± σ):      3.701 s ±  0.361 s    [User: 26.297 s, System: 19.272 s]
  Range (min … max):    3.362 s …  4.357 s    10 runs

  Warning: Ignoring non-zero exit code.

Benchmark #2: ./dua-v2.10.10-x86_64-unknown-linux-musl/dua /*
  Time (mean ± σ):      4.242 s ±  0.056 s    [User: 29.691 s, System: 22.055 s]
  Range (min … max):    4.175 s …  4.311 s    10 runs

  Warning: Ignoring non-zero exit code.

Summary
  './dua-v2.11.0-x86_64-unknown-linux-musl/dua /*' ran
    1.15 ± 0.11 times faster than './dua-v2.10.10-x86_64-unknown-linux-musl/dua /*'

I don't know if you have noticed this issue.

Byron commented 3 years ago

Thanks a lot for letting me know. This is new to me, and I will see if I can reproduce it, even though I am now on very different hardware with different performance characteristics.

If so, that should be an easy fix.

Looking at the history, there doesn't seem to be an obvious candidate for a regression, but it's nothing that a bisect can't find.

[Screenshot 2021-02-17 at 10 30 40]

Byron commented 3 years ago

Here is what I am getting:

$ hyperfine './dua-2.10.7 /Applications/*' './dua-2.11.0 /Applications/*'
Benchmark #1: ./dua-2.10.7 /Applications/*
  Time (mean ± σ):      1.068 s ±  0.201 s    [User: 749.0 ms, System: 4022.0 ms]
  Range (min … max):    0.998 s …  1.641 s    10 runs

  Warning: The first benchmarking run for this command was significantly slower than the rest (1.641 s). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.

Benchmark #2: ./dua-2.11.0 /Applications/*
  Time (mean ± σ):      1.060 s ±  0.149 s    [User: 747.6 ms, System: 4040.9 ms]
  Range (min … max):    1.001 s …  1.483 s    10 runs

  Warning: The first benchmarking run for this command was significantly slower than the rest (1.483 s). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.

Summary
  './dua-2.11.0 /Applications/*' ran
    1.01 ± 0.24 times faster than './dua-2.10.7 /Applications/*'

Thus, for now the issue can't be reproduced, at least when running on macOS. I did update dependencies before running the latest version, but that didn't have any effect on performance.

Byron commented 3 years ago

Maybe you can run a git bisect with hyperfine to pinpoint the issue.
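
As a rough sketch of how such a bisect could be automated: `git bisect run` accepts a script whose exit code marks the current commit as good (0), bad (1 to 127, except 125) or untestable (125). The script below builds the checked-out commit, times a command with hyperfine and compares the mean against a threshold. The file name `bisect_perf.py`, the threshold, the benchmarked path and the assumption that `cargo build --release` leaves the binary at `./target/release/dua` are all illustrative and not taken from this thread.

```python
import json
import os
import subprocess
import sys
import tempfile

# Exit codes that `git bisect run` interprets.
GOOD, BAD, SKIP = 0, 1, 125


def mean_seconds(export_json):
    """Extract the mean wall-clock time from hyperfine's --export-json output."""
    return json.loads(export_json)["results"][0]["mean"]


def verdict(mean, threshold):
    """Map a measured mean runtime (seconds) to a bisect exit code."""
    return GOOD if mean <= threshold else BAD


def main(argv):
    threshold = float(argv[0])    # seconds; a hypothetical cut-off
    command = " ".join(argv[1:])  # e.g. "./target/release/dua /usr"
    # Assumption: a release build of the checked-out commit is what gets timed.
    if subprocess.run(["cargo", "build", "--release"]).returncode != 0:
        return SKIP               # commit does not build, let bisect skip it
    with tempfile.TemporaryDirectory() as tmp:
        report = os.path.join(tmp, "hyperfine.json")
        # -i ignores non-zero exit codes, as in the runs in this thread.
        bench = subprocess.run(
            ["hyperfine", "-i", "--warmup", "1", "--export-json", report, command]
        )
        if bench.returncode != 0:
            return SKIP
        with open(report) as handle:
            return verdict(mean_seconds(handle.read()), threshold)


# Only run when a threshold and a command were passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 2:
    sys.exit(main(sys.argv[1:]))
```

It could be driven with `git bisect start v2.11.0 v2.10.7` followed by `git bisect run python3 bisect_perf.py 2.0 ./target/release/dua /usr`, assuming those tags exist in the repository and that 2.0 s separates fast from slow on the machine in question.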

c02y commented 3 years ago

It is weird: v2.10.7 installed from the Manjaro repo is much faster than the binary of the same version downloaded from the GitHub release page:

$ hyperfine "./dua-v2.10.7-x86_64-unknown-linux-musl/dua /*" "./dua-v2.11.0-x86_64-unknown-linux-musl/dua /*" "dua /*" -i
Benchmark #1: ./dua-v2.10.7-x86_64-unknown-linux-musl/dua /*
  Time (mean ± σ):      3.471 s ±  0.049 s    [User: 24.428 s, System: 19.024 s]
  Range (min … max):    3.394 s …  3.525 s    10 runs

  Warning: Ignoring non-zero exit code.

Benchmark #2: ./dua-v2.11.0-x86_64-unknown-linux-musl/dua /*
  Time (mean ± σ):      3.264 s ±  0.045 s    [User: 22.855 s, System: 17.679 s]
  Range (min … max):    3.180 s …  3.316 s    10 runs

  Warning: Ignoring non-zero exit code.

Benchmark #3: dua /*
  Time (mean ± σ):     897.7 ms ±  37.9 ms    [User: 2.696 s, System: 8.462 s]
  Range (min … max):   848.1 ms … 961.5 ms    10 runs

  Warning: Ignoring non-zero exit code.

Summary
  'dua /*' ran
    3.64 ± 0.16 times faster than './dua-v2.11.0-x86_64-unknown-linux-musl/dua /*'
    3.87 ± 0.17 times faster than './dua-v2.10.7-x86_64-unknown-linux-musl/dua /*'

Byron commented 3 years ago

It's odd indeed. Maybe the one in Manjaro is built with processor-specific optimisations enabled, whereas the one on GitHub is generic, assuming no particular processor type or features.

However, dua is clearly bound by syscalls and not by user-space CPU, so that shouldn't make much of a difference.

Maybe try to compile it on your machine to get more samples. If it's slow(er), it's probably something done by the packagers.
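
One way to check the syscall-bound claim against the numbers already posted is to look at how the CPU time splits between user and system. The sketch below plugs in the means from the first hyperfine comparison in this thread (977.0 ms for the fast `dua`, 3.530 s for the musl release binary); the function name and dictionary keys are made up for illustration.

```python
def cpu_profile(wall, user, system):
    """Summarise one hyperfine result; all arguments are in seconds."""
    cpu = user + system
    return {
        "cores_busy": cpu / wall,      # average number of cores kept busy
        "system_share": system / cpu,  # fraction of CPU time spent in the kernel
    }


# Means copied from the first hyperfine comparison in this thread.
fast = cpu_profile(wall=0.977, user=2.933, system=8.668)    # `dua /*`
slow = cpu_profile(wall=3.530, user=25.066, system=18.070)  # musl release binary

for name, profile in (("fast", fast), ("slow", slow)):
    print(f"{name}: {profile['cores_busy']:.1f} cores busy, "
          f"{profile['system_share']:.0%} of CPU time in the kernel")
```

With these inputs the fast binary spends roughly three quarters of its CPU time in the kernel, which fits the syscall-bound description. The slow one spends more than half of its CPU time in user space: its user time is about 8.5 times higher, its system time only about 2 times higher. That suggests the gap comes mostly from user-space work, although the numbers alone do not say why.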

c02y commented 3 years ago

Sorry about one thing: I might have used the wrong dua from my PATH when I said it was installed from the Manjaro repo. I found dua binaries in both /bin/ and ~/.cargo/bin (~/.cargo/bin/dua takes precedence), and I installed and uninstalled the dua-cli package from the Manjaro repo multiple times, so I don't know which dua was used when I ran the test. Here is the data again with more details:

>> hyperfine "/bin/dua /*" "~/.cargo/bin/dua /*" "~/dua-v2.11.0-x86_64-unknown-linux-musl/dua /*" -i
Benchmark #1: /bin/dua /*
  Time (mean ± σ):     886.4 ms ±  33.8 ms    [User: 2.703 s, System: 8.390 s]
  Range (min … max):   853.6 ms … 951.4 ms    10 runs

  Warning: Ignoring non-zero exit code.

Benchmark #2: ~/.cargo/bin/dua /*
  Time (mean ± σ):     893.5 ms ±  19.4 ms    [User: 2.693 s, System: 8.534 s]
  Range (min … max):   862.7 ms … 919.4 ms    10 runs

  Warning: Ignoring non-zero exit code.

Benchmark #3: ~/dua-v2.11.0-x86_64-unknown-linux-musl/dua /*
  Time (mean ± σ):      3.920 s ±  0.062 s    [User: 26.043 s, System: 21.297 s]
  Range (min … max):    3.850 s …  4.008 s    10 runs

  Warning: Ignoring non-zero exit code.

Summary
  '/bin/dua /*' ran
    1.01 ± 0.04 times faster than '~/.cargo/bin/dua /*'
    4.42 ± 0.18 times faster than '~/dua-v2.11.0-x86_64-unknown-linux-musl/dua /*'

>> /bin/dua v2.10.7

>> ~/.cargo/bin/dua --version
dua 2.11.0
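
To rule out this kind of mix-up, every `dua` on the PATH can be listed in lookup order; the first hit is what a bare `dua` runs. A minimal sketch follows (the helper name `find_all_on_path` is made up here; in a shell, `type -a dua` or `which -a dua` gives a similar listing):

```python
import os


def find_all_on_path(name, path=None):
    """Return every executable called `name`, in PATH lookup order."""
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # simplification: empty PATH entries are ignored here
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches


# Print each hit with its resolved target, so symlinked directories
# that point at the same file show up as duplicates.
for hit in find_all_on_path("dua"):
    print(hit, "->", os.path.realpath(hit))
```

Benchmarking with absolute paths, as in the hyperfine call above, avoids the ambiguity altogether.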

>> file /bin/dua ~/.cargo/bin/dua ~/dua-v2.11.0-x86_64-unknown-linux-musl/dua
/bin/dua:                                            ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=1b866af359e74f05b94b43356052b37779ea508c, for GNU/Linux 3.2.0, stripped
/home/chz/.cargo/bin/dua:                            ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=3f0a23a1481aedb2fffb5deb1ba41d3273b54d92, for GNU/Linux 3.2.0, with debug_info, not stripped
/home/chz/dua-v2.11.0-x86_64-unknown-linux-musl/dua: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped

I'll just use the cargo version then.

I don't fully understand the differences in the file command's output for the three binaries, though.
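
The relevant parts of that `file` output can be pulled apart mechanically. The sketch below (function name and dictionary keys are illustrative) classifies each description by linkage, PIE, and whether symbols or debug info were kept; the three strings are copied from the output above.

```python
def summarize(description):
    """Pick the linkage and symbol attributes out of one `file` description."""
    if "statically linked" in description:
        linking = "static"
    elif "dynamically linked" in description:
        linking = "dynamic"
    else:
        linking = "unknown"
    return {
        "linking": linking,
        "pie": "pie executable" in description,
        # "not stripped" also contains "stripped", so exclude it explicitly.
        "stripped": "stripped" in description and "not stripped" not in description,
        "debug_info": "with debug_info" in description,
    }


system_dua = ("ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), "
              "dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, "
              "BuildID[sha1]=1b866af359e74f05b94b43356052b37779ea508c, "
              "for GNU/Linux 3.2.0, stripped")
cargo_dua = ("ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), "
             "dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, "
             "BuildID[sha1]=3f0a23a1481aedb2fffb5deb1ba41d3273b54d92, "
             "for GNU/Linux 3.2.0, with debug_info, not stripped")
musl_dua = ("ELF 64-bit LSB executable, x86-64, version 1 (SYSV), "
            "statically linked, stripped")

for label, text in (("/bin/dua", system_dua),
                    ("~/.cargo/bin/dua", cargo_dua),
                    ("musl release", musl_dua)):
    print(label, summarize(text))
```

Both fast binaries are dynamically linked and use `/lib64/ld-linux-x86-64.so.2`, the glibc loader, while the slow release binary is a statically linked musl build, as its directory name says. `stripped` and `with debug_info` only describe whether symbol and debug sections were kept in the file and normally do not affect run time. So linkage and C library are the visible difference between the fast and slow binaries; this thread does not establish whether that is the cause.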