eza-community / eza

A modern alternative to ls
https://eza.rocks
European Union Public License 1.2

bug: slow performance with many directories with files included #844

Open kiyoon opened 8 months ago

kiyoon commented 8 months ago

With many folders (~5000), eza takes a while to print with icons. It is much slower than exa, and I think it's because eza also checks if the directory is empty.

From the second run onward it is faster, possibly due to filesystem caching.

Possible solutions:

  1. timeout option to detect how long it takes and bypass icons
  2. option to bypass empty directory searching
  3. improve performance on empty directory checks, if possible
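
For reference, here is a minimal sketch of the kind of emptiness check option 3 refers to. It is not eza's actual code; `dir_is_empty` is a hypothetical helper that opens the directory and stops at the first entry, which costs roughly one open plus one directory read per folder.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper (not taken from eza): returns true if `path` is a
/// directory with no entries. `read_dir` skips `.` and `..`, so an empty
/// directory yields no items at all.
fn dir_is_empty(path: &Path) -> io::Result<bool> {
    // Only the first entry is requested; the rest are never read.
    Ok(fs::read_dir(path)?.next().is_none())
}

fn main() -> io::Result<()> {
    // Report whether the current directory is empty.
    let here = Path::new(".");
    println!("{} empty: {}", here.display(), dir_is_empty(here)?);
    Ok(())
}
```

With ~5000 folders this adds a few thousand extra opens and directory reads on top of the metadata calls, so the cost depends heavily on how fast the underlying drive is on a cold cache.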

The files look like this: (screenshot attached in the original issue)

MartinFillon commented 8 months ago

Indeed, the second run is faster due to size caching. Since you don't seem to need to know what's inside the directories, have you tried the -d option, which treats directories as if they were files? It might help with performance. Also, for your information, eza is multithreaded for output calculation, so the slowdown might also be due to your processor not handling that well, or not providing that many threads.

tertsdiepraam commented 8 months ago

Could you maybe specify how long it takes? On my machine, it consistently takes about 100ms, which isn't all too bad. Even for 5000 files, I don't think the check for empty folders is super expensive, so I have a feeling something more is going on, like Martin already suggested. It would be awesome if you could provide some more info, like a flamegraph or a samply profile. Or maybe the first few lines of running strace -c eza --icons=auto. That should allow the devs to diagnose this issue a bit faster.

kiyoon commented 8 months ago

@tertsdiepraam Thanks for the suggestions. With 10164 files (no folders) it took 3m49s. Here's the strace summary for that run:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ------------------
 75.04    0.889321          87     10165           statx
 17.25    0.204413          20     10164     10164 readlink
  6.40    0.075896           7     10164           getcwd
  1.07    0.012675           9      1271           write
  0.12    0.001390          99        14           getdents64
  0.03    0.000408           3       133       125 openat
  0.02    0.000239           9        25           mmap
  0.01    0.000151          10        15           brk
  0.01    0.000137          27         5           mremap
  0.01    0.000130          14         9           mprotect
  0.01    0.000097           8        11           read
  0.00    0.000052           6         8           close
  0.00    0.000044           7         6           ioctl
  0.00    0.000035           5         6           rt_sigaction
  0.00    0.000035           0        77        65 newfstatat
  0.00    0.000024           4         5           munmap
  0.00    0.000024           6         4           pread64
  0.00    0.000012           6         2           prlimit64
  0.00    0.000011           3         3           sigaltstack
  0.00    0.000010          10         1           poll
  0.00    0.000009           9         1           sched_getaffinity
  0.00    0.000007           7         1           set_robust_list
  0.00    0.000007           7         1           rseq
  0.00    0.000006           3         2         1 arch_prctl
  0.00    0.000006           3         2           getrandom
  0.00    0.000005           5         1           set_tid_address
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           execve
------ ----------- ----------- --------- --------- ------------------
100.00    1.185144          36     32098     10356 total

The second run was faster (~1s):

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ------------------
 45.50    0.201222          19     10164     10164 readlink
 29.26    0.129386          12     10165           statx
 17.49    0.077328           7     10164           getcwd
  6.89    0.030484          11      2541           write
  0.46    0.002016         144        14           getdents64
  0.20    0.000897           6       133       125 openat
  0.07    0.000288          11        25           mmap
  0.06    0.000279           3        77        65 newfstatat
  0.03    0.000122          24         5           mremap
  0.02    0.000079           5        15           brk
  0.01    0.000027           2        11           read
  0.01    0.000026           3         8           close
  0.01    0.000026           2         9           mprotect
  0.00    0.000014           3         4           pread64
  0.00    0.000004           4         1           set_robust_list
  0.00    0.000003           3         1           set_tid_address
  0.00    0.000002           1         2         1 arch_prctl
  0.00    0.000002           2         1           rseq
  0.00    0.000000           0         1           poll
  0.00    0.000000           0         5           munmap
  0.00    0.000000           0         6           rt_sigaction
  0.00    0.000000           0         6           ioctl
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         3           sigaltstack
  0.00    0.000000           0         1           sched_getaffinity
  0.00    0.000000           0         2           prlimit64
  0.00    0.000000           0         2           getrandom
------ ----------- ----------- --------- --------- ------------------
100.00    0.442205          13     33368     10356 total
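
A quick sanity check on the first table, assuming the 3m49s wall time and the first strace summary describe the same run: the syscall total is only about 1.19 s. By default `strace -c` summarises system (kernel CPU) time per syscall rather than wall-clock time, so time spent blocked waiting on the drive does not show up; `strace -c -w` summarises wall-clock time instead. The sketch below just does the arithmetic; `syscall_share_percent` is a hypothetical helper name.

```rust
/// Share of the wall-clock time that the strace syscall total accounts for,
/// in percent. Hypothetical helper, used only for this back-of-envelope check.
fn syscall_share_percent(syscall_secs: f64, wall_secs: f64) -> f64 {
    100.0 * syscall_secs / wall_secs
}

fn main() {
    // Numbers from the first run above: the strace "total" row and 3m49s.
    let syscall_secs = 1.185144;
    let wall_secs = 3.0 * 60.0 + 49.0; // 229 seconds

    let share = syscall_share_percent(syscall_secs, wall_secs);
    println!("syscall time is {share:.2}% of wall-clock time");
}
```

Under that assumption, well under 1% of the elapsed time is accounted for in the summary, which suggests the remainder was spent waiting rather than executing syscalls.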

kiyoon commented 8 months ago

Open these with https://profiler.firefox.com

samply profile with eza --icons=always: profile.json

samply profile with exa --icons: profile.json

I made sure to run each of these on a new directory with very similar files, to avoid caching effects.

tertsdiepraam commented 8 months ago

That's great (the info you provided, not the performance 😄)!

3 minutes for a few statx calls definitely seems strange. That's gotta be quite a slow drive then? Nevertheless, I think this ties in with my PR here, which aims to reduce the number of statx calls, but is a WIP: https://github.com/eza-community/eza/pull/833

However, the fact that exa is much faster is still strange, because I think it will do roughly the same number of statx calls. Are you sure that difference was not due to the filesystem cache?

ErrorNoInternet commented 5 months ago

Note that when using eza --icons (same as eza --icons=auto), icons will be disabled if stdout is not a terminal (e.g. eza --icons > /dev/null will turn off icons).
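
To illustrate the behaviour described above, here is a minimal sketch of an "auto" decision based on whether stdout is a terminal. It is not eza's implementation; `IconsMode` and `icons_enabled` are hypothetical names, and only the two modes mentioned in this thread (always and auto) are modelled. `std::io::IsTerminal` is the standard library trait used for the terminal check.

```rust
use std::io::IsTerminal;

/// Hypothetical model of the two icon modes mentioned in this thread.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum IconsMode {
    Always,
    Auto,
}

/// Decide whether icons should be printed. In `Auto` mode the answer
/// depends only on whether stdout is attached to a terminal.
fn icons_enabled(mode: IconsMode, stdout_is_terminal: bool) -> bool {
    match mode {
        IconsMode::Always => true,
        IconsMode::Auto => stdout_is_terminal,
    }
}

fn main() {
    // When this program's output is redirected (e.g. to /dev/null),
    // `is_terminal()` returns false and Auto mode disables icons.
    let is_tty = std::io::stdout().is_terminal();
    println!("auto: {}", icons_enabled(IconsMode::Auto, is_tty));
    println!("always: {}", icons_enabled(IconsMode::Always, is_tty));
}
```

This matters for profiling: a benchmark that redirects output while using --icons or --icons=auto would skip the icon path entirely, so --icons=always is the safer choice when measuring.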

@kiyoon wrote: "With many folders (~5000), eza takes a while to print with icons. It is much slower than exa, and I think it's because eza also checks if the directory is empty."

Depending on the filesystem, this can be optimized. For example, BTRFS reports the size of an empty directory as 0, ZFS reports 2 (because of the . and .. entries, I'm assuming), and XFS reports 6 (the directory entry header, I guess; this might change depending on the on-disk format version and the number of files that have previously existed there). This also seems to work over NFS, but filesystem detection would need to account for that.

diff --git a/src/fs/file.rs b/src/fs/file.rs
index ba2703d9..7e3c31dd 100644
--- a/src/fs/file.rs
+++ b/src/fs/file.rs
@@ -666,6 +666,8 @@ impl<'dir> File<'dir> {
     /// based on directory size alone.
     #[cfg(unix)]
     pub fn is_empty_dir(&self) -> bool {
+        use std::ffi::CString;
+
         if self.is_directory() {
             if self.metadata.nlink() > 2 {
                 // Directories will have a link count of two if they do not have any subdirectories.
@@ -675,6 +677,25 @@ impl<'dir> File<'dir> {
                 // has subdirectories.
                 false
             } else {
+                unsafe {
+                    let Some(str_path) = self.path.to_str() else {
+                        return self.is_empty_directory();
+                    };
+                    let Ok(c_path) = CString::new(str_path) else {
+                        return self.is_empty_directory();
+                    };
+
+                    let mut statfs_res: libc::statfs = std::mem::zeroed();
+                    if libc::statfs(c_path.as_ptr().cast::<libc::c_char>(), &mut statfs_res) == 0 {
+                        match statfs_res.f_type {
+                            0x9123_683e => return self.metadata.size() == 0, // BTRFS
+                            0x2fc1_2fc1 => return self.metadata.size() == 2, // ZFS
+                            0x5846_5342 => return self.metadata.size() == 6, // XFS
+                            _ => (),
+                        };
+                    };
+                };
+
                 self.is_empty_directory()
             }
         } else {

(doesn't work on EXT4 because the directory size seems to be 4096 even with a few files)

Only problem with this patch is that it is ever so slightly slower on non-BTRFS/ZFS/XFS. Maybe users could opt-in with a flag?
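
A rough sketch of how such an opt-in could be structured, with the size heuristic factored into a pure function. The magic numbers and empty sizes are the ones from the patch; everything else (`empty_by_size`, `is_empty_dir`, the `opt_in` flag, and the two closures) is hypothetical and only meant to show that the filesystem lookup is skipped entirely unless the user opts in.

```rust
// Filesystem magic numbers as used in the patch above.
const BTRFS_MAGIC: i64 = 0x9123_683e;
const ZFS_MAGIC: i64 = 0x2fc1_2fc1;
const XFS_MAGIC: i64 = 0x5846_5342;

/// Size-based emptiness guess. Returns None for filesystems where the
/// directory size says nothing useful (e.g. EXT4).
fn empty_by_size(fs_magic: i64, dir_size: u64) -> Option<bool> {
    match fs_magic {
        BTRFS_MAGIC => Some(dir_size == 0),
        ZFS_MAGIC => Some(dir_size == 2),
        XFS_MAGIC => Some(dir_size == 6),
        _ => None,
    }
}

/// Hypothetical wrapper: `fs_magic` stands in for the statfs lookup and
/// `list_dir` for the existing directory-listing check. `fs_magic` is only
/// called when `opt_in` is true, so the default path pays no extra cost.
fn is_empty_dir(
    opt_in: bool,
    dir_size: u64,
    fs_magic: impl FnOnce() -> Option<i64>,
    list_dir: impl FnOnce() -> bool,
) -> bool {
    if opt_in {
        if let Some(answer) = fs_magic().and_then(|m| empty_by_size(m, dir_size)) {
            return answer;
        }
    }
    list_dir()
}

fn main() {
    // Opted in on BTRFS with size 0: answered from the size alone.
    let fast = is_empty_dir(true, 0, || Some(BTRFS_MAGIC), || false);
    // Not opted in: falls through to the listing check.
    let slow = is_empty_dir(false, 0, || Some(BTRFS_MAGIC), || false);
    println!("fast path: {fast}, fallback path: {slow}");
}
```

Keeping the heuristic separate from the syscall also makes it easy to unit-test the per-filesystem rules without touching a real filesystem.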

ref @BrianCArnold from #745