Open kiyoon opened 8 months ago
Indeed, the second run is faster due to size caching. Since you don't seem to need to know what's inside the directories, have you tried the -d option, which handles them as if they were files? It might help with performance. Secondly, for your info, eza is multithreaded for output calculation, so the slowdown might also be due to your processor not handling that well, or not providing that many threads.
Could you maybe specify how long it takes? On my machine, it consistently takes about 100ms, which isn't all too bad. Even for 5000 files, I don't think the check for empty folders is super expensive, so I have a feeling something more is going on, like Martin already suggested. It would be awesome if you could provide some more info, like a flamegraph or a samply profile, or maybe the first few lines of running strace -c eza --icons=auto. That should allow the devs to diagnose this issue a bit faster.
@tertsdiepraam Thanks for the suggestions. With 10164 files (no folders) it took 3m49s. Here's the strace output:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ------------------
 75.04    0.889321          87     10165           statx
 17.25    0.204413          20     10164     10164 readlink
  6.40    0.075896           7     10164           getcwd
  1.07    0.012675           9      1271           write
  0.12    0.001390          99        14           getdents64
  0.03    0.000408           3       133       125 openat
  0.02    0.000239           9        25           mmap
  0.01    0.000151          10        15           brk
  0.01    0.000137          27         5           mremap
  0.01    0.000130          14         9           mprotect
  0.01    0.000097           8        11           read
  0.00    0.000052           6         8           close
  0.00    0.000044           7         6           ioctl
  0.00    0.000035           5         6           rt_sigaction
  0.00    0.000035           0        77        65 newfstatat
  0.00    0.000024           4         5           munmap
  0.00    0.000024           6         4           pread64
  0.00    0.000012           6         2           prlimit64
  0.00    0.000011           3         3           sigaltstack
  0.00    0.000010          10         1           poll
  0.00    0.000009           9         1           sched_getaffinity
  0.00    0.000007           7         1           set_robust_list
  0.00    0.000007           7         1           rseq
  0.00    0.000006           3         2         1 arch_prctl
  0.00    0.000006           3         2           getrandom
  0.00    0.000005           5         1           set_tid_address
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           execve
------ ----------- ----------- --------- --------- ------------------
100.00    1.185144          36     32098     10356 total
The second run was faster (~1s):
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ------------------
 45.50    0.201222          19     10164     10164 readlink
 29.26    0.129386          12     10165           statx
 17.49    0.077328           7     10164           getcwd
  6.89    0.030484          11      2541           write
  0.46    0.002016         144        14           getdents64
  0.20    0.000897           6       133       125 openat
  0.07    0.000288          11        25           mmap
  0.06    0.000279           3        77        65 newfstatat
  0.03    0.000122          24         5           mremap
  0.02    0.000079           5        15           brk
  0.01    0.000027           2        11           read
  0.01    0.000026           3         8           close
  0.01    0.000026           2         9           mprotect
  0.00    0.000014           3         4           pread64
  0.00    0.000004           4         1           set_robust_list
  0.00    0.000003           3         1           set_tid_address
  0.00    0.000002           1         2         1 arch_prctl
  0.00    0.000002           2         1           rseq
  0.00    0.000000           0         1           poll
  0.00    0.000000           0         5           munmap
  0.00    0.000000           0         6           rt_sigaction
  0.00    0.000000           0         6           ioctl
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         3           sigaltstack
  0.00    0.000000           0         1           sched_getaffinity
  0.00    0.000000           0         2           prlimit64
  0.00    0.000000           0         2           getrandom
------ ----------- ----------- --------- --------- ------------------
100.00    0.442205          13     33368     10356 total
Open these with https://profiler.firefox.com
samply profile with eza --icons=always: profile.json
samply profile with exa --icons: profile.json
I made sure I ran those on a new directory with pretty much similar files to avoid caching.
That's great (the info you provided, not the performance 😄)!
3 minutes for a few statx calls definitely seems strange. That's gotta be quite a slow drive then? Nevertheless, I think this ties in with my PR here, which aims to reduce the number of statx calls, but is a WIP: https://github.com/eza-community/eza/pull/833
However, the fact that exa is much faster is still strange, because I think it will do roughly the same number of statx calls. Are you sure that difference was not due to filesystem cache?
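One way to separate "slow drive" from "filesystem cache" is to time the metadata lookups directly, once on a cold directory and once again right after. The sketch below is only an illustration, not code from eza or exa: `time_metadata_calls` is a hypothetical helper that stats each entry of a directory once with `std::fs::symlink_metadata` and sums the time spent in those calls. How the standard library maps that call to a syscall (for example statx) depends on the platform.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{Duration, Instant};

/// Hypothetical helper: stat every entry in `dir` once and return
/// (number of entries, total time spent inside the metadata calls).
pub fn time_metadata_calls(dir: &Path) -> io::Result<(usize, Duration)> {
    let mut count = 0;
    let mut total = Duration::ZERO;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let start = Instant::now();
        // symlink_metadata does not follow symlinks, so a dangling link is still counted.
        fs::symlink_metadata(&path)?;
        total += start.elapsed();
        count += 1;
    }
    Ok((count, total))
}
```

Calling it twice in a row on the same directory and comparing the two durations gives a rough idea of how much of the first run was spent waiting on uncached metadata.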
Note that when using eza --icons (same as eza --icons=auto), icons will be disabled if stdout is not a terminal (e.g. eza --icons > /dev/null will turn off icons).
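The behaviour described here can be sketched with the standard library's `IsTerminal` trait. This is a minimal illustration, not eza's actual implementation: the `IconsMode` enum and both function names are made up for the example, and the `Never` variant is an assumption that is not mentioned in this thread.

```rust
use std::io::IsTerminal;

/// Hypothetical stand-in for the values of `--icons`; not eza's real type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IconsMode {
    Always,
    Auto,
    Never, // assumed for completeness
}

/// Decide whether icons are printed, given the mode and whether stdout is a terminal.
pub fn icons_enabled(mode: IconsMode, stdout_is_tty: bool) -> bool {
    match mode {
        IconsMode::Always => true,
        IconsMode::Never => false,
        // `auto` follows the terminal check, so redirecting to /dev/null disables icons.
        IconsMode::Auto => stdout_is_tty,
    }
}

/// Convenience wrapper that queries the real stdout of the current process.
pub fn icons_enabled_for_stdout(mode: IconsMode) -> bool {
    icons_enabled(mode, std::io::stdout().is_terminal())
}
```

This is why a benchmark or profile that redirects output should pass --icons=always explicitly, as the samply run above does.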
With many folders (~5000), eza takes a while to print with icons. It is much slower than exa, and I think it's because eza also checks if the directory is empty.
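For reference, the kind of check being discussed can be sketched as follows. This is a standalone approximation based on the link-count comment visible in the diff further down, not a copy of eza's code: a directory whose link count is above two must contain a subdirectory, and anything else falls back to actually opening the directory, which is the expensive part.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Sketch of an "is this directory empty?" check (Unix only).
pub fn is_empty_dir(path: &Path) -> io::Result<bool> {
    let meta = fs::symlink_metadata(path)?;
    if !meta.is_dir() {
        return Ok(false);
    }
    // More than two links means at least one subdirectory exists,
    // so the directory cannot be empty and no read is needed.
    if meta.nlink() > 2 {
        return Ok(false);
    }
    // Otherwise the directory has to be opened and its first entry read.
    // read_dir skips `.` and `..`, so no entry means empty.
    Ok(fs::read_dir(path)?.next().is_none())
}
```

With thousands of directories that hold only regular files, every one of them takes the fallback path, which costs an open, a directory read, and a close per directory.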
Depending on the filesystem, this can be optimized. For example, BTRFS reports the size of a directory as 0 if it is empty, ZFS reports 2 (because of the . and .. entries, I'm assuming), and XFS reports 6 (directory entry header, I guess? This might change depending on the on-disk format version and the number of files that have previously existed there). This also seems to work over NFS, but filesystem detection would need to account for that.
diff --git a/src/fs/file.rs b/src/fs/file.rs
index ba2703d9..7e3c31dd 100644
--- a/src/fs/file.rs
+++ b/src/fs/file.rs
@@ -666,6 +666,8 @@ impl<'dir> File<'dir> {
     /// based on directory size alone.
     #[cfg(unix)]
     pub fn is_empty_dir(&self) -> bool {
+        use std::ffi::CString;
+
         if self.is_directory() {
             if self.metadata.nlink() > 2 {
                 // Directories will have a link count of two if they do not have any subdirectories.
@@ -675,6 +677,25 @@ impl<'dir> File<'dir> {
                 // has subdirectories.
                 false
             } else {
+                unsafe {
+                    let Some(str_path) = self.path.to_str() else {
+                        return self.is_empty_directory();
+                    };
+                    let Ok(c_path) = CString::new(str_path) else {
+                        return self.is_empty_directory();
+                    };
+
+                    let mut statfs_res: libc::statfs = std::mem::zeroed();
+                    if libc::statfs(c_path.as_ptr().cast::<libc::c_char>(), &mut statfs_res) == 0 {
+                        match statfs_res.f_type {
+                            0x9123_683e => return self.metadata.size() == 0, // BTRFS
+                            0x2fc1_2fc1 => return self.metadata.size() == 2, // ZFS
+                            0x5846_5342 => return self.metadata.size() == 6, // XFS
+                            _ => (),
+                        };
+                    };
+                };
+
                 self.is_empty_directory()
             }
         } else {
(doesn't work on EXT4 because the directory size seems to be 4096 even with a few files)
The only problem with this patch is that it is ever so slightly slower on filesystems other than BTRFS/ZFS/XFS. Maybe users could opt in with a flag?
ref @BrianCArnold from #745
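The filesystem-specific part of the patch above can also be written as a pure lookup, which keeps the `unsafe` statfs call separate from the decision logic and makes the size table easy to test. The following is a sketch under assumptions: the function and constant names are invented here, `f_type` is taken as `i64` (its real type in `libc::statfs` varies by platform), and the sizes are the ones observed in this thread rather than documented guarantees.

```rust
// `f_type` magic numbers as used in the patch above.
const BTRFS_MAGIC: i64 = 0x9123_683e;
const ZFS_MAGIC: i64 = 0x2fc1_2fc1;
const XFS_MAGIC: i64 = 0x5846_5342;

/// Size an empty directory is expected to report on a given filesystem,
/// or `None` if the size says nothing about emptiness (e.g. EXT4).
pub fn empty_dir_size(f_type: i64) -> Option<u64> {
    match f_type {
        BTRFS_MAGIC => Some(0),
        ZFS_MAGIC => Some(2),
        XFS_MAGIC => Some(6),
        _ => None,
    }
}

/// `Some(true/false)` when the size alone decides emptiness,
/// `None` when the caller has to fall back to reading the directory.
pub fn is_empty_by_size(f_type: i64, size: u64) -> Option<bool> {
    empty_dir_size(f_type).map(|expected| size == expected)
}
```

A caller would run statfs once, pass its `f_type` and the directory's size here, and only open the directory when the result is `None`. An opt-in flag, as suggested above, could simply skip this lookup when it is not set.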
eza --version: 0.18.3
eza --icons auto
With many folders (~5000), eza takes a while to print with icons. It is much slower than exa, and I think it's because eza also checks if the directory is empty. From the second run on, it runs faster, maybe due to file system caching.
Possible solutions:
The files look like this: