webdesus / fs_extra

Expanding opportunities standard library std::fs and std::io
MIT License
297 stars 47 forks source link

`dir::get_size` returns wrong size for symlinks #59

Closed edmorley closed 2 years ago

edmorley commented 2 years ago

Summary

If the directory contains symlinks, then dir::get_size follows the symlinks and counts the sizes of those files too. If the targets of those symlinks are inside that same directory structure, then it effectively means the file sizes of some files are counted twice. Or if the targets of those symlinks are not inside that directory structure, then it means the returned size isn't really that of the requested directory, but also includes that of arbitrary other locations on the filesystem, which doesn't seem desirable.

Steps to reproduce

  1. mkdir test-dir
  2. truncate -s 1M test-dir/large-file
  3. ln -sr test-dir/large-file test-dir/symlink-to-large-file
  4. Run the following (using fs_extra v1.2.0):
    fn main() {
       let size = fs_extra::dir::get_size("test-dir").expect("IO error calculating size");
       println!("size = {size}");
    }

Expected

fs_extra::dir::get_size() returns the same size as ls would, eg:

ls -al test-dir/
total 0
drwxr-xr-x 4 emorley staff     128 Jul  1 11:42 .
drwxr-xr-x 9 emorley staff     288 Jul  1 11:42 ..
-rw-r--r-- 1 emorley staff 1048576 Jul  1 11:41 large-file
lrwxr-xr-x 1 emorley staff      10 Jul  1 11:42 symlink-to-large-file -> large-file

...so 1048576 + 10 (for the symlink) + 128 (for the directory itself) = 1048714

ie something like:

$ cargo run -q
size = 1048714

Actual

fs_extra::dir::get_size() follows the symlink, so counts the size of that file twice. It also doesn't seem to count the size of the directory entry itself.

$ cargo run -q
size = 2097152

Notes

I believe the cause is that fs_extra::dir::get_size() uses std::fs::metadata not std::fs::symlink_metadata, and that the former follows symlinks. Also, use of std::fs::metadata is actually unnecessary since the std::fs::DirEntry returned by std::fs::read_dir already has the metadata we need.

edmorley commented 2 years ago

Opened #60 with a fix.