caddyserver / caddy

Fast and extensible multi-platform HTTP/1-2-3 web server with automatic HTTPS
https://caddyserver.com
Apache License 2.0
55.45k stars 3.91k forks source link

[file_server] Incorrect file size calculation #6409

Open IgorKha opened 1 week ago

IgorKha commented 1 week ago

hello! I noticed incorrect file size calculation on the web page, and it would be cool if symbolic links could be seen more explicitly

My config:

:80 {
    root * /mnt/local_share/
    file_server browse
}
user@runner:/mnt/local_share/napi-dev/napi-rk3308b-s$ tree --du -h
[627M]  .
├── [  94]  napi-rk3308b-s-latest-dev.img.xz -> /mnt/local_share/napi-dev/napi-rk3308b-s/nnz-napi-image-dev-napi-rk3308b-s-dev-ba99a1e5.img.xz
├── [ 101]  napi-rk3308b-s-latest-dev.img.xz.sha256 -> /mnt/local_share/napi-dev/napi-rk3308b-s/nnz-napi-image-dev-napi-rk3308b-s-dev-ba99a1e5.img.xz.sha256
├── [  96]  napi-rk3308b-s-latest-dev.manifest -> /mnt/local_share/napi-dev/napi-rk3308b-s/nnz-napi-image-dev-napi-rk3308b-s-dev-ba99a1e5.manifest
├── [  91]  napi-rk3308b-s-latest-dev.swu -> /mnt/local_share/napi-dev/napi-rk3308b-s/nnz-napi-image-dev-napi-rk3308b-s-dev-ba99a1e5.swu
├── [  98]  napi-rk3308b-s-latest-dev.swu.sha256 -> /mnt/local_share/napi-dev/napi-rk3308b-s/nnz-napi-image-dev-napi-rk3308b-s-dev-ba99a1e5.swu.sha256
├── [357M]  nnz-napi-image-dev-napi-rk3308b-s-dev-ba99a1e5.img.xz
├── [ 120]  nnz-napi-image-dev-napi-rk3308b-s-dev-ba99a1e5.img.xz.sha256
├── [116K]  nnz-napi-image-dev-napi-rk3308b-s-dev-ba99a1e5.manifest
├── [270M]  nnz-napi-image-dev-napi-rk3308b-s-dev-ba99a1e5.swu
└── [ 117]  nnz-napi-image-dev-napi-rk3308b-s-dev-ba99a1e5.swu.sha256

 627M used in 1 directory, 10 files
Screenshot 2024-06-19 at 19 03 31

Thank you!

mholt commented 1 week ago

Thanks for the report; indeed, that is a bit confusing.

steffenbusch commented 1 week ago

Hmm, but which size should be shown under size for symlinks? Currently, if the symlink target exists, the size of the target is shown and that size is considered for the total size in the header. If the symlink target does not exist, the size of the symlink itself is shown.

I use a custom browse template, which does not reveal that a file is a symlink (that arrow), and therefore I prefer the size of the symlink target. If someone clicks on all files in the directory listing, the downloads combined will be the total size from the header because clicking on a symlink will download the target file. From that perspective, the size, and total size, makes sense. Furthermore, if a symlink has a known extension such as .gif, the symlink-arrow will not be shown in the icon and the user might not understand why the picture "sunset.png" only has 14 bytes.

On the other hand, I understand the confusion.

I would suggest a configurable option, such as use_symlink_size to enable the use of the symlink size instead of the symlink-target size:

    browse [<template_file>] {
        reveal_symlinks
        use_symlink_size
    }
mholt commented 1 week ago

Yeah, good questions.

I'm actually inclined to think that we should show the real size of the symlink rather than what it points to.

As for use_symlink_size, might I suggest a more general subdirective like follow_symlinks? Because the ModTime could also be relevant here. (I am not sure that it necessarily requires reveal_symlinks though.)

IgorKha commented 1 week ago

I think it makes sense for symbolic links to indicate the actual size of the file they point to, but in that case, it's necessary to somehow more clearly indicate that it is a symbolic link. My main point is to correctly calculate the total size; in my example, it shows 1.2GB even though it's actually half that. Unfortunately, I don't quite understand how this works "under the hood," but if I were to imagine, on a web page, I would add some attribute to symbolic links to understand that it is a symbolic link and not include it when calculating the total folder size.

Screenshot 2024-06-20 at 02 09 42
IgorKha commented 1 week ago

I think it would be appropriate for a symbolic link to indicate its actual creation time, as well as the size of the file it points to (with an optional display of the actual size of the link itself, although I’m not sure if that’s generally useful to anyone). To avoid confusion, it is crucial to calculate the actual size of the directory.

Example: if we have a directory with two files, one of which is file.img at 500MB and a symbolic link to it, the directory should be reported as 500MB, not 1GB as it currently is.

francislavoie commented 1 week ago

I think we can just do this?

diff --git a/modules/caddyhttp/fileserver/browsetplcontext.go b/modules/caddyhttp/fileserver/browsetplcontext.go
index 74c14608..47e3f146 100644
--- a/modules/caddyhttp/fileserver/browsetplcontext.go
+++ b/modules/caddyhttp/fileserver/browsetplcontext.go
@@ -80,6 +80,12 @@ func (fsrv *FileServer) directoryListing(ctx context.Context, fileSystem fs.FS,
                }

                size := info.Size()
+
+               // increase the total by the symlink's size, not the target's size.
+               if !isDir {
+                       tplCtx.TotalFileSize += size
+               }
+
                fileIsSymlink := isSymlink(info)
                symlinkPath := ""
                if fileIsSymlink {
@@ -102,10 +108,6 @@ func (fsrv *FileServer) directoryListing(ctx context.Context, fileSystem fs.FS,
                        // was already set above.
                }

-               if !isDir {
-                       tplCtx.TotalFileSize += size
-               }
-
                u := url.URL{Path: "./" + name} // prepend with "./" to fix paths with ':' in the name

                tplCtx.Items = append(tplCtx.Items, fileInfo{

Moving the increment of the total to be earlier, before we follow the symlink.

I don't think we need any config for this, seems obvious to me that the total should always just be the actual physical size of this directory and not include the amounts of linked files.

steffenbusch commented 1 week ago

Why should it show to visitor of a directory listing, that there are two files, both are reported with 500MB each, that the total size is not 1 GB, but 500MB.

Another visitor might be confused that two files of 500MB each are just 500MB in total. Especially for users who don't care how the server operator have provided these files internally - some might not even know what a symbolic link is.

I think that this coin has two sides.

francislavoie commented 1 week ago

How's this?

image

diff --git a/modules/caddyhttp/fileserver/browse.html b/modules/caddyhttp/fileserver/browse.html
index 7b0df1e5..6dece435 100644
--- a/modules/caddyhttp/fileserver/browse.html
+++ b/modules/caddyhttp/fileserver/browse.html
@@ -980,7 +980,7 @@ footer {
                                                        <div class="sizebar">
                                                                <div class="sizebar-bar"></div>
                                                                <div class="sizebar-text">
-                                                                       {{.HumanSize}}
+                                                                       {{if .IsSymlink}}↱ {{end}}{{.HumanSize}}
                                                                </div>
                                                        </div>
                                                </td>
mholt commented 1 week ago

Oh that's a neat idea. I like that.