charmbracelet / charm

The Charm Tool and Library 🌟
MIT License

FR: Add direct Stat() method for File in FS to Charm Cloud server and fix file size discrepancy #291

Open vt128 opened 4 months ago

vt128 commented 4 months ago

Currently, there's no direct method to get file statistics (Stat()) from the Charm Cloud server without downloading the entire file. This is inefficient, especially for large files. Additionally, there's a discrepancy in file sizes reported by different methods.

Current Behavior

  1. Using Open() and Stat() downloads the entire file before returning file stats, which is slow for large files.
  2. ReadDir() method is faster but doesn't provide a direct way to get stats for a specific file.
  3. File sizes returned by ReadDir() and Open()+Stat() are inconsistent: in the test below, only the Open()+Stat() size matches the actual file size.

Desired Behavior

  1. Implement a direct Stat() method that fetches file statistics from the Charm Cloud server without downloading the file.
  2. Ensure consistency in file sizes reported by all methods.

Code Example

Current approach (slowFileStat, using Open()+Stat()) and a workaround (fastFileStat, using ReadDir()):

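import (
    "io/fs"
    "path"

    cfs "github.com/charmbracelet/charm/fs"
)

// slowFileStat opens the file first; with the Charm FS this downloads
// the whole file before Stat() can return.
func slowFileStat(lsfs fs.FS, name string) (fs.FileInfo, error) {
    f, err := lsfs.Open(name)
    if err != nil {
        return nil, err
    }
    defer f.Close() // nolint:errcheck
    return f.Stat()
}

// fastFileStat lists the parent directory and picks out the matching entry.
// io/fs names are always slash-separated, so use path rather than
// path/filepath (which would use backslashes on Windows).
func fastFileStat(lsfs *cfs.FS, name string) (fs.FileInfo, error) {
    dn := path.Dir(name)
    fn := path.Base(name)
    fos, err := lsfs.ReadDir(dn)
    if err != nil {
        return nil, err
    }
    for _, fo := range fos {
        if fo.Name() == fn {
            return fo.Info()
        }
    }
    return nil, fs.ErrNotExist
}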

Performance Comparison

Testing code:

name = "sub/wtf.mp4"
{
    start := time.Now()
    fi, err := slowFileStat(lsfs, name)
    if err != nil {
        return err
    }
    log.Infow("Read file stat slow", "time_cost", time.Since(start), "path", name, "is_dir", fi.IsDir(), "name", fi.Name(), "size", fi.Size(), "mode", fi.Mode(), "mod_time", fi.ModTime())
}

{
    start := time.Now()
    fi, err := fastFileStat(lsfs, name)
    if err != nil {
        return err
    }
    log.Infow("Read file stat fast", "time_cost", time.Since(start), "path", name, "is_dir", fi.IsDir(), "name", fi.Name(), "size", fi.Size(), "mode", fi.Mode(), "mod_time", fi.ModTime())
}

Output:

2024-07-19T20:45:46.518+0800    INFO    trycm/charm.go:406  Read file stat slow {"pid": 13376, "time_cost": "12.397896964s", "path": "sub/wtf.mp4", "is_dir": false, "name": "wtf.mp4", "size": 13069634, "mode": "-rw-r--r--", "mod_time": "2024-07-19T12:07:35.000Z"}
2024-07-19T20:45:46.860+0800    INFO    trycm/charm.go:415  Read file stat fast {"pid": 13376, "time_cost": "338.782084ms", "path": "sub/wtf.mp4", "is_dir": false, "name": "wtf.mp4", "size": 13072989, "mode": "-rw-r--r--", "mod_time": "2024-07-19T12:07:35.325Z"}

File Size Discrepancy

Actual file size (from local file system):

❯ ls -last wtf.mp4
25528 -rw-r--r--@ 1 dev  staff  13069634 Jan 27 10:40 wtf.mp4

Open()+Stat() reports 13069634 bytes, which matches the actual file size. ReadDir() reports 13072989 bytes, which is 3355 bytes larger than the actual file.

Proposed Solution

  1. Implement a direct Stat() method in the Charm Cloud server that returns file statistics without downloading the file.
  2. Investigate and fix the file size discrepancy to ensure all methods return the correct file size.

Additional Notes

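This enhancement would significantly improve performance for applications that frequently check file metadata without needing the file content. In the test above, the ReadDir()-based lookup took about 339 ms, compared with about 12.4 s for Open()+Stat() on a 13 MB file.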