xoofx / zio

A cross-platform abstract/virtual filesystem framework with many built-in filesystems for .NET
BSD 2-Clause "Simplified" License

Copying between two SubFileSystem(PhysicalFileSystem) is slow #90

Closed by agocke 4 months ago

agocke commented 4 months ago

The basic problem here is that CopyFileCross only has a special case for identical file systems, i.e.

        // If this is the same filesystem, use the file system directly to perform the action
        if (fs == destFileSystem)
        {
            fs.CopyFile(srcPath, destPath, overwrite);
            return;
        }

That's a big speed-up because letting the file system perform the copy directly is much faster than copying the data through C# code. Ideally, I think CopyFileCross would also use the underlying FS operations when two SubFileSystems/ComposeFileSystems sit on top of the same file system.

That said, I'm not quite sure how to architect this safely. Exposing the underlying file system from SubFileSystem seems dangerous. Open to ideas. The only one I had was to mark ComposeFileSystem.Fallback and ComposeFileSystem.ConvertPathToDelegate as protected internal, and then use the internal access to decompose and compare base file systems.

agocke commented 4 months ago

Actually, one more option. We could add a new set of methods

interface IFileSystem
{
        // Same as existing extension method
        void CopyFileCross(UPath srcPath, IFileSystem destFileSystem, UPath destPath, bool overwrite, bool copyAttributes);

        /// Tries to convert the given srcPath to a path on the given destination file system. Returns
        /// true if successful, with destPath containing the result. Otherwise, returns false and destPath is undefined.
        bool TryConvertPath(UPath srcPath, IFileSystem destFs, out UPath destPath);
}

The implementation of CopyFileCross would have the source file system reduce to the lowest common file system, and then try to convert destPath on destFileSystem to a path on that file system. If that succeeds, it can call CopyFile directly instead.
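To make the TryConvertPath idea concrete, here is a minimal sketch using stand-in types rather than the real Zio API. Everything in it is an assumption for illustration: `IConvertibleFs`, `RootFs`, and `PrefixFs` are hypothetical names, and plain strings replace `UPath`. A wrapper file system answers for itself or forwards the prefixed path to its inner file system.

```csharp
using System;

// Hypothetical stand-in for the proposed interface member (not the Zio API).
interface IConvertibleFs
{
    // Tries to express srcPath (a path on this file system) as a path on destFs.
    bool TryConvertPath(string srcPath, IConvertibleFs destFs, out string destPath);
}

// A bottom-level file system: a path converts only to itself.
sealed class RootFs : IConvertibleFs
{
    public bool TryConvertPath(string srcPath, IConvertibleFs destFs, out string destPath)
    {
        destPath = srcPath;
        return object.ReferenceEquals(this, destFs);
    }
}

// A wrapper that maps its root onto a sub-directory of an inner file system,
// loosely modelled on what a sub file system does.
sealed class PrefixFs : IConvertibleFs
{
    private readonly IConvertibleFs _inner;
    private readonly string _prefix;

    public PrefixFs(IConvertibleFs inner, string prefix)
    {
        _inner = inner;
        _prefix = prefix.TrimEnd('/');
    }

    public bool TryConvertPath(string srcPath, IConvertibleFs destFs, out string destPath)
    {
        if (object.ReferenceEquals(this, destFs))
        {
            destPath = srcPath;
            return true;
        }
        // Not the target: prepend our prefix and ask the inner file system.
        return _inner.TryConvertPath(_prefix + srcPath, destFs, out destPath);
    }
}
```

With `var disk = new RootFs(); var a = new PrefixFs(disk, "/a");`, calling `a.TryConvertPath("/x.txt", disk, out var p)` returns true with `p` equal to `"/a/x.txt"`, while converting to an unrelated `RootFs` returns false.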

xoofx commented 4 months ago

Good question. 🙂

For performance reasons when scanning a large filesystem (I think that was for lunet), I introduced IFileSystem.EnumerateItems some time ago, which returns a struct FileSystemItem.

The idea is that it resolves the path to the final filesystem, so that when we want to access the item we don't need to go down the full cascade of filesystems again (I have quite a few in lunet).

So I'm thinking that introducing a kind of IFileSystem.ResolvePath (one that would also work for a non-existing path) that returns the final FS and the path relative to it could be reused in higher-level scenarios (e.g. CopyFileCross).

xoofx commented 4 months ago

That would give something like this:

var (srcPathResolved, srcFsResolved) = srcFs.ResolvePath(srcPath);
var (dstPathResolved, dstFsResolved) = dstFs.ResolvePath(destPath);
if (srcFsResolved == dstFsResolved) {
 // perform direct copy
}

Unlike ConvertPathToDelegate, which returns only a path, it would return a pair (Fs, PathWithinFs).
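As a rough illustration of how such a ResolvePath could feed a fast path in a cross-filesystem copy, here is a self-contained sketch. It does not use the real Zio types: `IResolvableFs`, `LeafFs`, `SubFs`, and `CrossCopy.TryDirectCopy` are hypothetical names, strings stand in for `UPath`, and the "native copy" is only recorded in a log so the example has no disk side effects. The tuple order follows the snippet above (path first, then file system).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a file system exposing the proposed ResolvePath.
interface IResolvableFs
{
    // Returns the path on the innermost file system, and that file system.
    (string Path, IResolvableFs Fs) ResolvePath(string path);
    void CopyFile(string src, string dest);
}

// Innermost file system: resolves to itself and records "native" copies.
sealed class LeafFs : IResolvableFs
{
    public readonly List<string> Log = new List<string>();

    public (string Path, IResolvableFs Fs) ResolvePath(string path) => (path, this);

    public void CopyFile(string src, string dest) => Log.Add($"native copy {src} -> {dest}");
}

// Wrapper that exposes a sub-directory of an inner file system as its root.
sealed class SubFs : IResolvableFs
{
    private readonly IResolvableFs _inner;
    private readonly string _root;

    public SubFs(IResolvableFs inner, string root)
    {
        _inner = inner;
        _root = root.TrimEnd('/');
    }

    // Works for non-existing paths too: it only rewrites the path.
    public (string Path, IResolvableFs Fs) ResolvePath(string path) => _inner.ResolvePath(_root + path);

    public void CopyFile(string src, string dest) => _inner.CopyFile(_root + src, _root + dest);
}

static class CrossCopy
{
    // Returns true if both sides resolved to the same file system and the
    // copy was delegated to it; false means the caller must fall back to
    // a stream-based copy.
    public static bool TryDirectCopy(IResolvableFs srcFs, string srcPath, IResolvableFs dstFs, string dstPath)
    {
        var (srcResolved, srcFsResolved) = srcFs.ResolvePath(srcPath);
        var (dstResolved, dstFsResolved) = dstFs.ResolvePath(dstPath);
        if (!object.ReferenceEquals(srcFsResolved, dstFsResolved)) return false;
        srcFsResolved.CopyFile(srcResolved, dstResolved);
        return true;
    }
}
```

For two `SubFs` instances rooted at `/a` and `/b` on the same `LeafFs`, `CrossCopy.TryDirectCopy(a, "/x.txt", b, "/y.txt")` returns true and the leaf records `native copy /a/x.txt -> /b/y.txt`. If the two wrappers sit on different leaves, it returns false and nothing is copied.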

agocke commented 4 months ago

Sounds good to me. I can PR this if you would like.

xoofx commented 4 months ago

> Sounds good to me. I can PR this if you would like.

Sure, if it can help your use case, please do.