Arlodotexe / OwlCore.Storage

The most flexible file system abstraction, ever. Built in partnership with the UWP Community.

Bulk file operations #62

Open itsWindows11 opened 3 months ago

itsWindows11 commented 3 months ago

Bulk file operations should be standardized in the spec as an optional interface that lets the implementor write efficient methods for item operations such as copying and deleting in bulk.

For example, the FluentFTP implementation can upload and download files in bulk efficiently. Rather than making our abstraction useless in that case, we can use the implementor's underlying capabilities instead of having the consumer handle each file manually, which could be slow and create a lot of network round trips.

There are many different implementations of such a thing that don't interoperate with each other.

Parallelism when doing bulk operations can also be considered here, but parallelizing single file operations can be done by the consumer manually by opening and writing to the stream, which is mostly outside of this issue's scope.

itsWindows11 commented 3 months ago

A starter interface I'm proposing would be something like:

```cs
namespace OwlCore.Storage;

public interface IBulkOperations : IModifiableFolder
{
    Task<IList<IStorable>> BulkCopyAsync(IEnumerable<IStorable> storablesToCopy, ...);

    Task<IList<IStorable>> BulkMoveAsync(IEnumerable<IStorable> storablesToMove, ...);

    // The storables to delete must be in the folder.
    Task BulkDeleteAsync(IEnumerable<IStorable> storablesToDelete, ...);
}
```

This can be applied to any IModifiableFolder where doing bulk operations is possible and won't cause any potential API problems when interacting with a web API like Google Drive or OneDrive.

Arlodotexe commented 3 months ago

We should tackle recursion a bit more before we attempt this. Creating a fallback extension method (covering when the interface is not implemented) is again the main concern here.
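To make the fallback concern concrete, here is a minimal sketch of such an extension method for the delete case only. Everything in it is an assumption for illustration: the interfaces are simplified stand-ins rather than the real OwlCore.Storage types, the `CancellationToken` parameter is just one guess at what the elided parameters in the proposal might hold, and `DeleteManyAsync`, `InMemoryFolder`, and `Item` are hypothetical names.

```cs
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, simplified stand-ins; NOT the real OwlCore.Storage interfaces.
public interface IStorable { string Id { get; } }

public interface IModifiableFolder : IStorable
{
    Task DeleteAsync(IStorable item, CancellationToken cancellationToken = default);
}

public interface IBulkOperations : IModifiableFolder
{
    // Assumed parameter list; the proposal leaves the trailing parameters open.
    Task BulkDeleteAsync(IEnumerable<IStorable> storablesToDelete, CancellationToken cancellationToken = default);
}

public static class BulkExtensions
{
    // Uses the implementor's bulk path when the folder offers one;
    // otherwise falls back to deleting one item at a time, in order.
    public static async Task DeleteManyAsync(this IModifiableFolder folder, IEnumerable<IStorable> items, CancellationToken cancellationToken = default)
    {
        if (folder is IBulkOperations bulk)
        {
            await bulk.BulkDeleteAsync(items, cancellationToken);
            return;
        }

        foreach (var item in items)
        {
            cancellationToken.ThrowIfCancellationRequested();
            await folder.DeleteAsync(item, cancellationToken);
        }
    }
}

// Hypothetical in-memory folder with no bulk path, used only to exercise the fallback.
public sealed class InMemoryFolder : IModifiableFolder
{
    private readonly HashSet<string> _ids;
    public InMemoryFolder(IEnumerable<string> ids) => _ids = new HashSet<string>(ids);
    public string Id => "memory";
    public int Count => _ids.Count;

    public Task DeleteAsync(IStorable item, CancellationToken cancellationToken = default)
    {
        _ids.Remove(item.Id);
        return Task.CompletedTask;
    }
}

public sealed record Item(string Id) : IStorable;
```

The sequential loop is the simplest possible fallback. Whether it is an acceptable default depends on the underlying API, which this sketch does not try to decide.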

Similar to the problem in https://github.com/Arlodotexe/OwlCore.Storage/issues/35#issuecomment-1505902464, what would a fallback for bulk operations use for parallelism, if any? We'd instead have to pick whatever is best (parallel or sequential) for the underlying API, which is information only held by implementors or consumers of the library.
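One way to illustrate the trade-off is a fallback helper that does not choose a strategy itself and instead takes the degree of concurrency from the caller. This is a rough sketch under stated assumptions, not a proposed API: `BulkFallback`, `ForEachAsync`, and `maxConcurrency` are hypothetical names, and the default of 1 (sequential) is an arbitrary choice.

```cs
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class BulkFallback
{
    // Runs `operation` once per item with at most `maxConcurrency` operations in flight.
    // maxConcurrency = 1 behaves sequentially; larger values trade ordering for throughput.
    public static async Task ForEachAsync<T>(
        IEnumerable<T> items,
        Func<T, CancellationToken, Task> operation,
        int maxConcurrency = 1,
        CancellationToken cancellationToken = default)
    {
        if (maxConcurrency < 1)
            throw new ArgumentOutOfRangeException(nameof(maxConcurrency));

        // Not disposed on purpose: in-flight operations may still release it
        // if the loop below exits early because of cancellation.
        var gate = new SemaphoreSlim(maxConcurrency);
        var running = new List<Task>();

        foreach (var item in items)
        {
            // Wait for a free slot before starting the next operation.
            await gate.WaitAsync(cancellationToken);
            running.Add(RunAsync(item));
        }

        await Task.WhenAll(running);

        async Task RunAsync(T item)
        {
            try { await operation(item, cancellationToken); }
            finally { gate.Release(); }
        }
    }
}
```

A consumer who knows the backing API tolerates concurrent requests could pass a higher `maxConcurrency`. The helper still cannot know a good value on its own, which is the problem described above.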

The issues with recursion would also apply here if these are recursive: there are too many possible ways to crawl a graph for us to simply expose a 'standard' set of options as a parameter.