manuel-woelker / rust-vfs

A virtual filesystem for Rust
Apache License 2.0
384 stars 44 forks source link

Q: Why does VfsPath have a ref to the FS? #27

Open aleokdev opened 2 years ago

aleokdev commented 2 years ago

Hello! I'm one of the core maintainers of the rs-tiled crate, and have been considering this crate for a while to solve a few issues we're having with source paths and accessing virtual Tiled files in virtual filesystems.

However, I have a few questions about the design of the crate, the biggest of which being the nature of VfsPath. Unlike the path types present in std::path, they have an atomic refcount for the FS they are referring to. I don't see why this is: From my point of view, it's inconsistent with the standard types, slower than you would expect and unneccesary, since you could just pass a reference to the VFS the path is from. What am I missing here?

BTW: I can see that this crate has a lot of effort put into it and I'd love to see it reach 1.0.0! :) If necessary, I'd be able to help it reach this milestone by enforcing the Rust API Guidelines.

manuel-woelker commented 2 years ago

Hi there, thanks for the interest!

The idea is that the VFS is encapsulated in the VfsPath construct itself. This is done to improve ergonomics when dealing with paths. You only need a single object (the VfsPath) instead of a tuple of (path, VFS). It also prevents mixing up paths from unrelated filesystems. There is some overhead due to the contained Arc, cloning and the boxing of the actual filesystem, but in most use cases it should be negligible compared to the actual IO. For file-intensive applications (think something like rsync) this might not be the best abstraction performancewise.

If the overhead is indeed too large, another idea might to use the "raw" FileSystem abstraction, maybe in conjunction with camino for path manipulation. This way unnecessary clones and refcounts can be avoided.

aleokdev commented 2 years ago

I see. However, another thing that isn't quite clear to me is why are dynamic trait references so common in the crate, particularly here, in VfsPath. VFS could be switched by a T where T: FileSystem, which would improve runtime performance and simplify internals with minimal compiletime overhead. I get that runtime performance might not be your goal here, but this is a free abstraction done by the compiler that you could be getting advantage of, in my opinion.

manuel-woelker commented 2 years ago

The original implementation used type parameters like you described. However having the FileSystem Type and associated types generic made handling these types quite cumbersome, especially for combined FileSystems like OverlayFS. Functions performing IO needed to be generic over T which lead to T where T: FileSystem everywhere.

That's why I opted to keep the types simpler, at the cost of some refcounting and boxing. I assume that most operations perform a syscall and or disk IO, which should make this overhead negligible.

On thing I have been pondering is making the VfsPath generic, but default the type to `Box. This should allow an optimized code path, while still allowing an alternate, slower but more ergonomic use.