agora-org / agora

File server that accepts Lightning Network payments for downloads
Creative Commons Zero v1.0 Universal
184 stars 26 forks source link

Create a virtual file system abstraction #273

Closed neunenak closed 2 years ago

neunenak commented 2 years ago

This commit is the first stab at creating a virtual filesystem abstraction. Eventually, vfs.rs is the only place in the codebase that should use APIs that directly access the file system. Other components of the system should go through the VFS to access file data and list the contents of directories, letting the VFS take care of reading the configuration file hierarchy, ignoring hidden files, and similar bookkeeping.

casey commented 2 years ago

@soenkehahn We started working on a VFS abstraction. This just replaces a single call to Config::paid with a call to self.vfs.paid. The basic idea is:

Currently, we have a config for each directory, which governs metadata about all the files in the directory. This seems like the wrong abstraction, and the vfs should let us query metadata for each file individually. Internally, the VFS might read and cache config files, if it's slow to read a config file for each file.

I thought it would probably be too much to do this at the same time as the dynamic content PR, both because that PR is already big, and because I'm pretty sure we want this, independently of the dynamic content PR.

soenkehahn commented 2 years ago

Without having looked at the code, this sounds like a good plan to me!

neunenak commented 2 years ago

I pushed a few more commits - now there is no code in files.rs that is directly touching the filesystem, and check_path can now be made private since no code outside vfs.rs uses it. This might be a good place to cut off a self-contained mergeable commit.

One thing I've noticed is that most of the methods on Vfs accept an InputPath. file_type is the only exception, and it could probably be fairly easily refactored to avoid this. I think we might want a model where Files::serve takes in tail: &[&str], calls some method on Vfs to create an InputPath from it (which would do all error checking), and then works exclusively with InputPath, which is now guaranteed to point to a valid file. The InputPath can basically be an opaque handle to a file as far as the Files code is concerned.

casey commented 2 years ago

I pushed a few more commits - now there is no code in files.rs that is directly touching the filesystem, and check_path can now be made private since no code outside vfs.rs uses it. This might be a good place to cut off a self-contained mergeable commit.

This sounds like a great idea. If you think it's ready, go ahead as mark it as ready for review. I'd start reviewing it now, but I don't want to get ahead of myself in case you have changes or things you'd like to clean up beforehand.

One thing I've noticed is that most of the methods on Vfs accept an InputPath. file_type is the only exception, and it could probably be fairly easily refactored to avoid this. I think we might want a model where Files::serve takes in tail: &[&str], calls some method on Vfs to create an InputPath from it (which would do all error checking), and then works exclusively with InputPath, which is now guaranteed to point to a valid file. The InputPath can basically be an opaque handle to a file as far as the Files code is concerned.

That sounds like a good idea. I think in a follow-up commit, we'll create something like vfs::Path, which is only for looking up files in the VFS. Once we do that, maybe we can get rid of InputPath, or only use it for filesystem paths, and not paths that come in from requests.

Just thinking out loud, this should definitely be a follow-up PR, but vfs::Path can do things like: Be UTF-8 only, reject anything that contains . or .., and always be absolute.

neunenak commented 2 years ago

That sounds like a good idea. I think in a follow-up commit, we'll create something like vfs::Path, which is only for looking up files in the VFS. Once we do that, maybe we can get rid of InputPath, or only use it for filesystem paths, and not paths that come in from requests.

Makes sense. I think if we have vfs::Path we don't need InputPath; or alternatively we could modify InputPath so that it does everything we want vfs::Path to do.

neunenak commented 2 years ago

I'm not entirely sure how notifications on this project are working, pinging you @casey since I updated the PR.

casey commented 2 years ago

Thanks for pinging me! I think I got the notification when you made the changes, but I was behind on responding. @soenkehahn if you have time to review, take a look. There's one thing that's kind of weird, which is that we only do the iter-prefix path check in one function, but that mirrors what we were doing before. I'm kind of inclined to merge it as-is, since all the tests pass.

casey commented 2 years ago

LGTM, btw.

casey commented 2 years ago

The build failed with an error I saw on another PR, which was fixed by, I think, using a different branch with a different cache. I'm seeing if I can fix it by invalidating the cache.

casey commented 2 years ago

I wanted to tinker with something that depends on this, namely a dedicated vfs::Path type, so I addressed the review comments and hit the merge button.