dispatchrun / timecraft

The WebAssembly Time Machine
https://docs.timecraft.dev
GNU Affero General Public License v3.0

sandbox: add overlay file system #222

Closed achille-roussel closed 1 year ago

achille-roussel commented 1 year ago

This PR adds a new implementation of the sandbox.FileSystem interface that behaves like the Linux overlay file system: it combines two file system layers, a lower read layer and an upper write layer. The lower layer defines the initial state of the file system, while the upper layer serves all writes. When a write happens, the data is lazily copied from the lower layer to the upper layer, which then serves the write.
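To make the layering concrete, here is a minimal sketch of the read and copy-on-write paths. It is not the code in this PR: `layer`, `overlay`, `read`, and `appendData` are hypothetical names, and the layers are plain in-memory maps rather than sandbox.FileSystem implementations.

```go
package main

import (
	"errors"
	"fmt"
)

// layer is a hypothetical in-memory stand-in for one file system layer.
type layer map[string][]byte

// overlay combines a read-only lower layer with a writable upper layer.
type overlay struct {
	lower layer
	upper layer
}

var errNotExist = errors.New("file does not exist")

// read looks in the upper layer first and falls back to the lower layer.
func (o *overlay) read(name string) ([]byte, error) {
	if b, ok := o.upper[name]; ok {
		return b, nil
	}
	if b, ok := o.lower[name]; ok {
		return b, nil
	}
	return nil, errNotExist
}

// appendData copies the file to the upper layer on first write (the lazy
// "copy up"), then applies the write to the upper copy only.
func (o *overlay) appendData(name string, data []byte) {
	if _, ok := o.upper[name]; !ok {
		// Clone the bytes so the lower layer is never mutated.
		o.upper[name] = append([]byte(nil), o.lower[name]...)
	}
	o.upper[name] = append(o.upper[name], data...)
}

func main() {
	fs := &overlay{
		lower: layer{"a.txt": []byte("hello")},
		upper: layer{},
	}
	fs.appendData("a.txt", []byte(" world"))

	merged, _ := fs.read("a.txt")
	fmt.Printf("overlay view: %q, lower layer: %q\n", merged, fs.lower["a.txt"])
}
```

The lower layer keeps its initial content after the write, which is the property the overlay relies on.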

A significant challenge of implementing such a file system is managing permissions on directories because read-only or non-searchable directories copied to the write layer will prevent migrating files from the read layer (if they are read-only, no files can be created). I left this out for now since we don't have a use case for it in WASI (there are no Unix-style permissions), and documented this reasoning in the code.

Another challenge is removing files migrated from the read to the write layer. Simply deleting the file entry on the write layer would reset the file to its original state, since the file in the read layer would become visible again. For this reason, we use the same strategy as for the OCI layers: instead of deleting the directory entry, we create a whiteout file so it keeps masking the content of the read layer.
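A rough illustration of the whiteout idea, assuming the `.wh.` file name prefix used by OCI image layers. The `overlay` type and its `exists` and `remove` methods are hypothetical and operate on in-memory maps; the PR's actual on-disk handling may differ.

```go
package main

import (
	"fmt"
	"path"
)

// whiteoutPrefix follows the OCI image layer convention for whiteout files.
const whiteoutPrefix = ".wh."

// layer is a hypothetical in-memory stand-in for one file system layer.
type layer map[string][]byte

type overlay struct{ lower, upper layer }

// whiteout returns the name of the marker that hides name in the lower layer,
// e.g. "etc/hosts" becomes "etc/.wh.hosts".
func whiteout(name string) string {
	dir, base := path.Split(name)
	return dir + whiteoutPrefix + base
}

// exists reports whether name is visible through the overlay.
func (o *overlay) exists(name string) bool {
	if _, masked := o.upper[whiteout(name)]; masked {
		return false
	}
	if _, ok := o.upper[name]; ok {
		return true
	}
	_, ok := o.lower[name]
	return ok
}

// remove deletes the upper copy and, if the lower layer also has the file,
// leaves a whiteout marker behind so the lower copy stays hidden.
func (o *overlay) remove(name string) {
	delete(o.upper, name)
	if _, ok := o.lower[name]; ok {
		o.upper[whiteout(name)] = nil
	}
}

func main() {
	fs := &overlay{
		lower: layer{"etc/hosts": []byte("original")},
		upper: layer{"etc/hosts": []byte("modified")},
	}
	fs.remove("etc/hosts")

	_, marker := fs.upper["etc/.wh.hosts"]
	fmt.Println("visible after remove:", fs.exists("etc/hosts"))
	fmt.Println("whiteout marker present:", marker)
}
```

Without the marker, `exists` would fall through to the lower layer and the removed file would reappear in its original state.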

Lastly, the current implementation doesn't scale when writing to very large files that exist on the read layer, because we copy the entire file to the write layer before serving the write. I don't know of any use cases for this, so it should rarely be needed. Still, if we ever need to, we could improve the implementation by using sparse files on the upper layer and moving only the areas of the file that have been touched (note that some file systems already optimize copy_file_range, which we use to migrate files to the write layer, in which case the copy isn't much of a concern).
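For reference, a whole-file copy-up can be sketched as below. This is an assumption-laden illustration rather than the PR's code: `copyUp` and `demo` are hypothetical helpers, and the sketch relies on `io.Copy` between two `*os.File` values, which on Linux the Go runtime may serve with copy_file_range (with a fallback to a regular copy elsewhere).

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// copyUp is a hypothetical helper that migrates a whole file from the lower
// layer to the upper layer before the first write is served.
func copyUp(lowerPath, upperPath string) (int64, error) {
	src, err := os.Open(lowerPath)
	if err != nil {
		return 0, err
	}
	defer src.Close()

	if err := os.MkdirAll(filepath.Dir(upperPath), 0o755); err != nil {
		return 0, err
	}
	// O_EXCL makes the copy fail if the file was already migrated.
	dst, err := os.OpenFile(upperPath, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o644)
	if err != nil {
		return 0, err
	}
	n, err := io.Copy(dst, src)
	if cerr := dst.Close(); err == nil {
		err = cerr
	}
	return n, err
}

// demo builds two temporary layer directories, copies a file up, appends to
// the upper copy, and returns the resulting upper content.
func demo() (string, error) {
	root, err := os.MkdirTemp("", "overlay")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)

	lower := filepath.Join(root, "lower", "data.txt")
	upper := filepath.Join(root, "upper", "data.txt")
	if err := os.MkdirAll(filepath.Dir(lower), 0o755); err != nil {
		return "", err
	}
	if err := os.WriteFile(lower, []byte("initial state"), 0o644); err != nil {
		return "", err
	}
	if _, err := copyUp(lower, upper); err != nil {
		return "", err
	}

	f, err := os.OpenFile(upper, os.O_APPEND|os.O_WRONLY, 0)
	if err != nil {
		return "", err
	}
	if _, err := f.WriteString(" + write"); err != nil {
		f.Close()
		return "", err
	}
	if err := f.Close(); err != nil {
		return "", err
	}
	b, err := os.ReadFile(upper)
	return string(b), err
}

func main() {
	content, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(content)
}
```

The cost of `copyUp` is proportional to the file size regardless of how small the write is, which is the scaling concern described above.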

achille-roussel commented 1 year ago

Directory manipulations are a huge pain, and some issues cannot be addressed with the current approach (e.g., mutations like moving or linking files cannot be made atomic). From what I'm seeing now, the solution is likely to maintain the file system structure in memory, where we can apply the necessary synchronization to make things transactional. However, keeping a writable file system in memory is risky from a security standpoint (high exposure to malicious code causing memory exhaustion through file system mutations). It can also have a significant performance impact when creating a process, since we need to build large in-memory data structures to represent the view of the file system. An alternative approach would be to implement a user-space file system, but that's also a huge undertaking. One last option could be to figure out a hierarchical locking mechanism to make operations atomic while maintaining the file system state on disk. However, that's also highly error-prone; I won't get that right within my available time.
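As a rough sketch of the in-memory option, the structure below keeps only the name-to-location mapping in memory and guards it with a single mutex, so a rename is one atomic step. Everything here is hypothetical (`namespace`, `create`, `rename`, the `limit` field); the entry cap is only a crude stand-in for protection against memory exhaustion, and unlike POSIX rename this sketch refuses to replace an existing target.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	errNotExist = errors.New("no such entry")
	errExist    = errors.New("entry already exists")
	errNoSpace  = errors.New("entry limit reached")
)

// namespace is a hypothetical in-memory view of the file system structure.
// It maps virtual paths to backing locations; file data stays on disk.
type namespace struct {
	mu      sync.Mutex
	entries map[string]string
	limit   int // cap on entries, a crude guard against memory exhaustion
}

// create registers a new entry unless the name is taken or the cap is hit.
func (ns *namespace) create(name, backing string) error {
	ns.mu.Lock()
	defer ns.mu.Unlock()
	if _, ok := ns.entries[name]; ok {
		return errExist
	}
	if len(ns.entries) >= ns.limit {
		return errNoSpace
	}
	ns.entries[name] = backing
	return nil
}

// rename moves an entry while holding the lock, so no other operation can
// observe a state where both names or neither name exist.
func (ns *namespace) rename(oldName, newName string) error {
	ns.mu.Lock()
	defer ns.mu.Unlock()
	backing, ok := ns.entries[oldName]
	if !ok {
		return errNotExist
	}
	if _, ok := ns.entries[newName]; ok {
		return errExist
	}
	delete(ns.entries, oldName)
	ns.entries[newName] = backing
	return nil
}

func main() {
	ns := &namespace{entries: map[string]string{}, limit: 2}
	fmt.Println(ns.create("a", "lower/a"))
	fmt.Println(ns.rename("a", "b"))
	fmt.Println(ns.rename("a", "c"))
}
```

A single lock serializes every mutation, which keeps the sketch simple but would not scale; finer-grained hierarchical locking is the harder variant discussed above.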

achille-roussel commented 1 year ago

Closing in favor of https://github.com/stealthrocket/timecraft/pull/227