Open badeend opened 1 year ago
I'm up for exploring this idea more sometime in the future, but for the preview2 timeframe I have advocated that we keep interfaces and concepts that existed in preview 1 as similar as we can possibly get away with.
I think this is a good idea. This would solve https://github.com/WebAssembly/wasi-libc/issues/414
for the preview2 timeframe I have advocated that we keep interfaces and concepts that existed in preview 1 as similar as we can possibly get away with.
Yeah, that's totally fine!
The main reason why I am asking this is because I'm trying to get a grip on where WASI wants to end up regarding "preopens", so I can take that into account for wasi-sockets.
As the person who initially brought preopens to WASI, based on similar ideas in CloudABI and Capsicum, I agree. I've also come to believe that migrating wasi-filesystem to have a rooted filesystem view would be beneficial to WASI. This would be from many people's perspective a much more "normal" filesystem API. We could have an open
function where you can just pass it a string, and open a file. We could even add a builtin current working directory. Overall, this would eliminate a common source of friction when people look to port code to WASI.
This new namespace could still be expected to be sandboxed, but the sandbox could look more like a simple VFS with things mounted in it. Many use cases could just use that directly, rather than using preopens. I expect the VFS could even be populated by the existing --dir
command-line flags. Overall, it seems like this is something we can migrate to incrementally, without breaking compatibility with existing code.
Why the change in stance? Preopens were added in the Preview 1 era, at a time when there wasn't even a component model proposal, and we had a lot of questions with no answers. But today:
(But what if you want an Everything Is A File experience? We can provide it, by teaching WASI-Virt how to implement features in terms of wasi-filesystem APIs!)
So, should we start moving in this direction? I don't see any objections to this overall issue so far, but I'll do some more asking around.
This new namespace could still be expected to be sandboxed, but the sandbox could look more like a simple VFS with things mounted in it.
i concern the "simple VFS" is likely a bit more expensive than preopens, which is just a list.
Dan published a blog post about this today: Capabilities and Filesystems. In it he says the following about wasi:filesystem
:
We could add a plain
open
function, which takes a path argument and resolves it relative to a filesystem root that comes with the instance that the function is imported from. That makes thisopen
function use a link-time capability. It should support absolute paths, symlinks to absolute paths, and all the rest. And then, we could add similar non--at
versions of all the-at
functions. This new API could live in a new interface that could coexist with the current interface.
To me this seems like something worth experimenting with now that we've shipped Preview 2.
I've created a proof-of-concept of this over at: https://github.com/bytecodealliance/wasmtime/compare/main...badeend:root-directory
We could add a plain
open
function, which takes a path argument and resolves it relative to a filesystem root that comes with the instance that the function is imported from. That makes thisopen
function use a link-time capability. It should support absolute paths, symlinks to absolute paths, and all the rest. And then, we could add similar non--at
versions of all the-at
functions. This new API could live in a new interface that could coexist with the current interface.
I went with a simpler design, similar to the one outlined in the initial post. Instead of duplicating the entire path-based API-surface into non-_at
versions, the POC only has one new WIT method:
package wasi:filesystem@0.2.0;
interface preopens {
+ @deprecated
get-directories: func() -> list<tuple<descriptor, string>>;
+ /// Get the root directory of the file system.
+ /// In POSIX filesystems this is located at: `/`
+ @unstable
+ root-directory: func() -> descriptor;
}
This feels like a sweet spot to me:
root_directory.open_at()
.open(...)
== rootdir.open_at(...)
, and: symlink(...)
== rootdir.symlink_at(...)
, etc.To test it all out I disabled the preview1-style preopens and every unit test still passes, except preview2_file_read_write
. This is quite obvious when you look at its source code:
let preopens = wasi::filesystem::preopens::get_directories();
let (dir, _) = &preopens[0]; // <--- panics here
All other tests continue work because they go through the preview1-to-preview2 adapters, which I also modified:
On initialization, the adapter calls root-directory()
and appends the fd to its internal preopens list with a hardcoded path of: /
Next, it attempts to get the initial-cwd
and open that using root_directory.open_at(initial_cwd)
. If that succeeds, the fd is likewise appended to the preopens at path: .
This was enough to make the preview1 test suite pass without modification.
In terms of upgrade path; a future preview2-to-preview3 adapter could take a similar approach and return exactly 1 or 2 paths from its synthesized preopens::get-directories()
. One for /
, and optionally another one for .
Note: eagerly preopening initial_cwd
as .
was the easiest way to get the adapter to work. Don't think this is a fundamental requirement, though.
Some notes about the POC implementation:
initial_cwd
setting. This is exposed by wasi:cli/environment::initial-cwd()
.WasiCtxBuilder::preopened_dir
are converted to absolute paths based on that new initial_cwd
setting.crates/wasi/src/host/filesystem.rs
to crates/wasi/src/filesystem.rs
. So the diff is on those files is not really helpful.The POC also fixes https://github.com/WebAssembly/wasi-libc/issues/414, btw
@badeend thanks for doing this work! - Would you be up for presenting these changes at either the WASI sub-group or the component implementers meeting? It doesn't have to be anything big - but more to get all implementers on the same page about this.
I went with a simpler design, similar to the one outlined in the initial post. Instead of duplicating the entire path-based API-surface into non-
_at
versions, the POC only has one new WIT method
Oh also, I was curious about this: for 0.3, do you think we should keep the at_
APIs? Or is this change specific to 0.2.x, and we should actually remove the at_
APIs in 0.3?
do you think we should keep the at APIs? Or is this change specific to 0.2.x, and we should actually remove the at APIs in 0.3?
Even for 0.3+, I think there's still merit in keeping them. Two reasons I can think of right now:
*_at
methods are securely confined to the reference directory descriptor. This on its own is already a nice guarantee to build software on. Additionally, this simplifies the composition of components with increasingly tightened sandboxes; A parent component can open-at
a specific subdirectory and expose that descriptor as-is, without further virtualization, to a child component as their root-directory
.*_at
API's and replace them with only ambient authority equivalents, this would significantly increase the complexity of the preview1 adapters as they would need to rebuild the security model that exists today.A subtlety here is that the current *-at
functions perform this sandboxing where they prevent ".." and symlinks from escaping the given directory. In contrast, POSIX openat
doesn't do this sandboxing.
That's correct. The current WASI open-at
is most similar to Linux' openat2
with RESOLVE_BENEATH
.
To get POSIX openat
behavior with my POC, is for wasi-libc to prepend the path of the reference directory to the input path and use that to call open-at
on the root directory, correct?
In a POSIX sense, prepending the path isn't sufficient, because according to POSIX, the main purpose of openat
is to avoid race conditions:
The purpose of the openat() function is to enable opening files in directories other than the current working directory without exposure to race conditions. Any part of the path of a file could be changed in parallel to a call to open(), resulting in unspecified behavior. By opening a file descriptor for the target directory and using the openat() function it can be guaranteed that the opened file is located relative to the desired directory.
Ah right. The devil is always in the details. Do you have any ideas on how to go about this in such a way that it can be implemented efficiently but also doesn't require exposing the entire host filesystem?
Do we actually need to support openat
without RESOLVE_BENEATH
?
Do we actually need to support
openat
withoutRESOLVE_BENEATH
?
That's a good question. I've been thinking part of the goal here is to have as "normal" a filesystem API as we can get, which would include having an openat
that works like POSIX's does. But I could also see the argument that this particular detail of it isn't as important as keeping the overall API surface simpler.
Given WASI's CloudABI heritage I understand why preopens currently are the way that they are.
But they feel like something that should be just an host implementation detail. Ie. could the following also work?:
Remove
in exchange for:
which returns a single directory that represents the "default filesystem". In similar vein to:
By default this is an empty directory. But the host can "mount" specific external directories/files into it. Exactly how the contents of this "filesystem" are populated (preopened or not), would be of concern only to the host, not the client application / libc.