bytecodealliance / preview2-prototyping

Polyfill adapter for preview1-using wasm modules to call preview2 functions.
Other
77 stars 20 forks source link

fix fd_ operations on directories: keep track of DescriptorType in descriptor table #132

Closed pchickey closed 1 year ago

pchickey commented 1 year ago

For consistency with wasi-common's preview 1 implementation, we need to return an ERRNO_BADF when various fd_ operations are performed on directories. These are captured upstream in the new test dir_fd_op_failures https://github.com/bytecodealliance/wasmtime/pull/6197, which I added to this repository as well.

The error behavior for directories is very much driven by the way wasi-common decided to implement preview 1, where WasiFile and WasiDir are different types inserted into the Table, and fd_seek's implementation is to return an error if the downcast to WasiFile fails.

On the other hand, linux interprets an lseek(2) on a directory to seek through directory entries as exposed by getdents(2), rather than return an error. I don't suggest we change to that behavior - I'm glad our story for readdir has been sorted out in preview 2 - but I do bring this up to point out that this error behavior is pretty arbitrary.

Since the goal of the adapter is for test suite conformance, and I happened to encode this behavior implicitly in the test suite years ago, I'll preserve it now. (I also expanded the test suite to show the behavior more explicitly.)

Now, the adapter will call filesystem::get_type inside each path_open, and record the DescriptorType inside the File struct. We'll inspect that value in the appropriate fd_ operations to return the correct error code: it now gives an error to make a read or write stream from a directory, and Descriptors::get_file now errors on directory as well. A corresponding get_dir has been added to use in the right places.

cc @dicej

dicej commented 1 year ago

Thanks for addressing this @pchickey! I had to make an additional change to match preview 1's behavior:

diff --git a/src/lib.rs b/src/lib.rs
index 2f1970f..32480e2 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -849,6 +849,12 @@ pub unsafe extern "C" fn fd_read(
     State::with(|state| {
         match state.descriptors().get(fd)? {
             Descriptor::Streams(streams) => {
+                if let StreamType::File(file) = &streams.type_ {
+                    if let filesystem::DescriptorType::Directory = file.descriptor_type {
+                        return Err(ERRNO_ISDIR);
+                    }
+                }
+
                 let wasi_stream = streams.get_read_stream()?;

                 let read_len = u64::try_from(len).trapping_unwrap();

For reference, the test case I'm using is https://github.com/fermyon/spin-fileserver/blob/656ae1f9bc4e9128b618741bc3060da36872d5fe/src/lib.rs#L86-L114. I can extract the relevant bits into a test to add to the suite, if that's helpful.

pchickey commented 1 year ago

I decided to write a single test upstream for the present preview 1 behavior, where each time a file-only fd_ operation is called for a dirfd, it returns a ERRNO_BADF: https://github.com/bytecodealliance/wasmtime/pull/6197

I agree with Joel above that returning ERRNO_ISDIR is actually more appropriate, and I am willing to change this behavior upstream and implement it here, but for now my policy has been to conform to what preview 1 does in upstream wasi-common.

dicej commented 1 year ago

@pchickey Thanks for looking deeper into this. I suspect rustix is the one returning ERRNO_ISDIR for Preview 1. Here's an example:

fn main() {
    let mut file = std::fs::File::open("").unwrap();
    let mut buffer = Vec::new();
    std::io::Read::read_to_end(&mut file, &mut buffer).unwrap();
}
 $ cargo build --target=wasm32-wasi
 $ wasmtime run --mapdir /::/tmp ./target/wasm32-wasi/debug/empty_file_name.wasm       
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 31, kind: Uncategorized, message: "Is a directory" }', test-programs/src/bin/empty_file_name.rs:4:56

With preview 2, you instead get:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 29, kind: Uncategorized, message: "I/O error" }', test-programs/src/bin/empty_file_name.rs:4:56
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
pchickey commented 1 year ago

I finally traced out what is going on in preview 1, an fd that is actually a directory has been opened as a WasiFile (because path_open was called without O_DIRECTORY), so instead of going through the Table lookup failure, its the actual execution of WasiFile::read_vectored on the directory, which turns into libc::readv inside cap-std. That, in turn, results in EISDIR.

I'm trying to trace out what is happening on preview 2, but I have broken my tools with my tools. stand by.

pchickey commented 1 year ago

I added the new test and expanded the changes here to cover all of those error cases, and updated the PR description for the expanded scope.