rust-lang / miri

An interpreter for Rust's mid-level intermediate representation
Apache License 2.0

Windows file system shims are missing #3482

Open RalfJung opened 2 months ago

RalfJung commented 2 months ago

Miri supports accessing the file system (opening, reading, writing files; listing directories) on Unix systems, but not on Windows. This is probably the biggest remaining gap in our Windows support.

If you run into this, try cargo miri test --target x86_64-unknown-linux-gnu. This works even on a Windows host (without installing anything extra) and interprets programs on a well-supported Linux target -- file system access should generally work then. If it doesn't, please file a new issue rather than commenting on this one.

Nobody on the Miri team is very knowledgeable in Windows APIs, and a reasonable work-around exists for Windows users, so this is unlikely to be on our agenda any time soon. Contributions are always welcome though. :)

mbyt commented 2 months ago

@RalfJung thanks for the fast response in #1537 and for creating this ticket (I'll subscribe to it). Also thanks for pointing out that Miri can run with a Linux target on Windows. I was not aware of that.

Unfortunately, this workaround is not possible for me, as I depend on a large external C dependency which is wrapped via bindgen. This C dependency has no Linux support, so I cannot cross-compile for Linux.

RalfJung commented 2 months ago

Miri does not work with external C dependencies anyway, so that means it won't work for you on Windows either.

mbyt commented 2 months ago

Thanks again for your fast response.

What I currently do is run the complete unit test suite on Windows and skip the tests which (i) access the filesystem or (ii) call a foreign function. This works actually quite well and thanks to miri I already found a few places of undefined behavior. That's marvelous!

With this ticket I could also run the tests which access the filesystem and would only need to skip the tests with the foreign function.
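A minimal sketch of how such per-test skipping is commonly expressed, using the `miri` cfg that is set when code is interpreted by Miri. The functions `checksum` and `checksum_file` are hypothetical stand-ins, not code from the project discussed here.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Pure logic with no OS interaction: fine to interpret under Miri on any target.
pub fn checksum(bytes: &[u8]) -> u32 {
    bytes
        .iter()
        .fold(0u32, |acc, &b| acc.wrapping_mul(31).wrapping_add(u32::from(b)))
}

/// Reads a file, so it needs file system shims for the chosen target.
pub fn checksum_file(path: &Path) -> io::Result<u32> {
    Ok(checksum(&fs::read(path)?))
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn checksum_of_empty_input_is_zero() {
        assert_eq!(checksum(b""), 0);
    }

    // Ignored only when interpreted by Miri; a plain `cargo test` still runs it.
    #[test]
    #[cfg_attr(miri, ignore)]
    fn checksum_file_matches_in_memory_checksum() {
        let path = std::env::temp_dir().join("checksum_sketch.bin");
        std::fs::write(&path, b"ab").unwrap();
        assert_eq!(checksum_file(&path).unwrap(), checksum(b"ab"));
        std::fs::remove_file(&path).unwrap();
    }
}
```

Once file system shims exist for the target, the `cfg_attr` line on the file-based test could simply be dropped, while tests that call foreign functions would keep it.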

Regarding external C dependencies: Is this the topic of #2365?

RalfJung commented 2 months ago

Okay, so the problem is getting the build scripts to shut up and do nothing so that you can use --target x86_64-unknown-linux-gnu. On Windows the external dependencies will be built and linked even in Miri, which is useless but harmless. With a cross-target it is no longer harmless: it causes the build to fail. I can see how disabling those build scripts could become tricky. Might still be simpler than implementing the Windows FS shims though. ;)

> Regarding external C dependencies: Is this the topic of https://github.com/rust-lang/miri/issues/2365?

Yes.

> This works actually quite well and thanks to miri I already found a few places of undefined behavior. That's marvelous!

Awesome. :)

If these bugs have a public track record / issue / PR somewhere, we always appreciate PRs that add more bugs to the "bugs found with Miri" list in the readme. :D

mbyt commented 2 months ago

Thanks again for your fast response and help. This is really a great project!

> If these bugs have a public track record / issue / PR somewhere, we always appreciate PRs that add more bugs to the "bugs found with Miri" list in the readme. :D

Unfortunately the issues are in the closed-source components, so I cannot share them. However, in case they are of interest to you, here are the error messages of the discovered errors:

cgettys-microsoft commented 1 week ago

I'm also interested in this and might be willing to take a crack at adding support. @RalfJung, maybe you could clarify what the main challenge in doing this is? I'm not asking you guys to do it, just seeking to understand what I'd be getting myself into.

Is it cross-emulation, e.g. emulating the behavior of Windows FS APIs on Linux? Is it having in-depth enough knowledge of the Windows APIs involved to be able to specify well what would be UB when using them or when operating on their results (e.g. which return codes tell you that a buffer you passed into the function is now initialized memory, or how much of the buffer is initialized)? Is it the sheer number of APIs that need to be shimmed?
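To make the buffer-initialization question concrete, here is a rough sketch of the idea that only the bytes a read call reports as written become initialized. It models guest memory as `Option<u8>` (with `None` meaning uninitialized), which is an assumption made purely for illustration and not how Miri represents memory; `read_into_guest` is a hypothetical name.

```rust
use std::io::{self, Read};

/// Hypothetical model of a guest buffer: `None` marks an uninitialized byte.
pub type GuestBuf = [Option<u8>];

/// Reads from `src` and marks only the bytes actually read as initialized.
/// The tail of `dest` beyond the returned count is left untouched, so a
/// later guest read of those bytes could still be flagged as UB.
pub fn read_into_guest<R: Read>(src: &mut R, dest: &mut GuestBuf) -> io::Result<usize> {
    // Host-side scratch buffer; the guest never sees its zero padding.
    let mut tmp = vec![0u8; dest.len()];
    let n = src.read(&mut tmp)?;
    for (slot, &byte) in dest.iter_mut().zip(&tmp[..n]) {
        *slot = Some(byte);
    }
    Ok(n)
}
```

A shim for a Windows read function would additionally have to decide, per return code, whether the byte-count out-parameter itself was written, which is exactly the kind of API knowledge asked about above.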

Naively, it seems like -Zmiri-disable-isolation support wouldn't be all that tricky for someone with a decent degree of Windows API knowledge and enough time on their hands - many, many shims to add to src\shims\windows\foreign_items.rs? But I'm pretty sure I'm missing something, even reading through the source and trying to understand what the existing unix fs handling is doing.

saethlin commented 1 week ago

> Naively, it seems like -Zmiri-disable-isolation support wouldn't be all that tricky for someone with a decent degree of Windows API knowledge and enough time on their hands - many, many shims to add to src\shims\windows\foreign_items.rs?

Both Windows API knowledge and free time are in very short supply around here.

RalfJung commented 1 week ago

It's mostly about knowing the Windows APIs involved, so that all the behavior of these functions, including the UB checks, can be implemented, ideally using only the Rust std APIs. For Unix we are implementing open, read, write, close, readdir, etc. on top of std::fs::File and friends; the same would be the goal for the corresponding Windows functions.
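As a rough illustration of what "on top of std::fs::File" can mean, the sketch below keeps a table from integer descriptors to host `File` objects and forwards each call to it. `FdTable` and its methods are hypothetical names for this example and do not reflect Miri's actual internals; a Windows version would presumably key the table by handle values instead.

```rust
use std::collections::HashMap;
use std::fs::File;
use std::io::{self, Read, Write};
use std::path::Path;

/// Hypothetical descriptor table: guest-visible integers map to host files.
pub struct FdTable {
    files: HashMap<i32, File>,
    next_fd: i32,
}

fn bad_fd() -> io::Error {
    io::Error::new(io::ErrorKind::InvalidInput, "bad file descriptor")
}

impl FdTable {
    pub fn new() -> Self {
        // Start at 3 so that 0, 1 and 2 stay reserved for the standard streams.
        FdTable { files: HashMap::new(), next_fd: 3 }
    }

    /// Opens `path` for reading, or creates/truncates it when `for_writing` is set.
    pub fn open(&mut self, path: &Path, for_writing: bool) -> io::Result<i32> {
        let file = if for_writing { File::create(path)? } else { File::open(path)? };
        let fd = self.next_fd;
        self.next_fd += 1;
        self.files.insert(fd, file);
        Ok(fd)
    }

    pub fn read(&mut self, fd: i32, buf: &mut [u8]) -> io::Result<usize> {
        self.files.get_mut(&fd).ok_or_else(bad_fd)?.read(buf)
    }

    pub fn write(&mut self, fd: i32, buf: &[u8]) -> io::Result<usize> {
        self.files.get_mut(&fd).ok_or_else(bad_fd)?.write(buf)
    }

    /// Dropping the `File` closes the host file; an unknown descriptor is an error.
    pub fn close(&mut self, fd: i32) -> io::Result<()> {
        self.files.remove(&fd).map(drop).ok_or_else(bad_fd)
    }
}
```

Because everything goes through std, the same table works on whatever host the interpreter runs on; only the guest-facing function names and error codes are target-specific.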

We don't aim to implement the entire gigantic API surface of Windows, just the parts that std calls. I hope that would not be that many functions? For functions that expose a large API via opcodes/flags, but std uses only a fraction of that, we also usually implement only that fraction, at least in the first iteration.
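The "implement only the fraction that is used" approach could look roughly like the following sketch, which translates two CreateFileW-style parameters into `std::fs::OpenOptions` and rejects everything it does not recognize. The constant values match the Windows SDK definitions; the function `to_open_options`, the `Unsupported` type, and the choice of which flags to accept are assumptions for illustration, and a real call carries further parameters (share mode, security attributes, flags and attributes) that are ignored here.

```rust
use std::fs::OpenOptions;

// Access rights (subset).
pub const GENERIC_READ: u32 = 0x8000_0000;
pub const GENERIC_WRITE: u32 = 0x4000_0000;

// Creation dispositions.
pub const CREATE_NEW: u32 = 1;
pub const CREATE_ALWAYS: u32 = 2;
pub const OPEN_EXISTING: u32 = 3;
pub const OPEN_ALWAYS: u32 = 4;
pub const TRUNCATE_EXISTING: u32 = 5;

/// Hypothetical error type: reports the part of the request the shim does not model.
#[derive(Debug, PartialEq, Eq)]
pub enum Unsupported {
    Access(u32),
    Disposition(u32),
}

/// Maps the supported subset onto `OpenOptions`; anything else is refused
/// explicitly rather than silently ignored.
pub fn to_open_options(
    desired_access: u32,
    creation_disposition: u32,
) -> Result<OpenOptions, Unsupported> {
    let unknown = desired_access & !(GENERIC_READ | GENERIC_WRITE);
    if unknown != 0 {
        return Err(Unsupported::Access(unknown));
    }

    let mut opts = OpenOptions::new();
    opts.read(desired_access & GENERIC_READ != 0);
    opts.write(desired_access & GENERIC_WRITE != 0);

    match creation_disposition {
        CREATE_NEW => { opts.create_new(true); }
        CREATE_ALWAYS => { opts.create(true).truncate(true); }
        OPEN_EXISTING => {}
        OPEN_ALWAYS => { opts.create(true); }
        // std itself rejects truncation without write access when opening.
        TRUNCATE_EXISTING => { opts.truncate(true); }
        other => return Err(Unsupported::Disposition(other)),
    }
    Ok(opts)
}
```

Refusing unknown bits up front keeps the first iteration small while making it obvious which flag to add next when a program hits the limit.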

So as Ben said, Windows API knowledge and time are the main things missing. We can help with the required Miri knowledge :)

cgettys-microsoft commented 1 week ago

I wouldn't personally call myself a Windows API expert, but I definitely know folks who have expertise. But no promises as is often the case RE time :(.

Even the fraction that std uses is pretty comprehensive. Ex: WriteFile vs NtWriteFile, and ReadFileEx vs NtReadFile - it looks like pipes for std::process use one and std::fs uses the other. And so on.

It should be possible to do in terms of Rust std:: apis, I think anyway.

Here are some Microsoft docs that are relevant and publicly available (in case anyone finds them interesting / in case I don't have a chance to attack this):

https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/asplos2011-drawbridge.pdf
https://www.microsoft.com/en-us/sql-server/blog/2016/12/16/sql-server-on-linux-how-introduction/

Clearly the WSL guys have managed bi-directional interop, not just one direction. A driver called DrvFs handles at least one direction? https://learn.microsoft.com/en-us/windows/wsl/wsl-config

But none of these go into much detail at the API-call level, nor are any of them OSS as far as I know.

saethlin commented 1 week ago

The std::process APIs are not very interesting for Miri yet. Not only is spawning other processes unsupported on Linux because we would need many more shims, it's not even clear how that would be implemented. Would we spawn a new interpreter? IPC would get weird.

cgettys-microsoft commented 1 week ago

Ah, yeah, good point. Nor is std::process something I personally need.

Yeah, if it's mostly sync I/O without the fancy stuff (APC or IOCP or similar), maybe it's not too bad. Still a significant number, but tractable, with this presumably being most of the list. https://github.com/rust-lang/rust/blob/c1b336cb6b491b3be02cd821774f03af4992f413/library/std/src/sys/pal/windows/fs.rs#L1064.

ChrisDenton commented 1 week ago

We do have some odd bits in the standard library. For example, we need to guard synchronous read/write functions against asynchronous file handles, which are safe to create. And maybe don't even look at remove_dir_all until last, ha.

But if you do have any questions about the standard library then feel free to ping me.

beepster4096 commented 1 week ago

> And maybe don't even look at remove_dir_all until last, ha.

Maybe don't ever look at it. On Unix, std uses the fallback impl of it when running in miri.

RalfJung commented 1 week ago

> On Unix, std uses the fallback impl of it when running in miri.

I hope we'll get https://github.com/rust-lang/rust/issues/120426 at some point and then we can implement more POSIX APIs in Miri on top of that and start using the real impl in Miri... but anyway, that's off-topic here. :)