wasmerio / wasmer

🚀 The leading Wasm Runtime supporting WASIX and WASI
https://wasmer.io
MIT License

fs::readdir performance in wasi is not great #4475

Open imsnif opened 8 months ago

imsnif commented 8 months ago

Describe the bug

When reading the contents of a large folder inside the guest machine with e.g. fs::readdir, it seems as if wasmer is performing a metadata request for every file: https://docs.rs/wasmer-vfs/3.1.1/src/wasmer_vfs/host_fs.rs.html#80. That is a bit of a time sink, but the real time sink comes when iterating over the files in the Readdir iterator. I could not trace where this is coming from.

This results in (on my machine) wasmer taking 30+ seconds to read a folder with 4K+ files. This is a very big pain point for us, and I could not find a way around this.

I tried implementing my own read_dir using the FileSystem trait and set_fs, but the Readdir time sink is still there (this only gets rid of the relatively negligible metadata time sink I mentioned above).

When implementing my own fs, I also noticed that for some reason the read_dir function is called several times on the same folder - but again, the real time sink is the Readdir one which I could not trace.
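For anyone who wants to check how much the repeated read_dir calls on the same folder cost, here is a minimal sketch of a memoizing wrapper that counts how often the backend is actually hit. Everything in it (`DirLister`, `HostLister`, `CachingLister`, `backend_calls`) is a hypothetical name made up for illustration; it is not the wasmer_vfs `FileSystem` trait, and it does nothing about the Readdir iteration time sink described above.

```rust
use std::collections::HashMap;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a directory-listing backend.
/// This is NOT the wasmer_vfs `FileSystem` trait.
trait DirLister {
    fn list(&self, dir: &Path) -> io::Result<Vec<PathBuf>>;
}

/// Lists a directory on the host, collecting entry paths only
/// (no explicit per-entry metadata request).
struct HostLister;

impl DirLister for HostLister {
    fn list(&self, dir: &Path) -> io::Result<Vec<PathBuf>> {
        std::fs::read_dir(dir)?
            .map(|entry| entry.map(|e| e.path()))
            .collect()
    }
}

/// Wraps a lister, memoizes listings per directory, and counts how
/// often the inner lister is actually called.
struct CachingLister<L: DirLister> {
    inner: L,
    cache: HashMap<PathBuf, Vec<PathBuf>>,
    backend_calls: usize,
}

impl<L: DirLister> CachingLister<L> {
    fn new(inner: L) -> Self {
        Self { inner, cache: HashMap::new(), backend_calls: 0 }
    }

    /// Returns the cached listing, querying the backend only on a miss.
    fn list(&mut self, dir: &Path) -> io::Result<&[PathBuf]> {
        if !self.cache.contains_key(dir) {
            let entries = self.inner.list(dir)?;
            self.backend_calls += 1;
            self.cache.insert(dir.to_path_buf(), entries);
        }
        Ok(self.cache[dir].as_slice())
    }
}
```

Note that the cache never invalidates, so a listing goes stale as soon as the folder changes on the host. That is acceptable for a measurement, but a real workaround would need some invalidation policy.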

wasmer -vV; rustc -vV

    rustc 1.75.0 (82e1608df 2023-12-21)
    binary: rustc
    commit-hash: 82e1608dfa6e0b5569232559e3d385fea5a93112
    commit-date: 2023-12-21
    host: x86_64-unknown-linux-gnu
    release: 1.75.0
    LLVM version: 17.0.6

I do not have the wasmer executable installed.

Steps to reproduce

Run a guest wasm blob and read a folder with lots of files. Rust example:

    use std::{fs::read_dir, path::Path};

    // "/host" is the folder mapped into the guest.
    for entry in read_dir(Path::new("/host")).unwrap() {
        eprintln!("looping through entry: {:?}", entry);
    }
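To put numbers on the two time sinks separately, a small timing helper along these lines may be useful. It is only a sketch: `time_read_dir` and its `with_metadata` flag are hypothetical names, and it uses nothing beyond `std::fs` and `std::time`. Running it once with the flag off and once with it on shows how much of the total comes from plain iteration and how much from an explicit metadata request per entry.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{Duration, Instant};

/// Iterates over `dir` and returns (entry count, elapsed time).
/// When `with_metadata` is true, one metadata request is issued per entry.
fn time_read_dir(dir: &Path, with_metadata: bool) -> io::Result<(usize, Duration)> {
    let start = Instant::now();
    let mut count = 0;
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        if with_metadata {
            // Explicit extra request per entry; the result is discarded.
            let _ = entry.metadata()?;
        }
        count += 1;
    }
    Ok((count, start.elapsed()))
}
```

Calling `time_read_dir(Path::new("/host"), false)` and then the same with `true` inside the guest, and printing both durations with `eprintln!`, gives a rough split between the two costs. Running the same helper natively on the host gives a baseline to compare against.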

Expected behavior

Getting this down to <1s would be super nice.

Actual behavior

Reading a folder with 4K+ files takes 30+ seconds on my machine.

Additional context

This is a very big blocker for us in Zellij, and since we can't upgrade due to not wanting to adopt wasix, I'm a bit at a loss as to what to do about this. I could not mitigate this behavior by implementing my own fs as mentioned above and would really appreciate recommendations for a workaround. Thanks!!

linear[bot] commented 8 months ago

RUN-103 fs::readdir performance in wasi is not great

syrusakbary commented 8 months ago

Hey @imsnif, we are currently planning a refactor of the filesystem, which should greatly improve the speed for this case (and many others).

imsnif commented 8 months ago

Thanks @syrusakbary! Will I be able to upgrade without adopting wasix?

syrusakbary commented 8 months ago

Hey @imsnif,

> Will I be able to upgrade without adopting wasix?

WASIX is a superset of WASI. One thing we can do is expose only the WASI layer, so even if you use the wasix crate, you don't opt in to the WASIX-specific features (basically, only WASI programs will be able to run). Would that work for you, @imsnif?

imsnif commented 8 months ago

Well, as I mentioned elsewhere: when we tried to use wasix in the past, we got a hard crash when one of the maintainers left their laptop running and it went into sleep mode. I totally get that bugs (even unexpected bugs) happen and are an inherent part of development, but we are very concerned about the additional issue surface this would bring to our app.

This is to say that: if the separation is hermetic enough that we can trust it - we'd be happy to consider the upgrade. It would also have to exclude the relevant dependencies - not just at compile time, but also as part of the crates in order not to make extra work for our packagers.

Wasmer has served us very well over the years and we'd be happy to be able to keep using it. It's very much a shame for us that we are stuck on an older version.

theduke commented 8 months ago

@imsnif while the new implementation is still in the planning stages, it will be much more layered and will allow fine-grained opt-in to certain parts, so limiting your compilation to WASI without the WASIX parts will be possible.

imsnif commented 8 months ago

> @imsnif while the new implementation is still in the planning stages, it will be much more layered and will allow fine-grained opt-in to certain parts, so limiting your compilation to WASI without the WASIX parts will be possible.

That's great to hear! Just to be clear, so there aren't any misunderstandings: this is not just about compilation but also about dependencies. We would not want to adopt a crate with lots of dependencies we're not using (e.g. auxiliary dependencies for the networking stack or whatnot).