Closed Shnatsel closed 4 months ago
It seems that ahash
is only used when the std
feature is disabled: https://github.com/bytecodealliance/wasm-tools/blob/df542a839b1fc6c1345c68abf5da21ca9ef5f9fe/crates/wasmparser/src/map.rs#L112-L126
Something also enables the ahash
feature in hashbrown
in the final Cargo.toml
uploaded to crates.io, although I'm not sure what does that exactly:
[dependencies.hashbrown]
version = "0.14.3"
features = ["ahash"]
default-features = false
You can inspect the Cargo.toml
uploaded to crates.io like this:
cargo install cargo-dl
cargo dl -x wasmparser
cat wasmparser/Cargo.toml
Thanks for the report! Out of curiosity, have you noticed a measurable size increase with this support? Or otherwise are you mostly worried about the new deps?
Ideally this would be a no_std
-only dependency but unfortunately Cargo doesn't have great support for that I think (or at least not that I know of). Otherwise the next-best-thing would probably be to work with ahash
upstream to slim the set of dependencies. The ahash
dependency isn't just used by wasmparser
directly but also through hashbrown
where the dependency is unconditional too.
Thank you for the quick response!
Out of curiosity, have you noticed a measurable size increase with this support?
My use case is very basic, and it seems ahash and all associated code just gets removed by dead code elimination. So now that I've measured it, the binary size overhead doesn't seem to be significant (when using only a small subset of wasmparser
, anyway).
Or otherwise are you mostly worried about the new deps?
Yes. I want to make my code very easy to adopt for organizations, which often involves a security review of all dependencies. Until I added wasmparser
, the dependency tree was very small and 100% safe Rust all the way down (except for std
). I even rolled my own parser for ELF/PE/Mach-O so that I could guarantee binary parsing with no unsafe code, heap allocations or panics.
Now that wasmparser
added large crates with unsafe
code to the tree, adopting my project will be a lot more expensive.
The
ahash
dependency isn't just used bywasmparser
directly but also throughhashbrown
where the dependency is unconditional too.
ahash
is an optional dependency for Hashbrown, see https://github.com/rust-lang/hashbrown/blob/835936584e33480f6afbf1ae9be7151c1b27d595/Cargo.toml#L17
unfortunately Cargo doesn't have great support for that I think (or at least not that I know of)
Indeed: https://github.com/rust-lang/cargo/issues/1839
I think our best bet is to have std
and no_std
features and require the user to choose one. That would be a semver-breaking change, but given the v0.206.0 version number it doesn't look like frequent semver breaks are a concern.
Ok thanks for confirming on the binary size piece, that confirms my gut feeling here too. Also definitely understood about the dependency tree, that's a concern for us in Wasmtime as well!
Having to choose between std/no_std won't really compose well I think (e.g. Cargo's feature support doesn't support mutally exclusive features), but let me perhaps ask another question for this. Does your use case involve validating a WebAssembly module or just parsing it? If it's just parsing then I'm pretty sure we could easily put all the hash-map-bits and such behind a new "validate" feature or similar. That means you'd still be able to parse but wasmparser wouldn't be able to validate though.
If that'd work for your use case then that might be the best way forward?
Yes, that would be excellent! I do not need validation at all, I just need to parse the high-level structure and extract specific sections. Being able to just disable the validation feature with its hashmaps and other heap allocations would be perfect.
Ok sounds good, I'll work on splitting that out in the near future. And also, yes, we don't currently have semver-compatible releases, so this'll be a 207.0.0 version.
Thanks a lot!
And yes, semver is a non-issue. I have anticipated that and didn't expose wasmparser
types publicly in my libraries.
v0.207 with this change has been released, and I've upgraded my tools to it: https://github.com/rust-secure-code/cargo-auditable/pull/150
It works great for my use case. Thanks a lot!
The
ahash
dependency ofwasmparser
crate pulls in a considerable amount of dependencies. Here's an excerpt fromcargo tree
to illustrate it:This adds binary size overhead and introduces more unsafe code into the build.
I'd like to have the option to remove
ahash
and its dependencies from the dependency tree for use in high-assurance environments.