Open RReverser opened 4 years ago
A curious bug!
I don't think your input/output are for the same file, using the input file in the link you gisted above I get:
input
optipng.4e77b.wasm: file format wasm 0x1
Sections:
Type start=0x0000000b end=0x00000101 (size=0x000000f6) count: 33
Import start=0x00000104 end=0x000006d0 (size=0x000005cc) count: 68
Function start=0x000006d3 end=0x00000976 (size=0x000002a3) count: 673
Global start=0x00000978 end=0x0000099c (size=0x00000024) count: 7
Export start=0x0000099f end=0x00000b3d (size=0x0000019e) count: 26
Elem start=0x00000b40 end=0x00000cce (size=0x0000018e) count: 1
Code start=0x00000cd2 end=0x0003c473 (size=0x0003b7a1) count: 673
Data start=0x0003c477 end=0x00046007 (size=0x00009b90) count: 54
output
optipng.out.wasm: file format wasm 0x1
Sections:
Type start=0x0000000e end=0x00000104 (size=0x000000f6) count: 33
Import start=0x0000010a end=0x000006d6 (size=0x000005cc) count: 68
Function start=0x000006dc end=0x0000097f (size=0x000002a3) count: 673
Global start=0x00000985 end=0x000009a9 (size=0x00000024) count: 7
Export start=0x000009af end=0x00000b4d (size=0x0000019e) count: 26
Elem start=0x00000b53 end=0x00000d1d (size=0x000001ca) count: 1
Code start=0x00000d23 end=0x0003d372 (size=0x0003c64f) count: 673
Data start=0x0003d378 end=0x00046f08 (size=0x00009b90) count: 54
The differences here are small differences in the elem and code sections. The elem section seems entirely accounted for because walrus will renumber function indices. The code section I presume is similar. Walrus sorts functions biggest first to get engines with streaming translation the chance to start on the big functions first, and it looks like that happens to cause the size to increase here, presumably because small functions are referenced far more often than large functions.
Walrus isn't really a size-optimizing compiler. It's possible that we could add an entirely new emission mode to optimize for this though.
I don't think your input/output are for the same file, using the input file in the link you gisted above I get:
Sorry, I'm not sure what you meant here. Your output seems to match mine at the first glance? (except I accidentally omitted last line in wasm-objdump for input when copy-pasting).
and it looks like that happens to cause the size to increase here, presumably because small functions are referenced far more often than large functions
Yeah, could be, although surprised it makes such a big difference.
I think size is often important for Wasm users, so adding such mode would be benefitial, but if not, maybe it's worth documenting as explicit non-goal in docs or README?
I also wonder if it would be possible to have this reordering as an optional pass, so that by default functions would appear in original order and roundtrip would be preserved more closely. This would basically delegate order of most functions to the original Wasm compiler, which might choose to optimise for speed or perf as well.
That would require keeping track of the original order, which we don't currently do.
@fitzgen As far as I understand from the code, functions should be already naturally added in original order to the arena during parsing, and that order is preserved throughout most basic transformations.
The actual reordering seems to be happening just here - https://github.com/rustwasm/walrus/blob/35bbfb0c450263046c8481a04c967243351d77de/src/module/functions/mod.rs#L444 - during the emit phase.
Yes the difference is that the input binary you listed above doesn't look like it has a data section, but the data section is there (and is the same size in both)
I don't really think this warrants any notes in the README or anything like that, it's not like we're not size-averse or we specifically don't optimize for size in walrus.
This is a micro-optimization in one use case where resorting functions causes bloat. If you want to 100% optimize for size that's not really what walrus
is built for, and that's why there's tools like wasm-opt -Os
which probably account for this.
doesn't look like it has a data section, but the data section is there (and is the same size in both)
Oh yeah, sorry, as I said, it was just a copy-paste mistake.
I don't really think this warrants any notes in the README or anything like that, it's not like we're not size-averse or we specifically don't optimize for size in walrus.
To me, it's not so much about intentionally optimising for the size, as about roundtrip producing different results. Other tools either usually preserve it by default, or, if they also use a different internal IR - such as Binaryen - clearly document that roundtrip can produce very different Wasm than the input.
I think in this case, as a minimum, it would make sense to follow what Binaryen does with just documenting this constraint. On the other hand, the fix to keep original order (and keep reordering as a separate option) seems so simple that it seems worth just implementing it instead.
I experimented with this a bit, and, while disabling function reordering decreases size a bit closer to original, it seems that the main overhead remains and is caused by locals reordering instead.
I think it's reasonable to document yeah that walrus
will not exactly roundtrip a file when it's read and then serialized, that's not a design goal of this crate.
Describe the Bug
Parsing and re-emitting WebAssembly even without modifications often produces larger binary than the original.
Steps to Reproduce
Expected Behavior
Output is same (or maybe even smaller) than the original.
Actual Behavior
Input file is 286727 bytes long, and output is 290568, making it 3.75KB larger.
Additional Context
wasm-objdump -h
dump for input:and output:
As you can see, most sections remained unchanged, but Elem and Code became larger.
Compression sizes differ as well (even if less so) - e.g. with Brotli input is 103410 bytes and output is 103887 bytes.