ctm / mb2-doc

Mb2, poker software
https://devctm.com

Investigate WASM size #549

Closed ctm closed 2 years ago

ctm commented 3 years ago

See if there's anything that can make a dramatic reduction in Mb2's WASM size.

Mb2's WASM size is now 10 MB, which gzips down to 1.3 MB. However, most of that size appears to be due to the monomorphization of serde_cbor:

[master]% twiggy top *.wasm > /tmp/t

produces (non-serde_cbor lines snipped):

 Shallow Bytes │ Shallow % │ Item
───────────────┼───────────┼────────────────────────────────────────────────────────────────
         33197 ┊     0.28% ┊ serde_cbor::de::Deserializer<R>::parse_value::h00b89f9bd05117aa
         23495 ┊     0.20% ┊ serde_cbor::de::Deserializer<R>::parse_value::h135739f65096c952
         22517 ┊     0.19% ┊ serde_cbor::de::Deserializer<R>::parse_value::h281d1a56f6779899
         20478 ┊     0.17% ┊ serde_cbor::de::Deserializer<R>::parse_value::h44e5fdb70659823e
         20428 ┊     0.17% ┊ serde_cbor::de::Deserializer<R>::parse_value::h17470e02e2f35752
         20026 ┊     0.17% ┊ serde_cbor::de::Deserializer<R>::parse_value::h113e867f598d20a5
         20024 ┊     0.17% ┊ serde_cbor::de::Deserializer<R>::parse_value::h451dc71f4057c09e
         19777 ┊     0.17% ┊ serde_cbor::de::Deserializer<R>::parse_value::h064408d9b92791db
         19567 ┊     0.16% ┊ serde_cbor::de::Deserializer<R>::parse_value::hc255922e2ef73d95
...

And there's a lot of it:

[master]% grep 'serde_cbor::' /tmp/t > /tmp/cb
[master]% grep ::Deserial /tmp/cb > /tmp/de
[master]% wc -l /tmp/cb /tmp/de
    2413 /tmp/cb
    2139 /tmp/de
    4552 total
[master]% awk '{s+=$1}END{print s}' /tmp/cb
6687826
[master]% awk '{s+=$1}END{print s}' /tmp/de
6540596
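
The awk one-liners above sum the first whitespace-separated field (the shallow byte count) of every matching line. As a rough sketch, the same tally could be written in Rust. `sum_shallow_bytes` is a hypothetical helper name, not part of twiggy, and the sample rows are copied from the table quoted earlier.

```rust
/// Sum the first whitespace-separated field of every line that contains
/// `needle`, mirroring `grep needle | awk '{s+=$1}END{print s}'`.
/// Lines whose first field is not a number (headers, rules) are skipped.
/// `sum_shallow_bytes` is a hypothetical helper, not part of twiggy.
pub fn sum_shallow_bytes(report: &str, needle: &str) -> u64 {
    report
        .lines()
        .filter(|line| line.contains(needle))
        .filter_map(|line| line.split_whitespace().next())
        .filter_map(|field| field.parse::<u64>().ok())
        .sum()
}

/// Three rows copied from the `twiggy top` output quoted earlier.
pub const SAMPLE: &str = "\
         33197 ┊     0.28% ┊ serde_cbor::de::Deserializer<R>::parse_value::h00b89f9bd05117aa
         23495 ┊     0.20% ┊ serde_cbor::de::Deserializer<R>::parse_value::h135739f65096c952
         22517 ┊     0.19% ┊ serde_cbor::de::Deserializer<R>::parse_value::h281d1a56f6779899
";
```

Calling `sum_shallow_bytes(SAMPLE, "serde_cbor::")` adds up just those three rows. On the full report, the totals above mean that 6540596 of the 6687826 serde_cbor bytes (about 98%) are Deserializer code.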

Between web searching, playing around and asking knowledgeable people, my guess is that I can shrink the vast majority of the deserialization code, although it may require switching away from serde_cbor.

I'm not marking this as high priority yet, nor am I attaching it to any particular milestone, but I don't want to forget about this, so I'm going to (perhaps misleadingly) tag it as easy at least until I ask around and am told that it's not. I.e., I'm assuming that once I get around to asking, someone will point me to an easy solution.

ctm commented 3 years ago

After skimming an article that compares serializers, it looks like MessagePack is worth looking into, although I think the bloat comes from the way serde does things rather than anything serde_cbor specific, so switching to MessagePack and still using serde is likely to have the same bloat issue. OTOH, some of the serde alternatives, like miniserde, only serialize to JSON, which results in way too many bytes being sent over the wire.
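
To illustrate the kind of bloat in question: Rust compiles a separate copy of a generic function for each concrete type it is used with. The sketch below is a toy, not serde's or serde_cbor's actual code, and `FromBytes`, `Seat`, `Chips`, `parse_generic` and `read_u32_le` are all hypothetical names. It shows one commonly used mitigation: keep the generic shell thin and push the shared work into a non-generic function, so only the thin shell is duplicated.

```rust
/// Hypothetical trait standing in for "a type that can be decoded".
pub trait FromBytes: Sized {
    fn from_u32(raw: u32) -> Self;
}

/// Two hypothetical message field types.
pub struct Seat(pub u32);
pub struct Chips(pub u32);

impl FromBytes for Seat {
    fn from_u32(raw: u32) -> Self {
        Seat(raw)
    }
}

impl FromBytes for Chips {
    fn from_u32(raw: u32) -> Self {
        Chips(raw)
    }
}

/// Non-generic: compiled once, no matter how many types use it.
fn read_u32_le(input: &[u8]) -> Option<u32> {
    match input {
        [a, b, c, d, ..] => Some(u32::from_le_bytes([*a, *b, *c, *d])),
        _ => None,
    }
}

/// Generic: the compiler emits one copy per `T`, but each copy is tiny
/// because the byte handling lives in `read_u32_le`.
pub fn parse_generic<T: FromBytes>(input: &[u8]) -> Option<T> {
    read_u32_le(input).map(T::from_u32)
}
```

Here `parse_generic::<Seat>` and `parse_generic::<Chips>` are two instantiations that share one `read_u32_le`. Whether a given serde format crate can be restructured this way is a separate question.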

ctm commented 2 years ago

I was looking at this issue earlier today and once again I ran into the serializer benchmark. However, I don't think that benchmark's use of cbor is using the settings mb2 uses to get really small packets. Furthermore, it's not clear which (if any) of the alternatives can avoid the code bloat. So …

One possibility would be for me to analyze all the traffic that mb2 has sent and look for the packets that make up the bulk. Then I could hand-craft the important packets and bolt on something else (e.g., miniserde, serde_lite) for the rest.
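
A minimal sketch of that traffic analysis, assuming a log of (packet kind, byte length) pairs is available. The log shape, the packet kind names used with it and the helper `bytes_by_kind` are all hypothetical and not taken from mb2.

```rust
use std::collections::BTreeMap;

/// Tally total bytes per packet kind and rank the kinds largest first.
/// Kinds with equal totals stay in alphabetical order.
/// `bytes_by_kind` and the log format are hypothetical, not mb2 code.
pub fn bytes_by_kind<'a>(log: &[(&'a str, usize)]) -> Vec<(&'a str, usize)> {
    let mut totals: BTreeMap<&'a str, usize> = BTreeMap::new();
    for &(kind, len) in log {
        *totals.entry(kind).or_insert(0) += len;
    }
    let mut ranked: Vec<(&'a str, usize)> = totals.into_iter().collect();
    // Stable sort: descending by bytes, ties keep the map's name order.
    ranked.sort_by(|a, b| b.1.cmp(&a.1));
    ranked
}
```

The kinds at the top of the ranking would be the candidates for hand-crafted encoding.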

This still isn't high priority, but it's something I think about fairly often whether I want to or not!

ctm commented 2 years ago

Not easy.

ctm commented 2 years ago

FWIW, switching to postcard (#991) dropped the wasm size from 9.39 MiB to 3.95 MiB, using the options we currently build with. We can get that a little smaller by changing some options. The top-level Cargo.toml says:

[profile.release]
# If we want the smallest, fastest release, we want lto to be true, opt-level
# to be 'z', debug to be false and overflow-checks to be false.  There's also
# some wasm-opt stuff to look at in deploy.
# lto = true
# opt-level = 'z'
debug = true
overflow-checks = true

I don't feel comfortable changing those right now, but with lto true, opt-level 'z', debug false and overflow-checks false, we get a 3.23 MiB wasm file:

-rw-r--r-- 1 ctm staff 3387283 Jun 24 07:48 ./pkg/index_bg.wasm

which we can make slightly smaller with

wasm-opt ./pkg/index_bg.wasm -o ./pkg/index_bg.wasm -Oz --strip-debug --strip-producers --vacuum

-rw-r--r-- 1 ctm staff 3356469 Jun 24 07:51 ./pkg/index_bg.wasm
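
As a quick check of the figures above, the byte counts in the two `ls` lines can be converted to MiB and compared. The names `mib`, `BEFORE_WASM_OPT`, `AFTER_WASM_OPT` and `wasm_opt_savings` are just for illustration.

```rust
/// Convert a byte count to MiB (1 MiB = 1024 * 1024 bytes).
pub fn mib(bytes: u64) -> f64 {
    bytes as f64 / (1024.0 * 1024.0)
}

/// Byte counts copied from the two `ls -l` lines above.
pub const BEFORE_WASM_OPT: u64 = 3_387_283;
pub const AFTER_WASM_OPT: u64 = 3_356_469;

/// Bytes removed by the wasm-opt pass.
pub fn wasm_opt_savings() -> u64 {
    BEFORE_WASM_OPT - AFTER_WASM_OPT
}
```

That works out to 3.23 MiB before and 3.20 MiB after, so the wasm-opt pass saves 30,814 bytes, a bit under 1%.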

FWIW, twiggy monos now produces:

             123156 ┊          0.72% ┊   125118 ┊  0.73% ┊ <postcard::de::deserializer::SeqAccess<F> as serde::de::SeqAccess>::next_element_seed
                    ┊                ┊     1962 ┊  0.01% ┊     <postcard::de::deserializer::SeqAccess<F> as serde::de::SeqAccess>::next_element_seed::h1bdbc5ba47f474d0
                    ┊                ┊     1962 ┊  0.01% ┊     <postcard::de::deserializer::SeqAccess<F> as serde::de::SeqAccess>::next_element_seed::ha836401f9ea94d78
                    ┊                ┊     1751 ┊  0.01% ┊     <postcard::de::deserializer::SeqAccess<F> as serde::de::SeqAccess>::next_element_seed::h26129afaebace84f
                    ┊                ┊     1541 ┊  0.01% ┊     <postcard::de::deserializer::SeqAccess<F> as serde::de::SeqAccess>::next_element_seed::hcd73950c5dcdceac
                    ┊                ┊     1541 ┊  0.01% ┊     <postcard::de::deserializer::SeqAccess<F> as serde::de::SeqAccess>::next_element_seed::heb31c825b338be63
                    ┊                ┊     1535 ┊  0.01% ┊     <postcard::de::deserializer::SeqAccess<F> as serde::de::SeqAccess>::next_element_seed::h8f13951609e1cee6
                    ┊                ┊     1535 ┊  0.01% ┊     <postcard::de::deserializer::SeqAccess<F> as serde::de::SeqAccess>::next_element_seed::h94bad777ce21f053
                    ┊                ┊     1535 ┊  0.01% ┊     <postcard::de::deserializer::SeqAccess<F> as serde::de::SeqAccess>::next_element_seed::hdac7e927e956babd
                    ┊                ┊     1330 ┊  0.01% ┊     <postcard::de::deserializer::SeqAccess<F> as serde::de::SeqAccess>::next_element_seed::hbb2496fc4c8dfe35
                    ┊                ┊     1129 ┊  0.01% ┊     <postcard::de::deserializer::SeqAccess<F> as serde::de::SeqAccess>::next_element_seed::h14cc714b1fc369c1
                    ┊                ┊   109297 ┊  0.64% ┊     ... and 153 more.

The 123,156 in the first column is the approximate bloat, which I believe is essentially the number of bytes that could be saved by a hypothetical non-monomorphized implementation.
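
The numbers in the listing are consistent with the approximate bloat being the total size of all instantiations minus the size of the largest one, i.e. what would be saved if a single copy could serve every caller. That reading is inferred from the figures above and is not a statement of twiggy's exact definition. The sketch assumes the listing is sorted largest first, so the largest instantiation is among the ten shown; `LISTED`, `REMAINDER`, `total_bytes` and `approx_bloat` are hypothetical names.

```rust
/// Sizes of the ten instantiations shown in the listing above.
pub const LISTED: [u64; 10] = [1962, 1962, 1751, 1541, 1541, 1535, 1535, 1535, 1330, 1129];

/// Combined size of the rows summarized as "... and 153 more."
pub const REMAINDER: u64 = 109_297;

/// Total bytes across all instantiations of the generic function.
pub fn total_bytes() -> u64 {
    LISTED.iter().sum::<u64>() + REMAINDER
}

/// Total minus the largest single instantiation (assumed to be listed).
pub fn approx_bloat() -> u64 {
    let largest = LISTED.iter().copied().max().unwrap_or(0);
    total_bytes() - largest
}
```

With these inputs the total matches the 125118 in the third column and the difference matches the 123156 in the first.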

I don't think there's anything more worth doing on this issue, so I'm closing it.