sffc opened 1 year ago
Note: the regressions are in `work_log+opt.wasm` and `tui+opt.wasm`.
First step is to bisect.
Once bisect is finished, we should add this back to the triage list.
Command line:

```shell
rm -r wasmpkg; cargo make wasm-opt; ll wasmpkg/wasm-opt/work_log+opt.wasm; ll wasmpkg/wasm-opt/tui+opt.wasm
```
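Since several ranges of commits fail to build, the bisection could be driven by `git bisect run`, which treats exit code 125 as "skip this commit". A minimal sketch, where the `size_ok` helper name and the 200000-byte threshold are my own assumptions, chosen between the observed pre- and post-regression sizes:

```shell
# Hypothetical size check for use with `git bisect run`: succeeds ("good")
# while the artifact is below the threshold, fails ("bad") once it regresses.
size_ok() {
  # $1 = path to artifact, $2 = threshold in bytes
  [ "$(wc -c < "$1")" -lt "$2" ]
}

# Intended bisect script (not run here; the build is the expensive part):
#   rm -rf wasmpkg
#   cargo make wasm-opt || exit 125   # 125 = skip commits that fail to build
#   size_ok wasmpkg/wasm-opt/work_log+opt.wasm 200000
```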
Some commits failed to build. Commits from around 2023-02 failed with:

```
error: failed to run custom build command for `sys-info v0.9.1`

Caused by:
  process didn't exit successfully: `/.../icu4x/target/release-opt-size/build/sys-info-62b5cf4053fe346f/build-script-build` (exit status: 101)
  --- stderr
  thread 'main' panicked at 'unsupported system: unknown', /.../.cargo/registry/src/github.com-1ecc6299db9ec823/sys-info-0.9.1/build.rs:38:14
```
Other commits from 2023-04 to 2023-08, starting with ca28250f7344c9d2e7dcd7ae134a52503d9f50a1 and fixed in c8c91262f29f02bde9fd181af6041a2fc9fa7523, failed with:

```
   Compiling criterion v0.4.0
error: Rayon cannot be used when targeting wasi32. Try disabling default features.
  --> /.../.cargo/registry/src/github.com-1ecc6299db9ec823/criterion-0.4.0/src/lib.rs:31:1
   |
31 | compile_error!("Rayon cannot be used when targeting wasi32. Try disabling default features.");
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```
Bisection results from the commits that built out of the box, in reverse chronological order:
```
381010911bfe11a56b49edb55962560e7f0155ca
-rw-r--r-- 1 sffc primarygroup  378730 Oct 9 17:24 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 3183827 Oct 9 17:24 wasmpkg/wasm-opt/tui+opt.wasm

13b6b4e9208c3a89724c731f68180ae79423831f
-rw-r--r-- 1 sffc primarygroup  358905 Oct 9 17:40 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 3151534 Oct 9 17:40 wasmpkg/wasm-opt/tui+opt.wasm

ea1ba9f3e02cac6bbe01c038268d87379d80a98f
-rw-r--r-- 1 sffc primarygroup  356957 Oct 9 17:57 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 3154623 Oct 9 17:57 wasmpkg/wasm-opt/tui+opt.wasm

c8c91262f29f02bde9fd181af6041a2fc9fa7523
-rw-r--r-- 1 sffc primarygroup  366215 Oct 9 18:03 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 3165598 Oct 9 18:03 wasmpkg/wasm-opt/tui+opt.wasm
```

(long period of silence due to a bad lockfile)

```
eb79e5f67a6129430a1a8566c9efa86debb76c78
-rw-r--r-- 1 sffc primarygroup  119364 Oct 9 17:53 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup  513517 Oct 9 17:53 wasmpkg/wasm-opt/tui+opt.wasm

f6bfe8bc51dbc1c9efe50bbfd4c5757f92a898ec
-rw-r--r-- 1 sffc primarygroup  119418 Oct 9 17:47 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup  520134 Oct 9 17:47 wasmpkg/wasm-opt/tui+opt.wasm

cfe1f7138da5ece82a69e63c35dd73e2213bc659
-rw-r--r-- 1 sffc primarygroup  118988 Oct 9 17:28 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup  519593 Oct 9 17:28 wasmpkg/wasm-opt/tui+opt.wasm
```
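Quantifying the jump between the last small build (eb79e5f6) and the first large one (c8c91262), using the sizes from the listings above:

```shell
# Sizes in bytes copied from the `ll` listings above
# (eb79e5f6 before the jump, c8c91262 after it).
awk 'BEGIN {
  printf "work_log: %d -> %d (%.1fx)\n", 119364, 366215, 366215 / 119364
  printf "tui:      %d -> %d (%.1fx)\n", 513517, 3165598, 3165598 / 513517
}'
```

So work_log roughly tripled and tui grew more than sixfold across the gap.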
So the regression was during the summer, sometime between 2023-04 and 2023-08.
I will try using a proxy metric: the Linux example binary size with the `release-opt-size` profile:

```shell
cargo +nightly-2022-12-26 build --profile release-opt-size --examples --workspace --features serde --exclude icu_datagen --exclude icu_capi; ll target/release-opt-size/examples/work_log; ll target/release-opt-size/examples/tui
```
```
c8c91262f29f02bde9fd181af6041a2fc9fa7523
-rwxr-xr-x 2 sffc primarygroup 3521488 Oct 9 18:07 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 6749624 Oct 9 18:07 target/release-opt-size/examples/tui*

4ed5dbd3b1242315971eccf8a83b11e5a0509ee8
-rwxr-xr-x 2 sffc primarygroup 3529376 Oct 9 18:13 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 6754240 Oct 9 18:13 target/release-opt-size/examples/tui*
```
(In this period there is an error while building the examples; I don't know how that could have landed. Perhaps a feature resolution issue.)
```
7f10b155cffcd1bf696a0c71bf128b92883357ae
-rwxr-xr-x 2 sffc primarygroup 1960928 Oct 9 18:20 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2409400 Oct 9 18:20 target/release-opt-size/examples/tui*

004581fef4fc302dda43ac4123270b9070c61541
-rwxr-xr-x 2 sffc primarygroup 1949864 Oct 9 18:20 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2396512 Oct 9 18:20 target/release-opt-size/examples/tui*

09ad8b7dd898d5023c6eda35cf65e3357b4debc1
-rwxr-xr-x 2 sffc primarygroup 1945776 Oct 9 18:18 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2388376 Oct 9 18:18 target/release-opt-size/examples/tui*

244a4990b329f9b8cd5678bb8affe71ce0238b16
-rwxr-xr-x 2 sffc primarygroup 1945776 Oct 9 18:14 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2388376 Oct 9 18:14 target/release-opt-size/examples/tui*

d8cc2f3c8a289e4c47192af7df7e3daa30454986
-rwxr-xr-x 2 sffc primarygroup 1957376 Oct 9 18:12 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2387232 Oct 9 18:12 target/release-opt-size/examples/tui*

eb79e5f67a6129430a1a8566c9efa86debb76c78
-rwxr-xr-x 2 sffc primarygroup 1963344 Oct 9 18:10 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2398008 Oct 9 18:10 target/release-opt-size/examples/tui*
```
I did some manual work to get the intermediate broken commits to build. Doing so, I found that for `work_log`, there are two commits that together account for almost the entire regression:

- icu_datetime (#3591)

In other words, the examples now use compiled data, whereas before they used buffer provider data from a buffer that was not included as part of the measurement. Mystery solved.
Now, the follow-up question is, do we want to continue measuring the wasm binsize with data included, or do we want to measure it without data? If we measure it with data, we're measuring something that "actually works", whereas if we measure it without data, we can detect regressions in code size at a much more granular level.
Ah of course.
If we call these code size benchmarks we should use empty data. We could have a data size benchmark v2 that is the difference between all data and no data.
Actually, for singleton keys there's no such thing as "empty data", and removing all data will also remove fallback logic (and data). I think including just `und` could be a good middle ground.
Without the region-select option it's tricky to include exactly one data struct.
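The proposed v2 data-size benchmark (all data minus no data, or minus `und`-only data) could be computed as a simple difference between two builds. A minimal sketch; the helper name and the reduced-data build path are hypothetical, since producing a no-data or `und`-only build is exactly the open question:

```shell
# Hypothetical v2 "data size" metric: the cost of the compiled data is the
# size difference between a full-data build and a reduced-data build.
data_size() {
  # $1 = wasm built with all compiled data
  # $2 = wasm built with no data (or und-only data)
  echo $(( $(wc -c < "$1") - $(wc -c < "$2") ))
}

# Intended use (the second path is hypothetical):
#   data_size wasmpkg/wasm-opt/work_log+opt.wasm wasmpkg/und-only/work_log+opt.wasm
```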
@sffc to investigate and implement fix if necessary.
The binsize benchmark shows a regression in the datetime code size between these two dates. It was offline during that time, so it's not entirely clear what the culprit commits were. I imagine that a lot of this is due to calendrical calculations, but we should know for sure so that we can better understand and plan out mitigations.
First step would be to bisect the time between these commits (cfe1f71 and ddb1a7b) and find the commits that most contributed to the spike.
Also, these tests use `TypedZonedDateTimeFormatter::<Gregorian>`, so in theory they shouldn't be impacted by the calendrical calculation code.