sffc opened 1 year ago
Note: the regressions are in `work_log+opt.wasm` and `tui+opt.wasm`.
First step is to bisect.
Once bisect is finished, we should add this back to the triage list.
Command line:

```shell
rm -r wasmpkg; cargo make wasm-opt; ll wasmpkg/wasm-opt/work_log+opt.wasm; ll wasmpkg/wasm-opt/tui+opt.wasm
```
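Since several ranges of commits fail to build, the bisection could be driven by `git bisect run`, which treats exit code 125 as "skip this commit". A minimal sketch, where the `size_ok` helper name and the 200000-byte threshold are my own assumptions, chosen between the observed pre- and post-regression sizes:

```shell
# Hypothetical size check for use with `git bisect run`: succeeds ("good")
# while the artifact is below the threshold, fails ("bad") once it regresses.
size_ok() {
  # $1 = path to artifact, $2 = threshold in bytes
  [ "$(wc -c < "$1")" -lt "$2" ]
}

# Intended bisect script (not run here; the build is the expensive part):
#   rm -rf wasmpkg
#   cargo make wasm-opt || exit 125   # 125 = skip commits that fail to build
#   size_ok wasmpkg/wasm-opt/work_log+opt.wasm 200000
```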
Some commits failed to build. Commits from around 2023-02 failed with:

```
error: failed to run custom build command for `sys-info v0.9.1`

Caused by:
  process didn't exit successfully: `/.../icu4x/target/release-opt-size/build/sys-info-62b5cf4053fe346f/build-script-build` (exit status: 101)
  --- stderr
  thread 'main' panicked at 'unsupported system: unknown', /.../.cargo/registry/src/github.com-1ecc6299db9ec823/sys-info-0.9.1/build.rs:38:14
```
Other commits from 2023-04 to 2023-08, starting with ca28250f7344c9d2e7dcd7ae134a52503d9f50a1 and fixed in c8c91262f29f02bde9fd181af6041a2fc9fa7523, failed with:

```
   Compiling criterion v0.4.0
error: Rayon cannot be used when targeting wasi32. Try disabling default features.
  --> /.../.cargo/registry/src/github.com-1ecc6299db9ec823/criterion-0.4.0/src/lib.rs:31:1
   |
31 | compile_error!("Rayon cannot be used when targeting wasi32. Try disabling default features.");
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```
Bisection results from the commits that built out of the box, in reverse chronological order:
```
381010911bfe11a56b49edb55962560e7f0155ca
-rw-r--r-- 1 sffc primarygroup  378730 Oct 9 17:24 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 3183827 Oct 9 17:24 wasmpkg/wasm-opt/tui+opt.wasm

13b6b4e9208c3a89724c731f68180ae79423831f
-rw-r--r-- 1 sffc primarygroup  358905 Oct 9 17:40 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 3151534 Oct 9 17:40 wasmpkg/wasm-opt/tui+opt.wasm

ea1ba9f3e02cac6bbe01c038268d87379d80a98f
-rw-r--r-- 1 sffc primarygroup  356957 Oct 9 17:57 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 3154623 Oct 9 17:57 wasmpkg/wasm-opt/tui+opt.wasm

c8c91262f29f02bde9fd181af6041a2fc9fa7523
-rw-r--r-- 1 sffc primarygroup  366215 Oct 9 18:03 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 3165598 Oct 9 18:03 wasmpkg/wasm-opt/tui+opt.wasm
```

(long period of silence due to a bad lockfile)

```
eb79e5f67a6129430a1a8566c9efa86debb76c78
-rw-r--r-- 1 sffc primarygroup  119364 Oct 9 17:53 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup  513517 Oct 9 17:53 wasmpkg/wasm-opt/tui+opt.wasm

f6bfe8bc51dbc1c9efe50bbfd4c5757f92a898ec
-rw-r--r-- 1 sffc primarygroup  119418 Oct 9 17:47 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup  520134 Oct 9 17:47 wasmpkg/wasm-opt/tui+opt.wasm

cfe1f7138da5ece82a69e63c35dd73e2213bc659
-rw-r--r-- 1 sffc primarygroup  118988 Oct 9 17:28 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup  519593 Oct 9 17:28 wasmpkg/wasm-opt/tui+opt.wasm
```
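Quantifying the jump between the last small build (eb79e5f6) and the first large one (c8c91262), using the sizes from the listings above:

```shell
# Sizes in bytes copied from the `ll` listings above
# (eb79e5f6 before the jump, c8c91262 after it).
awk 'BEGIN {
  printf "work_log: %d -> %d (%.1fx)\n", 119364, 366215, 366215 / 119364
  printf "tui:      %d -> %d (%.1fx)\n", 513517, 3165598, 3165598 / 513517
}'
```

So work_log roughly tripled and tui grew more than sixfold across the gap.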
So the regression was during the summer, sometime between 2023-04 and 2023-08.
I will try using a proxy metric: the Linux example binary size with the `release-opt-size` profile:

```shell
cargo +nightly-2022-12-26 build --profile release-opt-size --examples --workspace --features serde --exclude icu_datagen --exclude icu_capi; ll target/release-opt-size/examples/work_log; ll target/release-opt-size/examples/tui
```
```
c8c91262f29f02bde9fd181af6041a2fc9fa7523
-rwxr-xr-x 2 sffc primarygroup 3521488 Oct 9 18:07 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 6749624 Oct 9 18:07 target/release-opt-size/examples/tui*

4ed5dbd3b1242315971eccf8a83b11e5a0509ee8
-rwxr-xr-x 2 sffc primarygroup 3529376 Oct 9 18:13 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 6754240 Oct 9 18:13 target/release-opt-size/examples/tui*
```
(In this period there is an error while building the examples; I don't know how that could have landed. Perhaps a feature resolution issue.)
```
7f10b155cffcd1bf696a0c71bf128b92883357ae
-rwxr-xr-x 2 sffc primarygroup 1960928 Oct 9 18:20 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2409400 Oct 9 18:20 target/release-opt-size/examples/tui*

004581fef4fc302dda43ac4123270b9070c61541
-rwxr-xr-x 2 sffc primarygroup 1949864 Oct 9 18:20 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2396512 Oct 9 18:20 target/release-opt-size/examples/tui*

09ad8b7dd898d5023c6eda35cf65e3357b4debc1
-rwxr-xr-x 2 sffc primarygroup 1945776 Oct 9 18:18 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2388376 Oct 9 18:18 target/release-opt-size/examples/tui*

244a4990b329f9b8cd5678bb8affe71ce0238b16
-rwxr-xr-x 2 sffc primarygroup 1945776 Oct 9 18:14 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2388376 Oct 9 18:14 target/release-opt-size/examples/tui*

d8cc2f3c8a289e4c47192af7df7e3daa30454986
-rwxr-xr-x 2 sffc primarygroup 1957376 Oct 9 18:12 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2387232 Oct 9 18:12 target/release-opt-size/examples/tui*

eb79e5f67a6129430a1a8566c9efa86debb76c78
-rwxr-xr-x 2 sffc primarygroup 1963344 Oct 9 18:10 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2398008 Oct 9 18:10 target/release-opt-size/examples/tui*
```
I did some manual work to get the intermediate broken commits to build. Doing so, I found that for `work_log`, there are two commits that together account for almost the entire regression:

- icu_datetime (#3591)

In other words, the examples now use compiled data, whereas before they used buffer provider data from a buffer that was not included as part of the measurement. Mystery solved.
Now, the follow-up question is, do we want to continue measuring the wasm binsize with data included, or do we want to measure it without data? If we measure it with data, we're measuring something that "actually works", whereas if we measure it without data, we can detect regressions in code size at a much more granular level.
Ah of course.
If we call these code size benchmarks we should use empty data. We could have a data size benchmark v2 that is the difference between all data and no data.
Actually, for singleton keys there's no such thing as "empty data", and removing all data will also remove fallback logic (and data). I think including just `und` could be a good middle ground.
Without the region-select option it's tricky to include exactly one data struct.
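The proposed v2 data-size benchmark (all data minus no data, or minus `und`-only data) could be computed as a simple difference between two builds. A minimal sketch; the helper name and the reduced-data build path are hypothetical, since producing a no-data or `und`-only build is exactly the open question:

```shell
# Hypothetical v2 "data size" metric: the cost of the compiled data is the
# size difference between a full-data build and a reduced-data build.
data_size() {
  # $1 = wasm built with all compiled data
  # $2 = wasm built with no data (or und-only data)
  echo $(( $(wc -c < "$1") - $(wc -c < "$2") ))
}

# Intended use (the second path is hypothetical):
#   data_size wasmpkg/wasm-opt/work_log+opt.wasm wasmpkg/und-only/work_log+opt.wasm
```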
@sffc to investigate and implement fix if necessary.
The binsize benchmark shows a regression in the datetime code size between these two dates. It was offline during that time, so it's not entirely clear what the culprit commits were. I imagine that a lot of this is due to calendrical calculations, but we should know for sure so that we can better understand and plan out mitigations.
First step would be to bisect the time between these commits (cfe1f71 and ddb1a7b) and find the commits that most contributed to the spike.
Also, these tests use `TypedZonedDateTimeFormatter::<Gregorian>`, so in theory they shouldn't be impacted by the calendrical calculation code.