kylebarron / parquet-wasm

Rust-based WebAssembly bindings to read and write Apache Parquet data
https://kylebarron.dev/parquet-wasm/
Apache License 2.0
483 stars 19 forks source link

CI production build size summary #401

Closed H-Plus-Time closed 7 months ago

H-Plus-Time commented 7 months ago

I would have liked to do a head vs target % change (apparently there's a way involving artifacts), but this is a sufficiently useful gauge. Presumably the artifacts required for that would also be used in package publishing.

As far as I can tell, it's only the JS bindings that differ between targets, so we can probably just do pkg/esm/*.wasm and be done with it.

H-Plus-Time commented 7 months ago

Top level of the workflow for the PR/commit (disappointingly, 3 clicks from a PR (show all checks, pick anything, hit summary top left)) - this one's at https://github.com/kylebarron/parquet-wasm/actions/runs/7068033091. There's probably/hopefully a Github app that surfaces it in the PR (like Dependabot) :-/.

Also, pv apparently uses base-2 SI prefixes (about 5% deviation at the MiB vs MB level) 😞

H-Plus-Time commented 7 months ago

Excellent, found one, adding it now.

kylebarron commented 7 months ago

are the failing runs expected?

H-Plus-Time commented 7 months ago

Yeah, the safeguards in Github Actions prevented the comment creation from going through - I've added a workflow_run indirection (described at https://securitylab.github.com/research/github-actions-preventing-pwn-requests/, I figure this particular bit of CI is largely set and forget, so the added hoops are fine).

The comment workflow won't take effect until it's merged, this dummy PR demonstrates what it should look like (and the comment upsert behaviour): https://github.com/H-Plus-Time/parquet-wasm/pull/2

kylebarron commented 7 months ago

That looks great! I think one additional improvement that would be great is having sizes for a couple different build options: i.e.

Then you'd also be able to see whether changes are affecting only the size in the async bundle, for example.

A delta with the previous known size would also be cool, but that sounds pretty hard, and it's not too hard to find the most recent PR.

I'm happy to merge this as is though, what do you think?

H-Plus-Time commented 7 months ago

:thinking: yeah, though it looks like we'd have to use a synthetic full target (the one in cargo.toml breaks with symbol collisions, presumably arrow1, arrow2 in the same bundle).

I ended up just switching away from scripts/build.sh (one platform target, multiple feature flags is relevant for reporting, while multiple targets, fixed feature flags is relevant for publishing). Since it's a new script, I figured I'd just ignore the arrow2 build as well (trivial to add it anyway, if we still want to keep an eye on build size for the duration of the deprecation window).

A delta with the previous known size would also be cool, but that sounds pretty hard, and it's not too hard to find the most recent PR.

It looks like the deltas are fully functional (and correct) too :tada: .

Also, apparently, closing and re-opening PRs (so long as they still have un-merged changes) is sufficient to trigger fresh workflows (with the results based on the target branch), so that's all we have to do with the motivating example (#400 ) to see this in action.

H-Plus-Time commented 7 months ago

The comment output from the equivalent PR in my fork:

Asset Sizes

AssetUncompressed SizeCompressed Size
async_full/arrow1_bg.wasm5.21MB $\color{red}\text{+174KB +3\%}$1.22MB $\color{red}\text{+62.3KB +5\%}$
slim/arrow1_bg.wasm4.6MB $\color{red}\text{+12.2KB +0\%}$1.04MB $\color{green}\text{-2.12KB -0\%}$
sync/arrow1_bg.wasm4.6MB $\color{red}\text{+12.2KB +0\%}$1.04MB $\color{green}\text{-2.12KB -0\%}$
kylebarron commented 7 months ago

arrow2 hasn't gotten a commit in a month and at this point I consider it on its way to dying, sadly. So I'm content with deprecating and removing arrow2 as soon as the arrow1 bindings have a superset of functionality

kylebarron commented 7 months ago

Thanks again!

kylebarron commented 7 months ago

Hmm looks like this failed on my PR . https://github.com/kylebarron/parquet-wasm/actions/runs/7215812740/job/19660787302

image

Maybe that's because there wasn't a previous summary to download from?