Closed amesgen closed 1 year ago
The GHC ticket's MR already produced the wasm32-wasi-ghc bindist at https://gitlab.haskell.org/ghc/ghc/-/jobs/1358898/artifacts/raw/ghc-x86_64-linux-alpine3_12-cross_wasm32-wasi-release+fully_static.tar.xz. So you may want to give it a try and add a `zeroFreeList()` call after `hs_perform_gc()`. I expect the `wasm-opt`-shrunk size will be much improved, and if that's not the case, that'll be valuable input to my MR to further look for places to zero out.
Results with that bindist (with `brotli -5`, which seems to roughly correspond to Netlify brotli):

- without `-H64m`: 29 M (6.1 M)
- with `-H64m`: 62 M (14 M)

So the first case improves noticeably, but the second one only by a much smaller margin.
Thanks. My guess is that `-H64m` without `--nonmoving-gc` will not have the same bloat; in any case, the MR needs extra work to zero out the unused parts in the nonmoving allocator as well.

Stats without `--nonmoving-gc`:

- without `-H64m`: 26 M (5.8 M)
- with `-H64m`: 62 M (15 M)

The GHC MR is undrafted and should contain the proper version of `rts_zeroMemory` now. The artifact is at https://gitlab.haskell.org/ghc/ghc/-/jobs/1361884/artifacts/file/ghc-x86_64-linux-alpine3_12-cross_wasm32-wasi-release+fully_static.tar.xz.
New results:

- `--nonmoving-gc`:
  - without `-H64m`: 25.16 M (5.03 M)
  - with `-H64m`: 22.89 M (4.29 M)
- `--nonmoving-gc`:
  - without `-H64m`: 22.98 M (4.70 M)
  - with `-H64m`: 22.31 M (4.72 M)

:tada:, surprising that `-H64m` actually makes the binary smaller in the first case.
There's a bit of bad news in the upstream GHC patch: I've marked the MR as draft for now, since I managed to construct a test case that still results in size bloat when nonmoving GC is enabled. So it'll take more time to get there. Will ping again when the upstream patch is finished and lands.
Good news: the GHC patch is in its final form, works just fine for nonmoving GC, and is awaiting review from other GHC core devs.
The GHC patch has been merged, and the bindists landed in the `ghc-wasm-meta` update!

- `rts_clearMemory()`
- `hs_perform_gc()` twice, as explained in the latest tutorial
- It's strongly recommended to fully evaluate `packageToOps`, `packageToPopularity` in `initFixityDB`, then zero & free the buffer
- `cabal` update, the `cabal list-bin` command will now provide the correct executable with `.wasm` extension, no need to add the extension manually in the build script

Thanks, updated! The binary sizes seem to be identical to https://github.com/tweag/ormolu/pull/991#issuecomment-1423306991
> It's strongly recommended to fully evaluate `packageToOps`, `packageToPopularity` in `initFixityDB`, then zero & free the buffer

Is this about preventing the buffer from being retained? I slightly changed/streamlined how `initFixityDB` works (no "pointer+len in env var" ugliness necessary anymore); maybe that is already enough. I played around with using `deepseq` (diff), which reduced the binary size by <100KB, which is way less than the buffer size.
> Is this about preventing the buffer from being retained?

That's true for the original implementation. The current one does not retain the buffer anymore, so that won't be an issue. That being said, it's still recommended to do `deepseq`, the more work is done at pre-init time, the faster it shall be at runtime :)
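
As a minimal sketch of what "fully evaluating at pre-init time" can look like: the `FixityDB` type, `parseFixityDB`, and `initFixityDBSketch` below are hypothetical stand-ins for illustration, not Ormolu's actual `initFixityDB`. The only real library pieces are `force` from `deepseq` and `evaluate` from `base`, which together make sure the whole map is built before the action returns, so no unevaluated parsing work is left for the first lookup.

```haskell
import Control.DeepSeq (force)
import Control.Exception (evaluate)
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for the fixity DB: operator name -> precedence.
type FixityDB = Map.Map String Int

-- Hypothetical parser: each input line is "<operator> <precedence>".
-- Lines that do not have exactly two words are skipped.
parseFixityDB :: String -> FixityDB
parseFixityDB raw =
  Map.fromList [(op, read prec) | [op, prec] <- map words (lines raw)]

-- Parse the DB and evaluate it to normal form before returning.
-- `force` describes the deep evaluation; `evaluate` performs it at
-- this point in the IO sequence instead of at first use.
initFixityDBSketch :: String -> IO FixityDB
initFixityDBSketch raw = evaluate (force (parseFixityDB raw))
```

With this shape, any cost (or parse error) shows up when `initFixityDBSketch` runs, which is the step one would want to happen during pre-initialization rather than on the first formatting request.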
> That being said, it's still recommended to do `deepseq`, the more work is done at pre-init time, the faster it shall be at runtime :)

Done :+1:
I updated the PR description and squashed the commits; this is now ready to be merged.
This uses Wizer to pre-initialize Ormolu Live WASM (see the section "Using `wizer` to pre-initialize a WASI reactor module" in the ghc-wasm-meta README), including the parsing of the fixity DB.

This indeed improves the time it takes to format code with unusual operators for the first time, i.e. if you enter `1 +++++++` both in the current Ormolu Live and the one in this PR (see https://github.com/tweag/ormolu/pull/991#issuecomment-1421285974) and then type another `1` to make this a valid expression, the version from this PR is instant, while the one from `master` has a noticeable delay (but is still pretty fast).

OTOH, this slightly increases binary sizes (all sizes are with `wasm-opt -Oz` and, in parentheses, Netlify brotli compression (which is not `--best`)):

- `master`: 18.32 MB (3.35 MB)

For fairness, one has to consider that the fixity DB is not included in the `master` version, which weighs 1.29 MB (0.22 MB) in the binary format. So we don't get an improvement, but also no significant regression.
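
The pre-initialization idea from the description can be sketched roughly as follows. This is only an illustration under assumptions: `operatorTable`, `preInitialize`, and `lookupPrecedence` are hypothetical names, not Ormolu Live's actual code, and exporting the init function so that Wizer can call it (which needs the FFI setup described in the ghc-wasm-meta README) is left out. The point is the split: the expensive work runs once in an init step, whose resulting memory Wizer would snapshot, and later calls only read the prepared state.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import qualified Data.Map.Strict as Map
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical global slot for state built during pre-initialization.
-- NOINLINE keeps this a single shared IORef.
{-# NOINLINE operatorTable #-}
operatorTable :: IORef (Maybe (Map.Map String Int))
operatorTable = unsafePerformIO (newIORef Nothing)

-- Hypothetical init step: do the expensive work once and store the
-- result. `$!` forces the map to be built here, not at first lookup.
preInitialize :: [(String, Int)] -> IO ()
preInitialize entries =
  writeIORef operatorTable (Just $! Map.fromList entries)

-- Hypothetical request handler: only reads the prepared state.
-- Returns Nothing if pre-initialization has not run or the
-- operator is unknown.
lookupPrecedence :: String -> IO (Maybe Int)
lookupPrecedence op = do
  table <- readIORef operatorTable
  pure (table >>= Map.lookup op)
```

In a setup like the one in this PR, the equivalent of `preInitialize` would run at build time under Wizer, so that the first real request already finds the table populated.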