ocsigen / js_of_ocaml

Compiler from OCaml to Javascript.
http://ocsigen.org/js_of_ocaml/

[BUG] js_of_ocaml is excessively memory hungry #1612

Closed JasonGross closed 2 weeks ago

JasonGross commented 2 months ago

Describe the bug

js_of_ocaml is very cool! I use it on CI to generate a webpage. However, I cannot use it on the new GitHub Actions arm64 macOS boxes, which have only 7 GB of RAM, because it sometimes eats 8--9 GB of RAM to generate a single .js file. For example, here is a table of build times and memory usages on Linux:

     Time |   Peak Mem | File Name                                         
---------------------------------------------------------------------------
15m11.31s | 8904232 ko | Total Time / Peak Mem                             
---------------------------------------------------------------------------
 4m46.85s | 8808932 ko | ExtractionJsOfOCaml/bedrock2_fiat_crypto.js       
 4m37.20s | 7151532 ko | ExtractionJsOfOCaml/fiat_crypto.js                
 4m30.42s | 8904232 ko | ExtractionJsOfOCaml/with_bedrock2_fiat_crypto.js  
 0m25.80s | 3195952 ko | ExtractionJsOfOCaml/with_bedrock2_fiat_crypto.byte
 0m25.32s | 3193600 ko | ExtractionJsOfOCaml/bedrock2_fiat_crypto.byte     
 0m24.60s | 2814364 ko | ExtractionJsOfOCaml/fiat_crypto.byte              
 0m00.38s |  102780 ko | ExtractionJsOfOCaml/bedrock2_fiat_crypto.cmi      
 0m00.37s |   98468 ko | ExtractionJsOfOCaml/fiat_crypto.cmi               
 0m00.37s |  102360 ko | ExtractionJsOfOCaml/with_bedrock2_fiat_crypto.cmi 

It is similar on macOS, and a bit better on Debian sid.
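
To relate the table to the 7 GB limit, the peak figures can be converted to GiB. This is a minimal sketch, assuming that "ko" means kilo-octets of 1024 bytes each and that the runner's 7 GB can be treated as 7 GiB; the names `PEAK_KO`, `ko_to_gib` and `over_budget` are introduced here for illustration only.

```python
# Peak memory of the three .js targets, copied from the table above.
# Assumption: "ko" is kilo-octets, taken here as units of 1024 bytes.
PEAK_KO = {
    "bedrock2_fiat_crypto.js": 8808932,
    "fiat_crypto.js": 7151532,
    "with_bedrock2_fiat_crypto.js": 8904232,
}

# Assumption: the runner's "7 GB" is treated as 7 GiB.
BUDGET_GIB = 7.0


def ko_to_gib(ko: int) -> float:
    """Convert a figure in ko (1024-byte units) to GiB."""
    return ko * 1024 / 2**30


def over_budget(peaks: dict, budget_gib: float = BUDGET_GIB) -> dict:
    """Return the targets whose peak memory exceeds the budget, in GiB."""
    return {
        name: ko_to_gib(ko)
        for name, ko in peaks.items()
        if ko_to_gib(ko) > budget_gib
    }


if __name__ == "__main__":
    for name, gib in over_budget(PEAK_KO).items():
        print(f"{name}: {gib:.2f} GiB exceeds {BUDGET_GIB:.0f} GiB")
```

Under these assumptions the two bedrock2 targets exceed the budget while fiat_crypto.js lands just under it; with a decimal reading of "7 GB" the margin would be different.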

I invoke it with --source-map --no-inline --enable=effects and invoke the compiler with -package js_of_ocaml -package unix -w -20 -g

For the near future (until artifacts expire), the build artifacts page contains generated .js files (fiat-html-js-of-ocaml), .ml source files (ExtractionJsOfOCaml-source-master), and compiled files (ExtractionJsOfOCaml-master-ocaml-4.11.1).

Expected behavior

I expect there to be a way to make the js_of_ocaml pipeline fit in under 7 GB of RAM, possibly with a flag, if necessary.

Versions

js_of_ocaml 5.7.2, ocaml 4.11.1

hhugo commented 2 months ago

With the change mentioned, I get the following for ExtractionJsOfOCaml/bedrock2_fiat_crypto.js:

112.97user 1.26system 1:54.24elapsed 99%CPU (0avgtext+0avgdata 4581104maxresident)k
0inputs+21904outputs (0major+1275732minor)pagefaults 0swaps

hhugo commented 2 months ago

With #1614, one no longer needs to disable globaldeadcode, at the cost of extra memory usage.

78.16user 1.85system 1:20.13elapsed 99%CPU (0avgtext+0avgdata 6034576maxresident)k
0inputs+19696outputs (0major+1631738minor)pagefaults 0swaps

I'll try to improve #1614

hhugo commented 2 months ago

I've updated #1614, I now get

73.86user 1.00system 1:14.87elapsed 99%CPU (0avgtext+0avgdata 4581396maxresident)k
0inputs+19696outputs (0major+1209270minor)pagefaults 0swaps

I still need to investigate the sourcemap issue.

Can you test #1614 and confirm it solves part of your issue?
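
The three timing reports above are easier to compare once the user time and peak resident set size are pulled out of each line. The following is a rough sketch, assuming the lines come from GNU time's default output format and that `maxresident` is reported in KiB; the run labels and helper names are made up for this example.

```python
import re

# First line of each GNU time report quoted above in this thread.
RUNS = {
    "first report": "112.97user 1.26system 1:54.24elapsed 99%CPU (0avgtext+0avgdata 4581104maxresident)k",
    "#1614 (initial)": "78.16user 1.85system 1:20.13elapsed 99%CPU (0avgtext+0avgdata 6034576maxresident)k",
    "#1614 (updated)": "73.86user 1.00system 1:14.87elapsed 99%CPU (0avgtext+0avgdata 4581396maxresident)k",
}

# Captures the user CPU time and the peak resident set size.
TIME_LINE = re.compile(r"(?P<user>[\d.]+)user .*?(?P<rss>\d+)maxresident\)k")


def parse_time_line(line: str) -> tuple:
    """Return (user_seconds, peak_kib) from one GNU time summary line."""
    m = TIME_LINE.search(line)
    if m is None:
        raise ValueError(f"unrecognized time output: {line!r}")
    return float(m["user"]), int(m["rss"])


def kib_to_gib(kib: int) -> float:
    """Convert KiB to GiB (assumes maxresident is in KiB)."""
    return kib / 2**20


if __name__ == "__main__":
    for label, line in RUNS.items():
        user_s, peak_kib = parse_time_line(line)
        print(f"{label}: {user_s:.2f}s user, {kib_to_gib(peak_kib):.2f} GiB peak")
```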

OlivierNicole commented 1 month ago

@hhugo Regarding this, I have been working on the sourcemap slowness and will open a PR today.

hhugo commented 1 month ago

@JasonGross, any luck with #1614?

JasonGross commented 1 month ago

I have not had a chance to try it, but if it works on your end, I don't see why it would be any different on GitHub Actions. (The files I linked to are the ones I actually use, not simplified examples.) But I can set up GHA to use the PR. Should I just clone the repo and opam pin add . on that branch?

OlivierNicole commented 1 month ago

#1617 may help too if the memory consumption happens to be in sourcemaps. I find that it nearly halves the peak memory usage when linking JSOO using itself.

hhugo commented 1 month ago

@JasonGross, I think you can just do

opam pin add js_of_ocaml-compiler https://github.com/ocsigen/js_of_ocaml.git#speedup

JasonGross commented 1 month ago

I've set it up on CI:

- pre (specifically here)
- post speedup (mit-plv/fiat-crypto#1922) (#1614) (still in progress)
- post optim_sourcemap_link (mit-plv/fiat-crypto#1923) (#1617) (still in progress)

OlivierNicole commented 1 month ago

It looks like both #1614 and #1617 make the run time and the peak memory usage worse. I haven’t worked on #1614, but that surprises me a lot in the case of #1617. Are these tests runnable on Linux? I may try to inspect them locally.

hhugo commented 1 month ago

Not all CI jobs have been updated.

Here is what I see for #1614

1m28.92s | 4461616 ko | ExtractionJsOfOCaml/with_bedrock2_fiat_crypto.js

hhugo commented 1 month ago

And for #1617

4m47.42s | 5253024 ko | ExtractionJsOfOCaml/with_bedrock2_fiat_crypto.js

hhugo commented 1 month ago

compared to

6m08.88s | 7720996 ko | ExtractionJsOfOCaml/with_bedrock2_fiat_crypto.js
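
The three rows reported above for with_bedrock2_fiat_crypto.js can be compared as relative changes. This is only a sketch, assuming the "compared to" row is the run without either PR and that the rows come from single CI runs; `parse_row`, `relative_change` and `compare` are hypothetical helpers.

```python
import re

# Rows copied from the comments above.
BASELINE = "6m08.88s | 7720996 ko | ExtractionJsOfOCaml/with_bedrock2_fiat_crypto.js"
CANDIDATES = {
    "#1614": "1m28.92s | 4461616 ko | ExtractionJsOfOCaml/with_bedrock2_fiat_crypto.js",
    "#1617": "4m47.42s | 5253024 ko | ExtractionJsOfOCaml/with_bedrock2_fiat_crypto.js",
}

# Matches "<minutes>m<seconds>s | <peak> ko |" at the start of a row.
ROW = re.compile(r"\s*(\d+)m([\d.]+)s\s*\|\s*(\d+)\s*ko\s*\|")


def parse_row(row: str) -> tuple:
    """Return (wall_seconds, peak_ko) for one table row."""
    m = ROW.match(row)
    if m is None:
        raise ValueError(f"unrecognized row: {row!r}")
    minutes, seconds, ko = m.groups()
    return int(minutes) * 60 + float(seconds), int(ko)


def relative_change(new: float, old: float) -> float:
    """Signed fractional change from old to new (negative means lower)."""
    return (new - old) / old


def compare(baseline: str, candidates: dict) -> dict:
    """Map each label to (time_change, memory_change) against the baseline."""
    base_t, base_m = parse_row(baseline)
    result = {}
    for label, row in candidates.items():
        t, m = parse_row(row)
        result[label] = (relative_change(t, base_t), relative_change(m, base_m))
    return result


if __name__ == "__main__":
    for label, (dt, dm) in compare(BASELINE, CANDIDATES).items():
        print(f"{label}: time {dt:+.1%}, peak memory {dm:+.1%}")
```

Any such comparison inherits the run-to-run noise of the CI machines, so the percentages are indicative at best.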

hhugo commented 1 month ago

@OlivierNicole, I would expect your PR to only affect separate compilation during the link step, but I don't think separate compilation is involved here. What part of your PR would improve the situation during whole program compilation?

OlivierNicole commented 1 month ago

> Not all CI jobs have been updated.
>
> Here is what I see for #1614
>
> 1m28.92s | 4461616 ko | ExtractionJsOfOCaml/with_bedrock2_fiat_crypto.js

I didn’t quite follow which of the many jobs to inspect to find the info, but I trust that your numbers are right.

> @OlivierNicole, I would expect your PR to only affect separate compilation during the link step, but I don't think separate compilation is involved here. What part of your PR would improve the situation during whole program compilation?

I’m honestly not sure. Looking into it.

OlivierNicole commented 1 month ago

I switched to `Stringlit and Yojson.Raw (rather than `String and Yojson.Basic) because it saves a non-negligible amount of time on parsing and writing the mappings fields of source maps. Essentially, Yojson.Basic.to_string (`String s) checks for special characters or Unicode code points in the string, which takes a surprising amount of time and is unnecessary on mappings, since they contain only base64-encoded numbers, commas and semicolons.

I’m not sure it explains it all though. Trying to profile locally.
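
The reasoning about skipping the escape check can be illustrated outside of OCaml. The sketch below is a Python analogy, not the Yojson or js_of_ocaml code: `json.dumps` stands in for a serializer that scans for characters to escape, and `quote_raw`, `is_mappings_safe` and the sample string are hypothetical. It shows that for a string made only of base64 characters, commas and semicolons, plain quoting already gives the same JSON text.

```python
import json
import string

# Characters assumed to occur in a source map "mappings" field:
# the base64 alphabet plus the ',' and ';' separators.
MAPPINGS_ALPHABET = frozenset(string.ascii_letters + string.digits + "+/,;")


def is_mappings_safe(s: str) -> bool:
    """True if s contains only characters from the mappings alphabet."""
    return all(c in MAPPINGS_ALPHABET for c in s)


def quote_raw(s: str) -> str:
    """Wrap s in double quotes without scanning for characters to escape."""
    return '"' + s + '"'


# Hypothetical mappings fragment, used only for illustration.
sample = "AAAA,SAASA;AACL,IAAI"

# For mappings-like strings the escaping pass changes nothing,
# so the cheap quoting and the full serializer agree.
assert is_mappings_safe(sample)
assert quote_raw(sample) == json.dumps(sample)

# For arbitrary strings the shortcut is not valid: a double quote
# must be escaped, and raw quoting would produce broken JSON.
assert not is_mappings_safe('a"b')
assert quote_raw('a"b') != json.dumps('a"b')
```

The shortcut is only sound because the content of the field is known in advance; it is not a general replacement for escaping.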

JasonGross commented 1 month ago

> Are these tests runnable on Linux?

Yes. The cheapest way to run them is to download any of the artifacts labeled ExtractionJsOfOCaml-source* from our CI. These artifacts contain a handful of self-contained .ml files, the ones that we want to turn into .js files. I gave the flags I use in the initial post.

The expensive way to run the tests is to clone the repo, do opam install coq, and then run something like make js-of-ocaml

OlivierNicole commented 1 month ago

I can’t reproduce a significant difference in run time or profile between master and #1617. The time spent on source maps is negligible compared to the time spent optimizing. I’m starting to suspect that the CI run times have a huge variance.

OlivierNicole commented 1 month ago

P.S. I’ve done the test on with_bedrock2_fiat_crypto. Peak memory usage is not significantly affected, either.

hhugo commented 2 weeks ago

#1614 has been merged. Reopen if you still have issues.