ocsigen / js_of_ocaml

Compiler from OCaml to Javascript.
http://ocsigen.org/js_of_ocaml/

Optimize sourcemap processing #1617

Open OlivierNicole opened 4 months ago

OlivierNicole commented 4 months ago

This improves the efficiency of sourcemap processing, chiefly during linking, but also in serializing and deserializing sourcemaps.

Currently, when linking with source maps enabled, js_of_ocaml fully decodes the source map of each input file into an explicit in-memory mapping, merges these mappings, and then re-encodes the merged sourcemap. The merging algorithm involves superlinear sorting operations. However, the sourcemap format makes it possible to avoid sorting when generated files are simply concatenated.

In addition, all the operations performed during linking (concatenating files, removing and adding lines) can be applied directly to the encoded form, which also avoids fully parsing each sourcemap into an in-memory representation.
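As a rough illustration of working on the encoded form: in the source map v3 format, the mappings string separates generated lines with ';', and only the generated-column field is reset at each line, so adding or removing unmapped lines does not require decoding any VLQ segment. The sketch below assumes a hypothetical edit type (it is not the PR's actual Line_edits API) and refuses to drop a line that carries segments, since that would require rebasing the relative fields of the following segment.

```ocaml
(* Hypothetical edit type, for illustration only. *)
type edit =
  | Keep (* copy one generated line and its segments unchanged *)
  | Drop (* remove one generated line; only handled when it is unmapped *)
  | Add  (* insert one new, unmapped generated line *)

(* Apply per-line edits to an encoded "mappings" string without decoding
   any VLQ segment. Lines left over once the edits run out are kept. *)
let apply_edits (edits : edit list) (mappings : string) : string =
  let lines = String.split_on_char ';' mappings in
  let rec go edits lines acc =
    match edits, lines with
    | [], rest -> List.rev_append acc rest
    | Add :: es, _ -> go es lines ("" :: acc)
    | Keep :: es, l :: ls -> go es ls (l :: acc)
    | Drop :: es, "" :: ls -> go es ls acc
    | Drop :: _, _ :: _ ->
      invalid_arg "apply_edits: dropping a mapped line needs delta rebasing"
    | (Keep | Drop) :: _, [] -> invalid_arg "apply_edits: more edits than lines"
  in
  String.concat ";" (go edits lines [])
```

For example, apply_edits [Add; Keep; Drop; Keep] "AAAA;;CAAC" inserts a blank line at the top and removes the empty second line, giving ";AAAA;CAAC".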

When compiling js_of_ocaml with itself, the final linking step is about 2.3x faster with OCaml 5.2.0+fp:

recent trunk (bfbeb692c)    this PR
331.8 ms ± 6.1 ms           143.3 ms ± 1.1 ms

fix https://github.com/ocsigen/js_of_ocaml/issues/1446

OlivierNicole commented 4 months ago

I can’t seem to reproduce the CI failure:

File "toplevel/examples/lwt_toplevel/dune", line 2, characters 8-16:
2 |  (names toplevel)
            ^^^^^^^^
(cd _build/default/toplevel/examples/lwt_toplevel && ../../../compiler/bin-js_of_ocaml/js_of_ocaml.exe link --source-map-inline -o toplevel.bc.js .toplevel.eobjs/jsoo/toplevel.bc.runtime.js ../../../.js/!effects+toplevel/stdlib/stdlib.cma.js ../../../.js/!effects+toplevel/compiler-libs.common/ocamlcommon.cma.js ../../../.js/!effects+toplevel/compiler-libs.bytecomp/ocamlbytecomp.cma.js ../../../.js/!effects+toplevel/menhirLib/menhirLib.cma.js ../../../.js/!effects+toplevel/gen/gen.cma.js ../../../.js/!effects+toplevel/sedlex/sedlex.cma.js ../../../.js/!effects+toplevel/yojson/yojson.cma.js ../../../compiler/lib/.js_of_ocaml_compiler.objs/jsoo/!effects+toplevel/js_of_ocaml_compiler.cma.js ../../../lib/runtime/.jsoo_runtime.objs/jsoo/!effects+toplevel/jsoo_runtime.cma.js ../../../lib/js_of_ocaml/.js_of_ocaml.objs/jsoo/!effects+toplevel/js_of_ocaml.cma.js ../../../.js/!effects+toplevel/uutf/uutf.cma.js ../../../.js/!effects+toplevel/re/re.cma.js ../../../.js/!effects+toplevel/tyxml.functor/tyxml_f.cma.js ../../../.js/!effects+toplevel/tyxml/tyxml.cma.js ../../../.js/!effects+toplevel/react/react.cma.js ../../../.js/!effects+toplevel/reactiveData/reactiveData.cma.js ../../../lib/tyxml/.js_of_ocaml_tyxml.objs/jsoo/!effects+toplevel/js_of_ocaml_tyxml.cma.js ../../../compiler/lib-dynlink/.js_of_ocaml_compiler_dynlink.objs/jsoo/!effects+toplevel/js_of_ocaml_compiler_dynlink.cma.js ../../../.js/!effects+toplevel/compiler-libs.toplevel/ocamltoplevel.cma.js ../../lib/.js_of_ocaml_toplevel.objs/jsoo/!effects+toplevel/js_of_ocaml_toplevel.cma.js ../../../.js/!effects+toplevel/lwt/lwt.cma.js ../../../lib/lwt/.js_of_ocaml_lwt.objs/jsoo/!effects+toplevel/js_of_ocaml_lwt.cma.js ../../../.js/!effects+toplevel/graphics/graphics.cma.js ../../../lib/deriving_json/.js_of_ocaml_deriving.objs/jsoo/!effects+toplevel/js_of_ocaml_deriving.cma.js ../../../.js/!effects+toplevel/str/str.cma.js ../../../.js/!effects+toplevel/dynlink/dynlink.cma.js 
../../../lib/lwt/graphics/.graphics_js.objs/jsoo/!effects+toplevel/graphics_js.cma.js ../../../.js/!effects+toplevel/ocaml-compiler-libs.common/ocaml_common.cma.js ../../../.js/!effects+toplevel/ppxlib.astlib/astlib.cma.js ../../../.js/!effects+toplevel/stdlib-shims/stdlib_shims.cma.js ../../../.js/!effects+toplevel/ppxlib.ast/ppxlib_ast.cma.js ../../../.js/!effects+toplevel/ocaml-compiler-libs.shadow/ocaml_shadow.cma.js ../../../.js/!effects+toplevel/ppxlib.print_diff/ppxlib_print_diff.cma.js ../../../.js/!effects+toplevel/ppx_derivers/ppx_derivers.cma.js ../../../.js/!effects+toplevel/ppxlib.traverse_builtins/ppxlib_traverse_builtins.cma.js ../../../.js/!effects+toplevel/sexplib0/sexplib0.cma.js ../../../.js/!effects+toplevel/ppxlib.stdppx/stdppx.cma.js ../../../.js/!effects+toplevel/ppxlib/ppxlib.cma.js ../../../ppx/ppx_js/as-lib/.ppx_js.objs/jsoo/!effects+toplevel/ppx_js.cma.js ../../../ppx/ppx_js/lib/.ppx_js_rewriter.objs/jsoo/!effects+toplevel/ppx_js_rewriter.cma.js .toplevel.eobjs/jsoo/dune__exe.cmo.js .toplevel.eobjs/jsoo/dune__exe__B64.cmo.js .toplevel.eobjs/jsoo/dune__exe__Colorize.cmo.js .toplevel.eobjs/jsoo/dune__exe__Graphics_support.cmo.js .toplevel.eobjs/jsoo/dune__exe__Ocp_indent.cmo.js .toplevel.eobjs/jsoo/dune__exe__Indent.cmo.js .toplevel.eobjs/jsoo/dune__exe__Ppx_support.cmo.js .toplevel.eobjs/jsoo/dune__exe__Toplevel.cmo.js ../../../.js/!effects+toplevel/stdlib/std_exit.cmo.js --linkall)
../../../compiler/bin-js_of_ocaml/js_of_ocaml.exe: You found a bug. Please report it at https://github.com/ocsigen/js_of_ocaml/issues :
Error: File "compiler/lib/source_map_io.yojson.ml", line 109, characters 12-18: Assertion failed

This test builds fine on my machine.

Edit: never mind, it was an off-by-one error in an assertion.

OlivierNicole commented 4 months ago

One test is failing because reading “index maps” (composite source maps) is not supported for now—something that is not required for js_of_ocaml to function. I will fix this test soon.
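For readers unfamiliar with index maps: in the source map v3 format, an index map has a "sections" array in which each entry pairs an offset (line and column in the generated file) with an embedded regular source map. The following is a minimal sketch of how such a map could be assembled for concatenated units. The type and function names are hypothetical, and the embedded maps are assumed to be already serialized as JSON strings; this is not how js_of_ocaml implements it.

```ocaml
(* One section of an index map: where the embedded map starts in the
   generated file, plus the embedded map as already-serialized JSON. *)
type section = { line : int; column : int; map_json : string }

(* Given (line count, serialized map) for each unit, in output order,
   compute sections whose line offsets are the cumulative line counts. *)
let sections_of_units (units : (int * string) list) : section list =
  let _, rev =
    List.fold_left
      (fun (line, acc) (line_count, map_json) ->
        (line + line_count, { line; column = 0; map_json } :: acc))
      (0, []) units
  in
  List.rev rev

(* Serialize the index map. No escaping is needed here because the only
   string inserted is assumed to be valid JSON already. *)
let index_map_to_json (sections : section list) : string =
  let section_json s =
    Printf.sprintf {|{"offset":{"line":%d,"column":%d},"map":%s}|}
      s.line s.column s.map_json
  in
  Printf.sprintf {|{"version":3,"sections":[%s]}|}
    (String.concat "," (List.map section_json sections))
```

With two units of 2 and 3 lines, the second section gets the offset line 2, column 0, so no mapping inside either embedded map has to be touched.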

rickyvetter commented 4 months ago

I just tested this change on an incremental build for one of our largest executables and am getting this timing for js_of_ocaml link.

Pre: 2.469s Post: 1.301s

Very cool!

OlivierNicole commented 4 months ago

This is ready for review. I don’t quite understand the build problems on the 5.1 CI, but they don’t seem related to this PR.

OlivierNicole commented 3 months ago

My bad, it actually was due to a missing :standard in a flags field of a Dune stanza. The CI passes now.

OlivierNicole commented 2 months ago

Is anything blocking this PR still?

OlivierNicole commented 2 months ago

> My understanding is that the Line_edits change is what gives us most of the improvement. Could we achieve similar perf using sections (index sourcemap)?

I hoped so initially, but it turns out that Link_js may remove or add metadata comments in the middle of a compilation unit, which pretty much killed my hope of simply using index maps.

OlivierNicole commented 2 months ago

I’m trying to answer some of your comments, but my answers are labelled as “pending”, and I’m not sure why:

[screenshot]

OlivierNicole commented 2 months ago

Never mind: apparently I had started a review at some point in the past and never completed it, so GitHub treats all your subsequent comments as pending.

OlivierNicole commented 2 months ago

Just to make sure we are on the same page, my intention is to wait for #1640 to be merged, rebase on it, and then address the rest of your new review (thanks!).

hhugo commented 2 months ago

> Just to make sure we are on the same page, my intention is to wait for #1640 to be merged, rebase on it, and then address the rest of your new review (thanks!).

What do you think about the following, on top of #1640?

OlivierNicole commented 2 months ago

> drop the indexed sourcemap for now (it seems it's not used)

It is used, to represent the sourcemap of the linker output (in Link_js).

hhugo commented 1 month ago

#1640 has been merged; I'll let you rebase this one.

OlivierNicole commented 1 month ago

Done

hhugo commented 1 month ago

I'm not very happy with the callback-based approach used in Link_js. Without having thought much about it, I would assume that one could reconstruct the edit instructions inside let copy ic oc = ... only.

Maybe we could move the handling/parsing of Sourcemap index into a separate PR to remove some noise in here.

hhugo commented 1 month ago

Regarding the representation of edits: we currently have per-line instructions. Have you considered a different approach (closer to what's done on master with reloc), using "copy n lines starting at pos n into position m"?

OlivierNicole commented 1 month ago

I considered it, but at the time I dismissed it as premature optimisation. Now that the main source of slowdown has been eliminated, it might be worth trying.
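To make the two representations discussed above concrete, here is a minimal sketch that converts per-line instructions into "copy count lines from src to dst" ranges. Both types are hypothetical and are not taken from the js_of_ocaml sources or from the reloc code on master. In this sketch, output lines not covered by any copy are taken to be newly added lines.

```ocaml
(* Per-line representation: one instruction per line. *)
type line_edit = Keep | Drop | Add

(* Range representation: copy [count] input lines starting at input
   line [src] to output line [dst] (all line numbers are 0-based). *)
type copy = { src : int; dst : int; count : int }

let to_copies (edits : line_edit list) : copy list =
  (* Push the copy currently being extended, if any, onto [acc]. *)
  let flush run acc = match run with None -> acc | Some c -> c :: acc in
  (* [src] and [dst] are the current input and output line numbers. *)
  let rec go src dst run acc = function
    | [] -> List.rev (flush run acc)
    | Keep :: rest ->
      let run =
        match run with
        | Some c -> Some { c with count = c.count + 1 }
        | None -> Some { src; dst; count = 1 }
      in
      go (src + 1) (dst + 1) run acc rest
    | Drop :: rest -> go (src + 1) dst None (flush run acc) rest
    | Add :: rest -> go src (dst + 1) None (flush run acc) rest
  in
  go 0 0 None [] edits
```

For instance, [Keep; Keep; Drop; Add; Keep] becomes two copies: two lines from input line 0 to output line 0, then one line from input line 3 to output line 3. Long runs of unchanged lines collapse into a single instruction, which is the potential benefit of the range form.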