rescript-lang / rescript-compiler

The compiler for ReScript.
https://rescript-lang.org
Other
6.68k stars 445 forks source link

Rethink about the runtime encoding of ocaml values in javascript #24

Closed bobzhang closed 8 years ago

bobzhang commented 8 years ago

Goal:

  1. Less work to patch the compiler
  2. OCaml Array is JS Array

Some documentation about the current encoding is [https://github.com/bloomberg/ocamlscript/blob/master/docs%2Fffi.md] here, there is a problem with this encoding is that Pfield i is no longer the same as Parrayref, while the internal OCaml compiler think it is the same, for example stdlib/camlinternalOO.ml, bytecomp/translobj.ml, bytecomp/translclass.ml there might be some other files I am missing. (Recent changes in the trunk of stdlib/camlinternalOO requires us to sync up the change)

So I am thinking of that Obj.tag is not used in too much in the compiler itself(except the GC, which is not relevant in js backend)

So I am proposing that blocks with tag zero is encoded as array, blocks with tag non zero (mostly normal variants) will be encoded as array plus an extra property via Object.defineProperty so that they are polymorphic. and Pfield, Parrayref will behave the same.

for example

type u = 
| B of int * int
|A of int * int * int 

A (1,2,3) will be encoded as Object.defineProperty([1,2,3], {'t', {'value', 1}} B(1,2) will be encoded as [1,2]

bobzhang commented 8 years ago

Note that Object.defineProperty is slow, and variants in general will not be very big, we can have objects like this for A(1,2,3)

{ t : 1, 0 : 1,1:2,2:3}

This encoding actually be efficient in most cases like option type, list and map, set, since they only have two branches for the internal representation

m2ym commented 8 years ago

As far as I know, ocamlscript currently encodes OCaml string (pure byte array) as JS string (UTF16 string?). I think it may go wrong in particular situations. Sorry if unrelated to this issue.

bobzhang commented 8 years ago

@m2ym indeed, there is a corner case in range pattern

function '\000' .. '\255' -> 1 

in ocaml will be compiled into

function _ -> 1

do you have anything else in mind? thanks! (ocamlscript will require -safe-string)

bobzhang commented 8 years ago

we can walk around this issue by x.charCodeAt(i) & 0xff, but I am not sure whether it is worth the effort (some of them can be optimized away)

bobzhang commented 8 years ago

think twice, it might not be a problem, since string is immutable, and is lexed by ocamllex, there is no way you can encode a string out of range, it only happens in FFI, @alainfrisch ?

alainfrisch commented 8 years ago

Yes, you could get strings from JS through the FFI.

bobzhang commented 8 years ago

another benefit of such encoding is that for the FFI, tuple is quite easy to map, even if we goto typedtree we can still guarantee the mapping of tuple, since the field of tag is singled out, it is easy to tell if it is an ocaml object and js object

copy commented 8 years ago

Float arrays and records could be represented using Float64Array. Maybe integer arrays too (using Int32Array).

bobzhang commented 8 years ago

@copy I have similar ideas, but Float64Array is not widely support across js engines, how does js_of_ocaml do the encoding, do you have any idea why it does not use Float64Array if not?

copy commented 8 years ago

I don't know why they don't use it, maybe they haven't considered it yet.

I agree with engine support. It could be enabled conditionally with a flag or shimmed though.

bobzhang commented 8 years ago

@hhugo do you have any suggestion?

bobzhang commented 8 years ago

For the record, encoding variants as js objects will help debugging a lot, for example

function $Cons(x,y){ this._0 = x; this._1 = y; this.t = "Cons" }
$Cons.prototype.toString = function(){return `${this.t} ( ${this._0}, ${this._1})`}

The benefit is that we can make use of prototype chain in js and get a pretty output when debugging, the challenge is that we need know where constructors are called.

Every call site x::y has to be replaced by new $Cons(x,y)

for the record we can do something similar

type t = { x  : int ; y : int}
function $t(x,y} { this._0 = x; this._1 = y; this.lables = ['x','y']}
$t.prototype.toString= function(){...}

we can still preserve the structure comparison semantics of ocaml, the same challenge as we have for encoding variants, if we go this way, it is less work then compiling from typedtree, but get a reasonable output

bobzhang commented 8 years ago

note that the reason that lazy is broken is also due to ocaml compiler internally treat object and array in a uniform way

bobzhang commented 8 years ago

so currently, record, tuple, array, polymorphic variants are all compiled into array(offset 0) normal variants are compiled into array like object

{ 0 : field_0, 1 : field_1, ... n - 1 : field_n_1, length : n , tag : obj.tag}

type option and list are handled specially,

Some x is compiled into [x] x::y is compiled into [x, y]

bmeurer commented 8 years ago

As discussed in email previously, I'd strongly suggest to use object literals with 0 based indexing and the tag, as you described.

{ 0 : field_0, 1 : field_1, ... n - 1 : field_n_1, length : n , tag : obj.tag}

This might be slower in some cases than the array literal version

let o = [field_0, ..., field_n_1]; o.tag = tag;

but I'm sure we (and other VMs) can fix the performance differences when you run into tricky performance problems. I think the object literal version has the most potential for usability (i.e. FFI), readability and consistent performance.

bobzhang commented 8 years ago

@bmeurer that's very kind of you, we will move forward with this encoding for variants

bobzhang commented 8 years ago

For the record, we can use Float64Array when applicable (since operator [] is also overloaded)

bmeurer commented 8 years ago

Actually we talked about this again and came to the conclusion that using an Array literal plus defineProperty for the "tag" might be better currently from a performance perspective.

bobzhang commented 8 years ago

@bmeurer thanks for the info, it's easy for us to switch, you mean like this below?

Object.defineProperty([1,2,3], 'tag', { 'value' : tag_value})

What are the best property settings?

Another question, shall I write a wrap function

let f = (arr, tag)=> Object.defineProperty(arr,'tag',{'value',tag}}
f([1,2,3],tag_value)

or always do the inlining? I guess I should do the inlining for the peephole, right?

bmeurer commented 8 years ago

Yes, I'd also go for

Object.defineProperty(arr, 'tag', {'value':tag_value})

i.e. non-writable, non-configurable 'tag' property. Not sure if the helper function makes sense. It might save you some space, and shouldn't really hurt performance, as it should not get polymorphic as long as tag is always a small integer.

bobzhang commented 8 years ago

@bmeurer thanks, we will go this way, it will not save too much space after being gzipped. for the record, int array is also available in lambda layer, so we can compile ocaml int array to Int32Array when applicable

jordwalke commented 8 years ago

What about creating "prototypical class" constructors for each tag type? When creating objects, you just create "new" instances of those. Every object already inherently has a constructor with its prototype, so this doesn't seem like it would add any more memory allocations. Checking the tag could be as simple as checking reference identity on that constructor/prototype of the object/record etc.

bobzhang commented 8 years ago

@jordwalke can you be more elaborate (with an example), the thing is we don't have type definition. but we can add a name property for debugging experience when available like below:

{ tag : 0, 0 : field_0 , 1 : field_1, name : constructor_name_when_available_just_for_debugging}
bobzhang commented 8 years ago

For the record, actually labels are available when constructing a record. so when compiling record(in debug mode)

we can generate

type t = { x : float; y : float}
let make_t x y = { x ; y }

type  u = A of int 
let make_a x = A x
function make_t (x,y) { 
  return Object.defineProperty([x,y], 'labels', { 'value' : ["x";"y"]}
}
function make_x (x){
  return Obj.defineProperties([x],  { 'name' : { 'value' : 'A'}, 'tag' : 0}}
}

with this, I think we can debug ocaml generated js in js debugger without too much pain(with or without sourcemap)

jordwalke commented 8 years ago

Old reply: Regarding the FFI API, I really don't care too much because I would imagine a toolchain that compiles interop based on Flow type definitions. (Flow is likely a better match than type script because it has non-nullability by default and even polymorphic variants! TypeScript is what you get when C# developers build static typing. Flow is what you get when OCaml developers build static typing - but I digress)

Why is it that you have access to the field labels at debug time, but not for other compilation modes? And yes, just having those debuggable values fulfills the purpose - the actual representation doesn't matter as long as it performs very well.

jordwalke commented 8 years ago

If you benchmark, don't forget to test on JavaScriptCore (JIT and non-JIT). It is one of the best, lightest weight, and most versatile JS engines for mobile where perf matters most. I'm always happy to help people test on JSC (it's already installed on every Mac by default!)

bobzhang commented 8 years ago

@jordwalke in the debug time, field access is still using array. for example type t = {x : string};; (f :t ). x is still f[0], the difference is that in debug mode, we can attach f the labels, so by looking at the debugger, you know f[0] is f["x"] The main motivation that we designed it this way is not for performance, it is that the lable information is not available when destructing the record

jordwalke commented 8 years ago

I understand. I'm asking why that information could not also be used to construct named object keys for the actual representation. (I'm not advocating for this - I would rather you use the fastest runtime performance strategy (considering JSC, V8)). I'm just curious.

bobzhang commented 8 years ago

can you give an example?

jordwalke commented 8 years ago

I don't know what an example would look like because I don't know exactly how you get that field name information. I'm asking if you can construct records as {x: "hello"} in JavaScript using that same debug information (as well as field access being myRecord.x in JavaScript).

bobzhang commented 8 years ago

@jordwalke I took at a look dom.js in flow, it is not type safe, did I miss anything? for example

getElementById (elementId : string) : HTMLElement;

The return type should be nullable?

jordwalke commented 8 years ago

I'm not sure. Perhaps that could be a mistake

bobzhang commented 8 years ago

see the wiki docs

bobzhang commented 8 years ago

@bmeurer hi, I found a serious performance degrade when using {0:a, 1:b, length:2, tag:0} vs [a,b]

jscomp$ time node xx.g.js
start
finish

real    0m14.702s
user    0m14.639s
sys     0m0.093s
jscomp$ time node xx.g.js
start
finish

real    0m1.065s
user    0m1.013s
sys     0m0.053s
jscomp$ node -v
v4.2.1
bobzhang commented 8 years ago

We probably can fix this case -- however, it seems like array like object is much slower than array literal, I did a benchmark -- on node 4.2.1, defineProperty is even slower, any hints?

bmeurer commented 8 years ago

Some operations are clearly faster on arrays than on regular objects. Can you send me the test program?

Hongbo Zhang notifications@github.com schrieb am Sa., 16. Apr. 2016, 16:36:

We probably can fix this case -- however, it seems like array like object is much slower than array literal, I did a benchmark -- on node 4.2.1, defineProperty is even slower, any hints?

— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/bloomberg/bucklescript/issues/24#issuecomment-210829402

bobzhang commented 8 years ago

cool, I will clean it up and send you a small example

bobzhang commented 8 years ago

hi @bmeurer , so I created two files fast one (using array, around 1s) https://gist.github.com/bobzhang/644ec90fe2d5e77abdc4717313f6eda5 slow one (array like object, around 14s) https://gist.github.com/bobzhang/aed61e9db7c880df83a6b4f098a5fc93

I tried another version Object.defineProperty( ..., 'tag', {'value' : 0 } ), it is even twice slower than the array like one.

The original ocaml code is :

module IntMap = Map.Make(struct
  type t = int
  let compare (x : int) y = x - y
  end)

let test () =
  let m = ref IntMap.empty in
  let count = 1000_000 in
  for i = 0 to count do
    m := IntMap.add i i !m
  done;
  for i = 0 to count  do
    ignore (IntMap.find i !m)
  done

let () =
  (* Js.log "start" ; *)
  test();
  (* Js.log "finish"   *)

Note for this special case, we can optimize it away now (since it has only one non-nullary constructor), but it may apply to general variants though.

jordwalke commented 8 years ago

I would also be curious to see if using arrays to store records results in a speed up! (Also, what about all of these tests on JSC? JSC is very important for mobile)

bobzhang commented 8 years ago

I don't know where to get started with JSC, is there a link?

jordwalke commented 8 years ago

JSC is a binary on your Mac. It's the JS engine that powers a huge portion of mobile web traffic as well as react native applications. I provided the command to run JSC in another issue. I can dig it up.

jordwalke commented 8 years ago

Here's an example of how to run an arbitrary JS bundle via the command line on Mac OS:

 /System/Library/Frameworks/JavaScriptCore.framework/Versions/Current/Resources/jsc -e "console = {log:print};" -f yourApp.js

jsc includes several parameters that you can play with to see how fast your JS compilation strategy will perform on various environments. Not all environments allow the JSC JIT, so it's important to test with the JIT off to get a broad sense of the strengths of your compilation strategy.

Try:

--useJIT=false/true
--useFTLJIT=false/true

Any environment that has JIT today, has the FTL JIT, but it's fun to test the impact of FTL (which is a very low level JIT from what I understand). I think your benchmarks will be a good way to compare jsc vs. v8. v8 often gets credit for being the fastest JS engine, but our experiments show that more often than not, JSC is the one that is faster.

bmeurer commented 8 years ago

@bobzhang I'm a bit puzzled, why the object literal version is so much slower, given that there is no specialized array code in it, i.e. a use of any Array builtin or Reflect.apply/Function.prototype.apply, which are known to be way slower on Array-like objects than on Arrays. I'll try to investigate next week and see if we can get this fixed.

Given these numbers, you might want to consider using arrays for now, because even if we can optimize this now, it'll only hit Chrome stable next quarter and the majority of mobile phones are still using an old version of V8 (i.e. via the Samsung sbrowser or even worse the old Android browser).

hhugo commented 8 years ago

https://github.com/ocsigen/js_of_ocaml/issues/117#issuecomment-56951420

jordwalke commented 8 years ago

I tested that example in JSC with various JIT settings (note, my laptop is basically just a glorified iPad).

I also created a new version that uses objects but with keys of the form {stringOne: , stringTwo}. This is to model what would happen if you just output plain JS objects.

Note: V8 only has a JIT. JSC's perf is not great when it has its JIT disabled - but timing for v8 with JIT disabled is essentially infinite because there is no way to run V8 without a JIT (currently, though I eagerly await the V8 that allows non-JIT mode!).

Note: There are environments where JIT is not available as a matter of policy even if your JS engine features a JIT. Namely, iOS (if you embed JSC or any other JS engine in an iOS app, you may not use the JIT - Safari is given an exception to the rule).

Engine Array Keys Numeric Object Keys String Object Keys
JSC --useFTLJIT=true 1,448ms 5,279ms 1,300ms
JSC --useFTLJIT=false 1,790ms 5,320ms 1,630ms
JSC --useJIT=false 10,500ms 15,000ms 5,900ms
V8 1,390ms 14,000ms 1,070ms
OCaml Compiler Time
ocamlc 4.02.3 2,700ms
ocamlopt 4.02.3 500ms

I'll just say that I've had a very hard time maintaining predictability with V8's JIT when using "creative" keys. We had a 10x regression in React when our JS Objects included non-standard characters in the object keys (because the JIT would thrash back and forth deopt/opt/deopt/opt). We found the issue by logging JIT operations and seeing the same code paths become optimized then deoptimized (again and again).

It seems like plain JS keys work the best in general, however, I would include one warning. This small test case won't reveal some of the issues that come up with regular JS keys. Many JS engines attempt to form a "type hierarchy" of JS objects. You want to avoid creating what appears to be a type hierarchy, but is really not - which could cause deopts. So simply creating JS objects of the shape {"stringOne":, "stringTwo": } is not advised. Ideally, you would create JS objects that include the unique type of the original OCaml record to make its keys completely unique and never confused for another type ({"hashTbl_x":, "hashTbl_y"}). If that is not possible, it seems jsoo's approach of using Arrays might be best if you can avoid the extra boxing/allocation. I think it would be worth looking into typed arrays as well.

Update: I tried logging deopts in V8 and did not see anything interesting.

bobzhang commented 8 years ago

hi @bmeurer , thanks for having a look. I would like to learn more about your insight before making any changes. Currently all collections in ocaml stdlib (array, map, set, list, option, hashtbl) are optimized, so this would not hurt our performance in most cases. But I am interested in which parts make it so slow, is the allocation, or later access key, is it possible to add a specialized optimization on such representation(should be sound but not complete)? (I also tried node V5.10.1, it seems slightly worse in slow.js, and slightly better in fast.js) @jordwalke thanks for the benchmark, it is interesting that JSC seems to be more stable on this example

copy commented 8 years ago

In v8, objects with keys that aren't valid identifiers are stored as hashtables, see https://github.com/petkaantonov/bluebird/wiki/Optimization-killers#52-the-object-being-iterated-is-not-a-simple-enumerable (objects with keys that are valid identifiers are stored using hidden classes).

bobzhang commented 8 years ago

@bmeurer @jordwalke I also tried two versions:

// slow3.js
function variant5(a0,a,1,a2,a3,a4){
   var x = [a0,a1,a2,a3,a4];
   x.tag = 0;
  return x
}
// slow4.js
function variant5(a0,a1,a2,a3,a4){
  var x = [a0;a1;a2;a3;a4];
  return Object.defineProperty(x, 'tag', {'value' : 0 });
}
ocamlscript$ time node ~/git/test/slow3.js

real    0m1.433s
user    0m1.350s
sys     0m0.064s
ocamlscript$ time node ~/git/test/slow4.js

real    0m24.669s
user    0m24.622s
sys     0m0.084s
ocamlscript$ time /System/Library/Frameworks/JavaScriptCore.framework/Versions/Current/Resources/jsc -e "console = {log:print};" -f ~/git/test/slow3.js

real    0m2.747s
user    0m3.025s
sys     0m0.356s
ocamlscript$ time /System/Library/Frameworks/JavaScriptCore.framework/Versions/Current/Resources/jsc -e "console = {log:print};" -f ~/git/test/slow4.js

real    0m5.062s
user    0m5.324s
sys     0m0.362s

So slow3 is pretty good and two~three times faster than immutablejs and mori (even without our current optimizations), so the question is what is the best way to patch an array (potentially typed array) without too much performance penalty ? Current encoding (compared with js_of_ocaml) is better for FFI, delivers best performance for common collections in stdlib, I would really like to keep it, but hope performance penalty would not too bad for complex variants (more than 2 non-nullary data constructors) -- note slow3 is quite acceptable, I am just curious how it is different from slow4.

Btw, I think javascript VM engineers are doing an amazing job, it is even much faster than OCaml bytecode

bobzhang commented 8 years ago

@jordwalke here we are talking about variants not records though. it's possible to use unique-labels for records, but it is hard to make it be happy with Obj module which 1% of those code will break everything

jordwalke commented 8 years ago

Oh do not test using UNIX time command. You must create timers using Date.now in the test case. Otherwise you are testing the fixed overhead if VM startup time.