Closed bobzhang closed 8 years ago
Note that Object.defineProperty is slow, and variants in general will not be very big, so we can have objects like this for A(1,2,3):
{ t : 1, 0 : 1, 1 : 2, 2 : 3 }
This encoding can actually be efficient in most cases, such as the option, list, map, and set types, since they only have two branches in their internal representation.
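A minimal sketch of this object encoding for A(1,2,3) (the tag value follows the snippet above; the helper names are made up for illustration):

```javascript
// Hypothetical encoding of A(1, 2, 3): a "t" field holds the tag, and the
// constructor arguments live under numeric keys 0..n-1.
function mkA(x, y, z) {
  return { t: 1, 0: x, 1: y, 2: z };
}

// Pattern matching would compile to a dispatch on the tag, then indexed reads.
function describe(v) {
  switch (v.t) {
    case 1:
      return 'A(' + v[0] + ', ' + v[1] + ', ' + v[2] + ')';
    default:
      return 'unknown';
  }
}
```

With this shape, `describe(mkA(1, 2, 3))` returns `'A(1, 2, 3)'`: dispatch is a single property read, and fields are accessed by index just as with the array encoding.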
As far as I know, ocamlscript currently encodes OCaml string (pure byte array) as JS string (UTF16 string?). I think it may go wrong in particular situations. Sorry if unrelated to this issue.
@m2ym indeed, there is a corner case in range patterns:
function '\000' .. '\255' -> 1
in OCaml will be compiled into
function _ -> 1
Do you have anything else in mind? Thanks! (ocamlscript will require -safe-string.)
We can work around this issue with x.charCodeAt(i) & 0xff, but I am not sure whether it is worth the effort (some of the masks can be optimized away).
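A tiny sketch of that masking workaround (the helper name is made up):

```javascript
// Hypothetical helper: read "byte" i of an OCaml string stored as a JS
// (UTF-16) string, masking so out-of-range code units wrap into 0..255.
function getByte(s, i) {
  return s.charCodeAt(i) & 0xff;
}
```

For plain byte data nothing changes (`getByte('A', 0)` is 65); for a code unit the FFI smuggled in, e.g. '\u20AC', the mask truncates it to 0xAC.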
On second thought, it might not be a problem: since strings are immutable and are lexed by ocamllex, there is no way to construct an out-of-range string. It can only happen via the FFI. @alainfrisch?
Yes, you could get strings from JS through the FFI.
Another benefit of such an encoding is for the FFI: tuples are quite easy to map, and even if we go to the typedtree we can still guarantee the mapping of tuples. Since the tag field is singled out, it is easy to tell whether something is an OCaml object or a JS object.
Float arrays and records could be represented using Float64Array. Maybe integer arrays too (using Int32Array).
@copy I have similar ideas, but Float64Array is not widely supported across JS engines. How does js_of_ocaml do the encoding? If it does not use Float64Array, do you have any idea why?
I don't know why they don't use it, maybe they haven't considered it yet.
I agree with engine support. It could be enabled conditionally with a flag or shimmed though.
@hhugo do you have any suggestion?
For the record, encoding variants as JS objects would help debugging a lot, for example:
function $Cons(x,y){ this._0 = x; this._1 = y; this.t = "Cons" }
$Cons.prototype.toString = function(){ return `${this.t} ( ${this._0}, ${this._1})` }
The benefit is that we can make use of the prototype chain in JS and get pretty output when debugging; the challenge is that we need to know where constructors are called.
Every call site x::y has to be replaced by new $Cons(x,y).
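A self-contained version of the $Cons sketch, to show the output the prototype chain buys us (using 0 to stand for the empty list is an assumption here):

```javascript
function $Cons(x, y) { this._0 = x; this._1 = y; this.t = 'Cons'; }
$Cons.prototype.toString = function () {
  return this.t + ' ( ' + this._0 + ', ' + this._1 + ')';
};

// The call site 1 :: 2 :: [] becomes nested `new $Cons` calls.
var list = new $Cons(1, new $Cons(2, 0));
```

Here `String(list)` yields 'Cons ( 1, Cons ( 2, 0))', and a debugger shows a $Cons instance rather than a bare array.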
For records we can do something similar:
type t = { x : int ; y : int }
function $t(x,y) { this._0 = x; this._1 = y; this.labels = ['x','y'] }
$t.prototype.toString = function(){...}
We can still preserve OCaml's structural comparison semantics; the challenge is the same as for encoding variants. If we go this way, it is less work than compiling from the typedtree, but we get reasonable output.
Note that the reason lazy is broken is also that the OCaml compiler internally treats objects and arrays in a uniform way.
So currently, records, tuples, arrays, and polymorphic variants are all compiled into arrays (fields starting at offset 0), while normal variants are compiled into an array-like object:
{ 0 : field_0, 1 : field_1, ... n - 1 : field_n_1, length : n, tag : obj.tag }
The option and list types are handled specially: Some x is compiled into [x], and x::y is compiled into [x, y].
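The encodings above can be sketched side by side in plain JS (values are illustrative; using 0 for the empty list is an assumption):

```javascript
// Records, tuples, arrays, polymorphic variants: plain arrays from offset 0.
var pair = [1, 2];                                // the tuple (1, 2)

// Normal variants: array-like object carrying length and tag.
var variant = { 0: 1, 1: 2, length: 2, tag: 1 };  // second constructor, two args

// option and list are special-cased as bare arrays:
var someX = [42];                                 // Some 42
var listX = [1, [2, 0]];                          // 1 :: 2 :: []
```

Note the asymmetry this thread is about: `pair` is a real JS Array, while `variant` only looks like one.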
As discussed in email previously, I'd strongly suggest to use object literals with 0 based indexing and the tag, as you described.
{ 0 : field_0, 1 : field_1, ... n - 1 : field_n_1, length : n , tag : obj.tag}
This might be slower in some cases than the array literal version
let o = [field_0, ..., field_n_1]; o.tag = tag;
but I'm sure we (and other VMs) can fix the performance differences when you run into tricky performance problems. I think the object literal version has the most potential for usability (i.e. FFI), readability and consistent performance.
@bmeurer that's very kind of you; we will move forward with this encoding for variants. For the record, we can use Float64Array when applicable (since operator [] is also overloaded).
Actually we talked about this again and came to the conclusion that using an Array literal plus defineProperty for the "tag" might be better currently from a performance perspective.
@bmeurer thanks for the info, it's easy for us to switch. You mean like this?
Object.defineProperty([1,2,3], 'tag', { 'value' : tag_value })
What are the best property settings? Another question: shall I write a wrapper function
let f = (arr, tag) => Object.defineProperty(arr, 'tag', {'value' : tag})
f([1,2,3], tag_value)
or always inline it? I guess I should inline for the peephole optimizer, right?
Yes, I'd also go for
Object.defineProperty(arr, 'tag', {'value':tag_value})
i.e. non-writable, non-configurable 'tag' property. Not sure if the helper function makes sense. It might save you some space, and shouldn't really hurt performance, as it should not get polymorphic as long as tag is always a small integer.
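A small sketch of what those defaults buy: with Object.defineProperty, 'tag' is non-writable, non-enumerable, and non-configurable unless you opt in, so the tag is protected against accidental writes and invisible to JSON and for-in.

```javascript
var block = Object.defineProperty([1, 2, 3], 'tag', { value: 1 });

// Writes are ignored in sloppy mode and throw in strict mode; either way
// the original tag value survives.
try { block.tag = 99; } catch (e) { /* strict mode throws */ }
```

After this, block.tag is still 1, and JSON.stringify(block) is just "[1,2,3]".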
@bmeurer thanks, we will go this way; it will not save too much space after gzipping anyway. For the record, the int array type is also visible at the lambda layer, so we can compile OCaml int arrays to Int32Array when applicable.
What about creating "prototypical class" constructors for each tag type? When creating objects, you just create "new" instances of those. Every object already inherently has a constructor with its prototype, so this doesn't seem like it would add any more memory allocations. Checking the tag could be as simple as checking reference identity on that constructor/prototype of the object/record etc.
@jordwalke can you elaborate (with an example)? The thing is, we don't have the type definition. But we can add a name property for a better debugging experience when available, like below:
{ tag : 0, 0 : field_0, 1 : field_1, name : constructor_name_when_available_just_for_debugging }
For the record, labels are actually available when constructing a record, so when compiling records (in debug mode), for
type t = { x : float; y : float }
let make_t x y = { x ; y }
type u = A of int
let make_a x = A x
we can generate
function make_t (x, y) {
  return Object.defineProperty([x, y], 'labels', { 'value' : ['x', 'y'] })
}
function make_a (x) {
  return Object.defineProperties([x], { 'name' : { 'value' : 'A' }, 'tag' : { 'value' : 0 } })
}
With this, I think we can debug OCaml-generated JS in a JS debugger without too much pain (with or without sourcemaps).
Old reply: Regarding the FFI API, I really don't care too much because I would imagine a toolchain that compiles interop based on Flow type definitions. (Flow is likely a better match than type script because it has non-nullability by default and even polymorphic variants! TypeScript is what you get when C# developers build static typing. Flow is what you get when OCaml developers build static typing - but I digress)
Why is it that you have access to the field labels at debug time, but not for other compilation modes? And yes, just having those debuggable values fulfills the purpose - the actual representation doesn't matter as long as it performs very well.
If you benchmark, don't forget to test on JavaScriptCore (JIT and non-JIT). It is one of the best, lightest weight, and most versatile JS engines for mobile where perf matters most. I'm always happy to help people test on JSC (it's already installed on every Mac by default!)
@jordwalke at debug time, field access still uses the array. For example, with type t = { x : string }, (f : t).x is still f[0]; the difference is that in debug mode we can attach the labels to f, so by looking at the debugger you know f[0] is f["x"]. The main motivation for designing it this way is not performance; it is that the label information is not available when destructuring the record.
I understand. I'm asking why that information could not also be used to construct named object keys for the actual representation. (I'm not advocating for this - I would rather you use the fastest runtime performance strategy (considering JSC, V8)). I'm just curious.
can you give an example?
I don't know what an example would look like because I don't know exactly how you get that field name information. I'm asking if you can construct records as {x: "hello"} in JavaScript using that same debug information (with field access being myRecord.x in JavaScript).
@jordwalke I took a look at dom.js in Flow; it is not type safe, did I miss anything? For example:
getElementById(elementId : string) : HTMLElement;
Shouldn't the return type be nullable?
I'm not sure. Perhaps that could be a mistake
see the wiki docs
@bmeurer hi, I found a serious performance degradation when using {0:a, 1:b, length:2, tag:0} vs [a,b]:
jscomp$ time node xx.g.js   # array-like object version
start
finish
real 0m14.702s
user 0m14.639s
sys 0m0.093s
jscomp$ time node xx.g.js   # array version
start
finish
real 0m1.065s
user 0m1.013s
sys 0m0.053s
jscomp$ node -v
v4.2.1
We probably can fix this case. However, it seems an array-like object is much slower than an array literal; I did a benchmark on node 4.2.1, where defineProperty is even slower. Any hints?
Some operations are clearly faster on arrays than on regular objects. Can you send me the test program?
cool, I will clean it up and send you a small example
hi @bmeurer, so I created two files: a fast one (using arrays, around 1s) https://gist.github.com/bobzhang/644ec90fe2d5e77abdc4717313f6eda5 and a slow one (array-like objects, around 14s) https://gist.github.com/bobzhang/aed61e9db7c880df83a6b4f098a5fc93
I tried another version, Object.defineProperty(..., 'tag', {'value' : 0}); it is even twice as slow as the array-like one.
The original ocaml code is :
module IntMap = Map.Make(struct
type t = int
let compare (x : int) y = x - y
end)
let test () =
let m = ref IntMap.empty in
let count = 1000_000 in
for i = 0 to count do
m := IntMap.add i i !m
done;
for i = 0 to count do
ignore (IntMap.find i !m)
done
let () =
(* Js.log "start" ; *)
test();
(* Js.log "finish" *)
Note that for this special case we can optimize it away now (since the type has only one non-nullary constructor), but the problem still applies to general variants.
I would also be curious to see if using arrays to store records results in a speed up! (Also, what about all of these tests on JSC? JSC is very important for mobile)
I don't know where to get started with JSC, is there a link?
JSC is a binary on your Mac. It's the JS engine that powers a huge portion of mobile web traffic as well as react native applications. I provided the command to run JSC in another issue. I can dig it up.
Here's an example of how to run an arbitrary JS bundle via the command line on Mac OS:
/System/Library/Frameworks/JavaScriptCore.framework/Versions/Current/Resources/jsc -e "console = {log:print};" -f yourApp.js
jsc includes several parameters that you can play with to see how fast your JS compilation strategy will perform in various environments. Not all environments allow the JSC JIT, so it's important to test with the JIT off to get a broad sense of the strengths of your compilation strategy.
Try:
--useJIT=false/true
--useFTLJIT=false/true
Any environment that has the JIT today has the FTL JIT, but it's fun to test the impact of FTL (which is a very low-level JIT, from what I understand).
I think your benchmarks will be a good way to compare jsc vs. v8. v8 often gets credit for being the fastest JS engine, but our experiments show that, more often than not, JSC is the one that is faster.
@bobzhang I'm a bit puzzled, why the object literal version is so much slower, given that there is no specialized array code in it, i.e. a use of any Array builtin or Reflect.apply/Function.prototype.apply, which are known to be way slower on Array-like objects than on Arrays. I'll try to investigate next week and see if we can get this fixed.
Given these numbers, you might want to consider using arrays for now, because even if we can optimize this now, it'll only hit Chrome stable next quarter and the majority of mobile phones are still using an old version of V8 (i.e. via the Samsung sbrowser or even worse the old Android browser).
I tested that example in JSC with various JIT settings (note, my laptop is basically just a glorified iPad).
I also created a new version that uses objects but with keys of the form {stringOne: ..., stringTwo: ...}. This is to model what would happen if you just output plain JS objects.
Note: V8 only has a JIT. JSC's perf is not great when it has its JIT disabled - but timing for v8 with JIT disabled is essentially infinite because there is no way to run V8 without a JIT (currently, though I eagerly await the V8 that allows non-JIT mode!).
Note: There are environments where JIT is not available as a matter of policy even if your JS engine features a JIT. Namely, iOS (if you embed JSC or any other JS engine in an iOS app, you may not use the JIT - Safari is given an exception to the rule).
Engine | Array Keys | Numeric Object Keys | String Object Keys |
---|---|---|---|
JSC --useFTLJIT=true | 1,448ms | 5,279ms | 1,300ms |
JSC --useFTLJIT=false | 1,790ms | 5,320ms | 1,630ms |
JSC --useJIT=false | 10,500ms | 15,000ms | 5,900ms |
V8 | 1,390ms | 14,000ms | 1,070ms |
OCaml Compiler | Time |
---|---|
ocamlc 4.02.3 | 2,700ms |
ocamlopt 4.02.3 | 500ms |
I'll just say that I've had a very hard time maintaining predictability with V8's JIT when using "creative" keys. We had a 10x regression in React when our JS Objects included non-standard characters in the object keys (because the JIT would thrash back and forth deopt/opt/deopt/opt). We found the issue by logging JIT operations and seeing the same code paths become optimized then deoptimized (again and again).
It seems like plain JS keys work the best in general; however, I would include one warning. This small test case won't reveal some of the issues that come up with regular JS keys. Many JS engines attempt to form a "type hierarchy" of JS objects. You want to avoid creating what appears to be a type hierarchy but really is not, which could cause deopts. So simply creating JS objects of the shape {stringOne: ..., stringTwo: ...} is not advised. Ideally, you would create JS objects that include the unique type of the original OCaml record, making the keys completely unique and never confused with another type ({hashTbl_x: ..., hashTbl_y: ...}). If that is not possible, it seems jsoo's approach of using arrays might be best, if you can avoid the extra boxing/allocation. I think it would be worth looking into typed arrays as well.
Update: I tried logging deopts in V8 and did not see anything interesting.
hi @bmeurer, thanks for having a look. I would like to learn more about your insight before making any changes. Currently all collections in the OCaml stdlib (array, map, set, list, option, hashtbl) are optimized, so this would not hurt our performance in most cases. But I am interested in which part makes it so slow: is it the allocation, or the later key access? Is it possible to add a specialized optimization for such a representation (sound but not necessarily complete)? (I also tried node v5.10.1; it seems slightly worse on slow.js and slightly better on fast.js.) @jordwalke thanks for the benchmark; it is interesting that JSC seems to be more stable on this example.
In v8, objects with keys that aren't valid identifiers are stored as hashtables, see https://github.com/petkaantonov/bluebird/wiki/Optimization-killers#52-the-object-being-iterated-is-not-a-simple-enumerable (objects with keys that are valid identifiers are stored using hidden classes).
@bmeurer @jordwalke I also tried two versions:
// slow3.js
function variant5(a0,a1,a2,a3,a4){
  var x = [a0,a1,a2,a3,a4];
  x.tag = 0;
  return x
}
// slow4.js
function variant5(a0,a1,a2,a3,a4){
  var x = [a0,a1,a2,a3,a4];
  return Object.defineProperty(x, 'tag', {'value' : 0 });
}
ocamlscript$ time node ~/git/test/slow3.js
real 0m1.433s
user 0m1.350s
sys 0m0.064s
ocamlscript$ time node ~/git/test/slow4.js
real 0m24.669s
user 0m24.622s
sys 0m0.084s
ocamlscript$ time /System/Library/Frameworks/JavaScriptCore.framework/Versions/Current/Resources/jsc -e "console = {log:print};" -f ~/git/test/slow3.js
real 0m2.747s
user 0m3.025s
sys 0m0.356s
ocamlscript$ time /System/Library/Frameworks/JavaScriptCore.framework/Versions/Current/Resources/jsc -e "console = {log:print};" -f ~/git/test/slow4.js
real 0m5.062s
user 0m5.324s
sys 0m0.362s
So slow3 is pretty good, two to three times faster than immutablejs and mori (even without our current optimizations). The question is: what is the best way to patch an array (potentially a typed array) without too much performance penalty?
The current encoding (compared with js_of_ocaml's) is better for the FFI and delivers the best performance for the common collections in the stdlib, so I would really like to keep it, but I hope the performance penalty would not be too bad for complex variants (more than 2 non-nullary data constructors). Note that slow3 is quite acceptable; I am just curious why it differs from slow4.
Btw, I think javascript VM engineers are doing an amazing job, it is even much faster than OCaml bytecode
@jordwalke here we are talking about variants, not records, though. It's possible to use unique labels for records, but it is hard to make that play well with the Obj module: the 1% of code that uses it will break everything.
Oh, do not test using the UNIX time command. You must create timers using Date.now in the test case. Otherwise you are also measuring the fixed overhead of VM startup time.
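A minimal harness along those lines (hypothetical helper, not from the thread's gists):

```javascript
// Time the workload inside the VM with Date.now, so VM startup cost and
// script parsing are excluded from the measurement.
function bench(label, f) {
  var start = Date.now();
  f();
  var elapsed = Date.now() - start;
  console.log(label + ': ' + elapsed + 'ms');
  return elapsed;
}
```

Usage would be e.g. `bench('insert', function () { /* build the map */ })`, one call per phase being measured.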
Goal:
Some documentation about the current encoding is here: https://github.com/bloomberg/ocamlscript/blob/master/docs%2Fffi.md. There is a problem with this encoding: Pfield i is no longer the same as Parrayref, while the internal OCaml compiler thinks they are the same, for example in stdlib/camlinternalOO.ml, bytecomp/translobj.ml, and bytecomp/translclass.ml (there might be some other files I am missing; recent changes in the trunk of stdlib/camlinternalOO require us to sync up the change).
Obj.tag is not used much in the compiler itself (except by the GC, which is not relevant for the js backend). So I am proposing that blocks with tag zero be encoded as arrays, while blocks with a non-zero tag (mostly normal variants) be encoded as an array plus an extra property via Object.defineProperty, so that they stay polymorphic and Pfield and Parrayref behave the same. For example:
A (1,2,3) will be encoded as Object.defineProperty([1,2,3], 't', {'value' : 1})
B (1,2) will be encoded as [1,2]
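The proposal can be sketched as a single constructor helper (the name is illustrative):

```javascript
// Tag-zero blocks stay plain arrays; non-zero tags get a hidden 't'
// property, so Pfield and Parrayref both compile to plain indexing.
function mkBlock(tag, fields) {
  return tag === 0
    ? fields
    : Object.defineProperty(fields, 't', { value: tag });
}

var a = mkBlock(1, [1, 2, 3]);  // A (1, 2, 3)
var b = mkBlock(0, [1, 2]);     // B (1, 2)
```

Both a[0] and b[0] are ordinary element reads; only `a` carries the hidden tag, and `b` is indistinguishable from a plain array.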