Closed · bobzhang closed this 8 years ago
Why is it difficult to make unique labels for variants and the Obj module? I'd be interested in helping come up with a solution if it allows better performance.
it's ok for variants; the thing is that some compiler internals (mostly in translclass/translobj) assume a block is an array. Unless upstream lends a hand, it will be a headache to maintain those patches
It would be fine to have them continue to be delivered as an array, as long as every type chose a distinct index range.
So, a point `{x=21; y=22}` would have an array with indices `[0=>21, 1=>22]`, but then a user `{age=21; name="bob"}` would have an array with indices `[9=>21, 10=>"bob"]`.
We don't need to know the actual original "types"; we just need to guarantee that the index ranges (for creation and access) are consistent for two data `p` and `q` if and only if `p` and `q` have the same type.
Of course, you could then normalize all the ranges back to start at zero in the typical compiler.
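The hole-offset idea can be sketched roughly as follows. The base offsets, constants, and function names here (`POINT_BASE`, `makePoint`, etc.) are invented for illustration and are not part of any actual implementation:

```javascript
// Sketch of the index-range idea: each record type gets a distinct base
// offset in the compiler's intermediate representation, so two values
// share index ranges if and only if they share a type.
// (Base offsets are invented for illustration.)
const POINT_BASE = 0; // {x; y}
const USER_BASE = 9;  // {age; name}

function makePoint(x, y) {
  const r = {};
  r[POINT_BASE] = x;     // field x at index 0
  r[POINT_BASE + 1] = y; // field y at index 1
  return r;
}

function makeUser(age, name) {
  const r = {};
  r[USER_BASE] = age;      // field age at index 9
  r[USER_BASE + 1] = name; // field name at index 10
  return r;
}

const p = makePoint(21, 22);
const u = makeUser(21, "bob");
// Access uses the same per-type base, so index ranges never collide
// across distinct types.
console.log(p[POINT_BASE], u[USER_BASE + 1]); // 21 "bob"
```

A backend that tracks only the base offset (not the full type) can still normalize every range back to start at zero before emitting code, as noted above.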
Arrays with holes will waste space and are in general slower than packed arrays, although likely still a bit faster than regular objects with array index like properties.
@bobzhang Object.defineProperty in a hot function will always be way slower than adding a property via a store to a previously defined literal. slow3.js usually translates to highly efficient code that just allocates and initializes the literal with slack space for the tag property, and then transitions the object to a new hidden class and stores the tag value. So that's just bump pointer allocation plus a bunch of machine level stores. While slow4.js always goes to the runtime (switching to C++) for the Object.defineProperty call.
@bobzhang Ok, I figured out what's causing the slow down in slow.js vs fast.js. The access to the objects is properly optimized (in V8), but the allocation of the literal is currently not. So what happens is that we generate fast code for [a,...,z], but need to fall back to the generic allocation path (which is super slow compared to the inline path) for {0:a, ..., n-1:z}. I'm not exactly sure why we do this though, as there doesn't seem to be an obvious reason why we can't support both in the inline path. Maybe it's just because that didn't turn out to be relevant (and somehow in the back of my head I was pretty sure we already fixed this some time ago).
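To make the contrast concrete, here is a rough guess at the two allocation patterns being compared; the actual contents of slow3.js and slow4.js are not shown in the thread, so this is only an illustration:

```javascript
// Illustrative contrast (a guess at the patterns slow3.js and slow4.js
// measure; the real benchmark files are not reproduced in the thread).

// Pattern A: allocate a literal, then store the tag with a plain
// assignment. Engines can compile this to inline allocation plus
// machine-level stores (possibly through a store IC).
function makeTaggedFast(a, b, c) {
  const block = [a, b, c];
  block.tag = 1;
  return block;
}

// Pattern B: attach the tag via Object.defineProperty, which calls into
// the runtime (C++ in V8) on every allocation.
function makeTaggedSlow(a, b, c) {
  return Object.defineProperty([a, b, c], "tag", { value: 1 });
}

const x = makeTaggedFast(1, 2, 3);
const y = makeTaggedSlow(1, 2, 3);
console.log(x.tag, y.tag); // 1 1
```

Both produce an array carrying a `tag` property; only the allocation path differs.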
@bmeurer:
Arrays with holes will waste space and are in general slower than packed arrays, although likely still a bit faster than regular objects with array index like properties.
I was not suggesting that there be actual holes in the arrays that are allocated in JavaScript. I was only suggesting that holes be placed in the index ranges in the compiler's intermediate representation. Those holes are only there to ensure that intermediate representations maintain distinct "meaning" for various offsets. We don't need to know everything about the type at this later stage of the compiler - only its memory layout, and some hole starting index that uniquely classifies which other structures it is compatible with. I would then suggest taking those hole-ridden ranges, and then converting them into plain Objects as follows:
var obj = {
field55: 21,
field56: "bob"
};
This has all the benefits of the third test case that I created called "String Object Keys" above, but without the issue that JIT optimizers may have their hidden classes confused by every structure having its fields located at keys "str0", "str1", "str2".
The actual native ocaml compiler would want to disregard those holes. I'm merely suggesting a way that, via index ranges, everything we needed to know about the distinct type can be conveyed without actually having to track the type through the various intermediate representations.
For every possible engine, including legacy engines deployed to node/browsers, it seems this would be optimal, correct?
@jordwalke Indeed, that's a good suggestion, and I suppose it will be optimizable on all engines.
@bmeurer, thanks for looking. Is there any downside with respect to slow3.js? Suppose V8 does the inline path for `{0: .., n: }` in the future; would access be as fast as array access?
@jordwalke the general policy of patches is that they should be sound -- which means even if a patch is missing somewhere, the output should still be correct (maybe less efficient or uglier). I think we can discuss this more in the future
I do not believe I proposed anything unsound.
It's almost the same performance on access, yes.
@bmeurer cool, it seems slow3.js is the best encoding at this time. I used to think that patching an object with a new property could cause de-optimization, but that is not true in this case, right?
@bobzhang In V8 this won't cause de-opts, but the assignment to x.tag will not be inlined into the function (but use a store IC instead), because the array literal doesn't reserve space for in-object properties, so we need to allocate out-of-object properties backing store for tag first.
@bmeurer @jordwalke so we will go with the slow3 version in the short term. In the future, we can provide a command-line argument to allow different runtime encodings for different engines. Also, any suggestion that can help give hints to VM engines to optimize such code is much appreciated!
I want to address the issues with the object literals in V8. I'll try to reserve some time for that during the year.
@bmeurer cool, thank you in advance!
feel free to re-open it if anyone has better ideas
Is there still an issue with the performance of object literals?
I'm hitting an issue with the current array representation: I'm sending the representation of union types around the network, but given that most serialization formats either ignore or strip extra properties on arrays, there's no guarantee that the representation will be the same after being encoded and decoded again.
@ergl There's apparently some work on "efficient deriving" that should fix this. In the meantime, you could use this for serialization: https://github.com/BuckleTypes/transit-bsc
@glennsl thanks for the link, I ended up implementing a converter for array encoding <-> object literal encoding, as I'm not in control of the transport format right now
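Such a converter might look roughly like this; the function names and the `{tag, values}` wire shape are hypothetical, not taken from the commenter's actual code:

```javascript
// Minimal sketch of an array-encoding <-> object-literal converter
// (names and wire format are hypothetical). A variant block is an array
// whose non-zero tag lives in a `tag` property, which JSON.stringify
// would silently drop.

function toSerializable(block) {
  // Copy the payload into a plain object so the tag survives serialization.
  return { tag: block.tag | 0, values: Array.from(block) };
}

function fromSerializable(obj) {
  const block = obj.values.slice();
  if (obj.tag !== 0) block.tag = obj.tag;
  return block;
}

const v = [1, 2, 3];
v.tag = 1;
// JSON.stringify(v) would yield "[1,2,3]" and lose the tag; the wrapped
// form round-trips intact.
const wire = JSON.parse(JSON.stringify(toSerializable(v)));
const back = fromSerializable(wire);
console.log(back[0], back.tag); // 1 1
```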
we are going to provide something like the following: `type t = ... [@@bs.deriving{json}]`. Currently it is recommended to roll your own
Hope this isn't too off topic. I was just looking at a common use case: mapping over a list. I made a test demo using an implementation like List.map (immutable updates) and Belt.List.map (mutable updates). I tried with both an array and an object representation.
I tested in Chrome and Safari using esbench. For the immutable case, the object wins out slightly in Chrome, and by quite a lot in Safari. For the mutable case, the object wins out by quite a lot in both browsers (both were faster than any immutable updates, too).
My results in Safari were:

| Test | Ops/s |
|---|---|
| mapImmutableArray | 407,790 |
| mapImmutableObject | 439,402 |
| mapMutableArray | 822,669 |
| mapMutableObject | 1,367,456 |
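For context, the two list representations being benchmarked can be reconstructed roughly like this; the actual esbench cases are not shown, so this is only a sketch of the immutable variants:

```javascript
// Rough reconstruction of the two cons-cell representations compared
// above (the real benchmark code is not in the thread).

// Array encoding of a cons cell: [head, tail]; 0 is the empty list.
function mapArray(f, xs) {
  if (xs === 0) return 0;
  return [f(xs[0]), mapArray(f, xs[1])];
}

// Object encoding of a cons cell: {hd, tl}; 0 is the empty list.
function mapObject(f, xs) {
  if (xs === 0) return 0;
  return { hd: f(xs.hd), tl: mapObject(f, xs.tl) };
}

const arrList = [1, [2, [3, 0]]];
const objList = { hd: 1, tl: { hd: 2, tl: { hd: 3, tl: 0 } } };
const doubledArr = mapArray(x => x * 2, arrList);
const doubledObj = mapObject(x => x * 2, objList);
console.log(doubledArr[0], doubledObj.hd); // 2 2
```

The mutable variants would build the result list front-to-back by patching the tail of the last cell in place, which is presumably where Belt.List.map gets its edge.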
Goal:
Some documentation about the current encoding is here: https://github.com/bloomberg/ocamlscript/blob/master/docs/ffi.md. There is a problem with this encoding: `Pfield i` is no longer the same as `Parrayref`, while the internal OCaml compiler thinks it is the same, for example in stdlib/camlinternalOO.ml, bytecomp/translobj.ml, and bytecomp/translclass.ml; there might be some other files I am missing. (Recent changes in the trunk of stdlib/camlinternalOO require us to sync up the change.)

`Obj.tag` is not used much in the compiler itself (except by the GC, which is not relevant in the JS backend). So I am proposing that blocks with tag zero are encoded as arrays, while blocks with a non-zero tag (mostly normal variants) are encoded as an array plus an extra property added via `Object.defineProperty`, so that they are polymorphic and `Pfield` and `Parrayref` behave the same. For example, `A (1,2,3)` will be encoded as `Object.defineProperty([1,2,3], 't', {value: 1})`, and `B(1,2)` will be encoded as `[1,2]`.
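A minimal sketch of the proposed encoding, following the examples above; the helper name `makeBlock` and the assignment of tag 1 to `A` are assumptions for illustration:

```javascript
// Sketch of the proposed encoding: tag-zero blocks are plain arrays;
// non-zero tags are attached via Object.defineProperty so that field
// access (Pfield / Parrayref) is uniformly array indexing.
function makeBlock(tag, fields) {
  if (tag === 0) return fields;
  return Object.defineProperty(fields, "t", { value: tag });
}

const a = makeBlock(1, [1, 2, 3]); // A (1,2,3), assuming A carries tag 1
const b = makeBlock(0, [1, 2]);    // B(1,2), tag 0

// Uniform field access for both encodings:
console.log(a[0], b[1]); // 1 2
// Tag check only where pattern matching needs it (missing `t` reads as 0):
console.log(a.t | 0, b.t | 0); // 1 0
```

Because the `t` property is non-enumerable, it stays out of `for...in` loops and array iteration, while ordinary index reads behave exactly as on an untagged array.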