interlockjs / interlock

JS bundler - inspired by Git, built on Babel.
MIT License

Resolve performance issues with multiprocess plugin #71

Closed divmain closed 8 years ago

divmain commented 8 years ago

Related: https://github.com/nodejs/node/issues/3145

Did some further digging to see where the bottlenecks were in our use case. Grabbed the AST for lodash.js (roughly 16MB as JSON) and measured the full round-trip of passing it over IPC to a child process and back. Also measured (de)serialization times using both JSON and BSON.

The results:

Stringify took 2.004831594 seconds.
Parse took 0.681263825 seconds.
BSON serialize took 0.976388723 seconds.
BSON unserialize took 0.814068171 seconds.

Starting IPC...
Child received message.
IPC round-trip took 51.384239773 seconds.

divmain commented 8 years ago

Further measurements are warranted, especially of Babel's parse and generate, to determine whether the gains from parallelizing the tasks are worth the cost of (de)serialization. In particular, since the (de)serialization CPU burden is shared equally between the parent and child processes, we're looking at ~1s blocking operations that will interfere (to an unknown extent) with the parent's delegation of tasks to the worker pool.
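For anyone wanting to reproduce the JSON half of the (de)serialization timings, here is a minimal sketch. It is not the script used in this thread: `timeIt` and `makeFakeAst` are hypothetical helpers, and the generated object is only a small stand-in for a real Babel AST, so the absolute numbers will differ from the lodash figures.

```javascript
'use strict';

// Time a synchronous function; return its result and the elapsed seconds.
function timeIt(fn) {
  const start = process.hrtime.bigint();
  const result = fn();
  const seconds = Number(process.hrtime.bigint() - start) / 1e9;
  return { result, seconds };
}

// Hypothetical stand-in for a large AST: many small nested node objects.
function makeFakeAst(nodeCount) {
  const body = [];
  for (let i = 0; i < nodeCount; i++) {
    body.push({
      type: 'ExpressionStatement',
      start: i,
      end: i + 1,
      expression: { type: 'Identifier', name: 'n' + i }
    });
  }
  return { type: 'Program', body };
}

const ast = makeFakeAst(100000);

// Both calls block the event loop for their full duration.
const stringify = timeIt(() => JSON.stringify(ast));
const parse = timeIt(() => JSON.parse(stringify.result));

console.log(`Stringify took ${stringify.seconds} seconds.`);
console.log(`Parse took ${parse.seconds} seconds.`);
```

Because both calls are synchronous, whatever time is reported here is time during which the calling process cannot hand out or pick up other work.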

tptee commented 8 years ago

Would something like https://github.com/dominictarr/JSONStream help? At least this way, parsing won't block and you can spread out the streaming over the parent's event loop.

andrasq commented 8 years ago

Sending large IPC messages seems to be O(n^2) in the message length (see my comment in the related issue #3145). I've seen O(n^2) delays slip in during message reassembly, so that may be worth a check.
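To illustrate how a quadratic cost can creep into reassembly in general, here is a small sketch. It makes no claim about what Node's IPC channel actually does internally; `reassembleNaive` and `reassembleLinear` are hypothetical names, and the chunk size and count are arbitrary.

```javascript
'use strict';

// Naive reassembly: each incoming chunk triggers a copy of everything
// received so far, so the total bytes copied grows quadratically.
function reassembleNaive(chunks) {
  let message = Buffer.alloc(0);
  let bytesCopied = 0;
  for (const chunk of chunks) {
    message = Buffer.concat([message, chunk]);
    bytesCopied += message.length;
  }
  return { message, bytesCopied };
}

// Linear reassembly: keep references to the chunks and copy once at the end.
function reassembleLinear(chunks) {
  const parts = [];
  let total = 0;
  for (const chunk of chunks) {
    parts.push(chunk);
    total += chunk.length;
  }
  return { message: Buffer.concat(parts, total), bytesCopied: total };
}

// 1000 chunks of 1 KiB each, roughly 1 MB in total.
const chunks = Array.from({ length: 1000 }, () => Buffer.alloc(1024, 0x61));
const naive = reassembleNaive(chunks);
const linear = reassembleLinear(chunks);

console.log(`naive copied ${naive.bytesCopied} bytes`);
console.log(`linear copied ${linear.bytesCopied} bytes`);
```

Both functions produce the same message; only the amount of copying differs, and the gap widens as the message is split into more chunks.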

divmain commented 8 years ago

Thanks @andrasq. I noticed the same thing. For now, I created a workaround using Unix sockets and BSON (de)serialization. Performance appears to scale linearly with the message size, and is likely good enough for our use case.
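A rough sketch of the kind of message framing such a socket-based workaround needs, since a raw socket delivers arbitrary chunks rather than whole messages. This is not the code that produced the timings in this comment. It assumes a 4-byte length prefix, uses JSON as a stand-in for BSON to stay dependency-free, and `encodeFrame` and `FrameDecoder` are hypothetical names.

```javascript
'use strict';

// Encode one message as a frame: 4-byte big-endian length, then the payload.
// JSON is a stand-in here; the workaround in this thread used BSON.
function encodeFrame(value) {
  const payload = Buffer.from(JSON.stringify(value), 'utf8');
  const header = Buffer.alloc(4);
  header.writeUInt32BE(payload.length, 0);
  return Buffer.concat([header, payload]);
}

// Incremental decoder: feed it socket chunks, get back complete messages.
class FrameDecoder {
  constructor() {
    this.parts = [];      // chunks received but not yet consumed
    this.buffered = 0;    // total bytes held in this.parts
    this.expected = null; // payload length of the current frame, if known
  }

  // Join pending chunks into one buffer (no copy if there is only one).
  _merge() {
    const data = this.parts.length === 1
      ? this.parts[0]
      : Buffer.concat(this.parts, this.buffered);
    this.parts = [data];
    return data;
  }

  push(chunk) {
    this.parts.push(chunk);
    this.buffered += chunk.length;
    const messages = [];
    for (;;) {
      if (this.expected === null) {
        if (this.buffered < 4) break; // header incomplete
        const head = this._merge();
        this.expected = head.readUInt32BE(0);
        this.parts = [head.subarray(4)];
        this.buffered -= 4;
      }
      if (this.buffered < this.expected) break; // payload incomplete
      // Merge only once the whole payload has arrived, so copying stays linear.
      const data = this._merge();
      messages.push(JSON.parse(data.toString('utf8', 0, this.expected)));
      this.parts = [data.subarray(this.expected)];
      this.buffered -= this.expected;
      this.expected = null;
    }
    return messages;
  }
}

// Simulate a socket that delivers two frames in arbitrary 5-byte chunks.
const wire = Buffer.concat([
  encodeFrame({ type: 'Program', body: [] }),
  encodeFrame({ ok: true })
]);
const decoder = new FrameDecoder();
const received = [];
for (let i = 0; i < wire.length; i += 5) {
  received.push(...decoder.push(wire.subarray(i, i + 5)));
}
console.log(received);
```

In a real setup, `decoder.push` would be called from the `'data'` handler of a socket created with Node's `net` module (for example a server listening on a Unix socket path and a child connecting to the same path), and `encodeFrame` output would be passed to `socket.write`.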

Stringify took 1.650871098 seconds.
Parse took 0.629973014 seconds.
BSON serialize took 0.962255679 seconds.
BSON unserialize took 0.775885246 seconds.

====== Starting Native IPC... ======
Child received message.
IPC took 50.578393824 seconds.

====== Starting BSON IPC... ======
Server bound on /tmp/interlock-ipc.sock...
Server has received connection from child process.
Child is connected to server.

Child received message.
BSON-IPC took 3.852710589 seconds.

divmain commented 8 years ago

Looks like the performance gains from farming out work to multiple processes are mostly cancelled out by the serialization overhead. Going to close this for now, and pursue native extensions for performance improvements instead.