philippe56 closed this issue 7 years ago
It seems that the attached file is missing.
In this program, the problem comes from the use of collect(), which is not scalable: like parallelize(), it is meant for small amounts of data and for test purposes. Here, dozens of megabytes of data are going in and out at once through the RPC protocol instead of through a data transfer protocol. An efficient data transfer protocol does exist in skale, but only for input streams (see the lineStream/objectStream sources) and for shuffle operations. It is not yet available for output from skale to the external world. We're working on it.
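To illustrate the pattern, here is a minimal sketch (the dataset and its size are illustrative assumptions, not taken from the attached file; it assumes the sc.range() source and the promise-based actions shown in the skale README). Every row returned by collect() is serialized back to the master through the RPC channel in one shot, so the whole result must fit in a single process heap no matter how many workers computed it:

var sc = require('skale-engine').context();

sc.range(10000000)                     // ~10 M rows: tens of MB once serialized
  .map(function (i) { return [i, i * i]; })
  .collect()                           // entire result crosses the RPC channel at once
  .then(function (res) {
    console.log(res.length);
    sc.end();
  });

Until streaming output is available, a workaround is to aggregate on the workers first (for example with reduce() or count()) so that only a small result is collected back.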
Now, by default the Node.js runtime limits itself to a heap size of roughly 1 GB, which explains why the program fails even when more RAM is available in the system. You can raise the limit with:
$ node --max_old_space_size=8192 <program> <args> ...
Here is a copy of the screen output:
$ skale run
<--- Last few GCs --->
83075 ms: Scavenge 1407.1 (1444.7) -> 1406.1 (1444.7) MB, 0.4 / 0 ms (+ 1.0 ms in 1 steps since last GC) [allocation failure] [incremental marking delaying mark-sweep].
83084 ms: Mark-sweep 1406.1 (1444.7) -> 1402.7 (1441.9) MB, 8.1 / 0 ms (+ 1.0 ms in 1 steps since start of marking, biggest step 1.0 ms) [last resort gc].
83089 ms: Mark-sweep 1402.7 (1441.9) -> 1402.5 (1441.9) MB, 5.9 / 0 ms [last resort gc].
<--- JS stacktrace --->
==== JS stack trace =========================================
Security context: 0x1a8657eb4629
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory
Although this program works correctly with 1 worker, it is inherently not designed to scale as soon as the work is dispatched to 2 or more workers, due to the iterative use of cartesian() (24 stages of cartesian applied to the previous result). Too much of a corner case. An interesting pathological example, but I consider it outside the scope of skale-engine, so I'm closing it.
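For the record, here is a rough sketch of why that shape explodes (the input sizes are assumptions for illustration, not the attached program): cartesian() yields the cross product, so the row count multiplies at every stage, and 24 chained stages over even a 2-element dataset produce 2^25 rows:

var sc = require('skale-engine').context();

var ds = sc.parallelize([0, 1]);
for (var i = 0; i < 24; i++)
  ds = ds.cartesian(sc.parallelize([0, 1]));   // row count doubles each stage: 2, 4, ..., 2^25

ds.count().then(function (n) {                 // count() avoids collecting the rows themselves
  console.log(n);                              // 33554432
  sc.end();
});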
The attached file fails with 'process out of memory' after 1.5 GB have been allocated.