Closed by yurivict 3 years ago
Are you sure that the time isn't being taken up in the string concatenation loop? That has to create tons of temporary strings.
> Are you sure that the time isn't being taken up in the string concatenation loop?
Yes, it prints the string size immediately.
Ah, I didn't notice that you are doubling the size of the array on each loop iteration - I thought you were just adding one entry per iteration. (Now that 115s makes more sense!)
FWIW, I seem to remember reading somewhere that Google v8's array class does not support more than something like 100k entries (it might have even been 30k - it was nearly 10 years ago in the v8 Google Group). Creating an array with a quarter of a million(?) entries (N=18) seems somewhat excessive/unrealistic.
It is realistic, some computations produce this much data for just one reasonably sized image.
I dare say that JS is not an appropriate language for such calculations. But to each his own.
> I dare say that JS is not an appropriate language for such calculations. But to each his own.
There are corner cases; you can't switch to C++ just because you hit one huge array once a month. Besides, it's obviously a bug in MuJS, so it shouldn't even be a limitation in this case.
This also means that similar operations with smaller arrays are slower than they should be; it's just less visible because the arrays are smaller.
To be pedantic: this is a performance limitation, not a bug - the software performs the expected operation with the expected result, it just does so more slowly than you would like. That's not to say it can't be improved, but characterizing it as a bug does its developer a disservice.
You can call it anything, but when the exponent p in Time ~ N^p is higher than normally expected (2 instead of 1 in this case), some jobs will practically never finish when they could otherwise finish in a reasonable time.
I'm afraid that parsing a 2 MB script consisting of a single array literal is beyond the scope of mujs's normal use case. Even C compilers struggle with that kind of code. With the current (16-bit bytecode) instruction scheme, even at N=18 the 512 KB eval script runs out of instruction space. It would possibly work much faster and have fewer problems if you rewrote the script to create the array directly instead of creating a huge eval script with the array literal.
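A sketch of that suggestion, with placeholder values standing in for the real data:

```javascript
// Build the array directly instead of generating a megabyte-scale script
// and eval-ing an array literal; this avoids the parser/codegen path entirely.
var n = 1 << 18;          // 262144 entries, as at N = 18
var a = [];
for (var i = 0; i < n; i++)
    a[i] = i & 0xff;      // placeholder value; real data would come from the computation
console.log(a.length);    // 262144
```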
> rewrote the script to create the array directly instead of creating a huge eval script
My guess is that it's a test case to demonstrate the issue, while his real issue is with external code generating big arrays of data - which he can't generate from his js code.
In that case, though, maybe the solution is to read the data (from stdin or a file) in the script and parse/process it to append values to a JS array. I'm not sure how well the mujs binary is equipped for such a task, though, so you might need to edit it or write your own libmujs wrapper that allows it (e.g. expose the C fgets or some such to the JS script).
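A sketch of that idea; `readline` is stubbed in pure JS here for illustration, but in a real build it would be a thin C wrapper over `fgets` registered with mujs's `js_newcfunction`/`js_setglobal`:

```javascript
// Stub standing in for a host-provided readline(): returns one line per
// call and null at EOF. A real host would implement this in C.
var lines = ["3.14", "2.71", "1.41"], pos = 0;
function readline() { return pos < lines.length ? lines[pos++] : null; }

// The script parses the data into a JS array itself - no giant literal,
// no eval, no parser involvement for the data at all.
var a = [], s;
while ((s = readline()) !== null)
    a.push(Number(s));
console.log(a.length);   // 3
```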
The main problem is non-linearity. The data grows linearly, but the parser probably uses a double loop somewhere.
The point isn't the huge array with 2 MB of data. The problem is that the parser wastes CPU cycles even on smaller arrays, just not as noticeably. This array merely demonstrates the problem acutely.
The non-linearity only happens with arrays larger than 32768 elements -- because of how the array initialization bytecode works. Each array entry generates three instructions (or more depending on the value) -- push_integer, push_value, set_property. The push_integer opcodes need to use a table for literal numbers that are larger than will fit in a signed 16-bit integer, and the lookup in this table is what's causing non-linearity.
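The cost of that linear lookup can be modeled in a few lines (a toy model, not mujs's actual code): interning each large literal via a linear scan of the constant table makes K distinct literals cost on the order of K^2 comparisons.

```javascript
var comparisons = 0;

// Toy model of a linearly scanned constant table: return the index of v,
// scanning all previously interned entries first.
function internLinear(table, v) {
    for (var i = 0; i < table.length; i++) {
        comparisons++;                 // O(table.length) work per literal
        if (table[i] === v) return i;
    }
    table.push(v);
    return table.length - 1;
}

var table = [];
var K = 1000;
for (var v = 0; v < K; v++)
    internLinear(table, 100000 + v);   // K distinct "large" literals
console.log(comparisons);              // K*(K-1)/2 = 499500, i.e. quadratic in K
```

Replacing the scan with a hash lookup (or, as the fix below does, dropping the table entirely) makes each literal O(1) and the whole initialization linear.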
I'll look into using another scheme for generating array initialization code that doesn't need to rely on the table of number literals.
Fixed with 06a6f9fb110fe02eb5c306a9c6c143a6cd8bd3bb and future proofed (removed the linearly scanned constant tables altogether) with 33ffe6efebf477ee1f73c27670af460ffd6cc08d.
This program finishes in 23s with N=18, in 115s with N=19, and in >500s with N=20, despite the array size only doubling with each increment of N. It seems reasonable to expect the time to grow linearly, because adding elements to the array's tail while scanning the string is all that is required here; there should be no need to perform any operations that depend on the array length at each step.
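From those timings the scaling exponent can be estimated directly (a back-of-the-envelope check: the input size doubles between N=18 and N=19 while the time goes from 23s to 115s):

```javascript
// Estimate p in time ~ size^p from two measurements where size doubles.
var t18 = 23, t19 = 115;                 // seconds, from the report
var p = Math.log(t19 / t18) / Math.log(2);
console.log(p.toFixed(2));               // ≈ 2.32: clearly superlinear
```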