josdejong / mathjs

An extensive math library for JavaScript and Node.js
https://mathjs.org
Apache License 2.0
14.42k stars 1.24k forks source link

Performance issues in `deepMap` #3262

Open josdejong opened 2 months ago

josdejong commented 2 months ago

After extending the benchmark file /test/benchmark/map.js, I was triggered by an outlier abs(array) which is more than 10x faster than the rest:

# develop

abs(genericMatrix)                         x 54,106 ops/sec ±0.59% (96 runs sampled)
abs(array)                                 x 620,327 ops/sec ±0.51% (95 runs sampled) <-- whuuut?!
abs(numberMatrix)                          x 48,865 ops/sec ±1.48% (92 runs sampled)
genericMatrix.map(abs)                     x 21,191 ops/sec ±0.62% (95 runs sampled)
numberMatrix.map(abs)                      x 19,951 ops/sec ±2.38% (94 runs sampled)
map(genericMatrix, abs)                    x 20,971 ops/sec ±0.56% (93 runs sampled)
map(numberMatrix, abs)                     x 20,036 ops/sec ±0.82% (93 runs sampled)
map(array, abs)                            x 21,609 ops/sec ±2.15% (93 runs sampled)
map(array, abs.signatures.number)          x 20,754 ops/sec ±2.60% (90 runs sampled)
genericMatrix.map(abs.signatures.number)   x 50,100 ops/sec ±1.08% (92 runs sampled)
numberMatrix.map(abs.signatures.number)    x 45,769 ops/sec ±0.94% (92 runs sampled)

The relevant code is the internal function deepMap and the method DenseMatrix.map.

I think there is a bug in deepMap when executing on a Matrix: DenseMatrix already iterates over all items itself, so in that case there is no need to call deepMap recursively. This doesn't do much harm but it is more neat to change this to just call array.map(callback) in case of a Matrix, and only recurse in case of an Array.

After a little experiment I saw that the cause of the performance drop in DenseMatrix.map is caused by creating and passing an index array to every callback invocation. I think we can optimize this by checking if the callback actually uses index, and if not, do not provide it. I think we need to introduce an recurseNoIndex which is used when callback.length === 1.

I think we can best look into this after #3256 is merged since that PR refactors the functions applyCallback and recurse that needs futher changes. We also need to see how #3251 works out, since that also improves the .map function.

@dvd101x what do you think?

kgryte commented 2 months ago

Is callback.length a reliable indicator here? If this is only used with callbacks defined internally, then may be possible to get away with such a check, but, if users can provide callbacks, then they can also use arguments to access the index argument and relying on function arity may lead you astray.

josdejong commented 2 months ago

Good point. I think we can either pass a flag internally to tell whether or not indices are needed, or we could check something like callback.toString().includes("arguments") (a conservative check).

dvd101x commented 2 months ago

Hi Jos and Athan,

The implementation of deepMap is a bit different than other implementations, I wouldn't think it affects speed that much, but I will check if changing if (array && (typeof array.map === 'function')) to something else has some benefits.

I will do some tests and report back, maybe it's a simpler fix that could be implemented before #3256 or is something that can be fixed within #3256. It might take a few days.

Maybe unrelated

Reading the comments it seems that the option to skip zeros in deepMap in some cases doesn't work.

https://github.com/josdejong/mathjs/blob/0f87a7b6c3456166cf355d8fc768a8fc9063847e/test/unit-tests/type/matrix/utils/deepMap.test.js#L56-L67

Currently there is some duplicated code with the various recurse functions and deepMap maybe when the performance issue is found, these different methods of iteration could be merged in a common place.

dvd101x commented 2 months ago

Hi Jos and Athan,

You were right on the money!

By doing both; fixing deepMap so that it doesn't later do DenseMatrix.map and adapting the recurse function so that it doesn't pass an index and an array regardless of the callback, some nice improvements were found.

abs(genericMatrix)                         x 107,296 ops/sec ±0.09% (99 runs sampled)
abs(array)                                 x 290,448 ops/sec ±0.46% (100 runs sampled)
abs(numberMatrix)                          x 85,916 ops/sec ±0.06% (100 runs sampled)

As a summary now abs(genericMatrix) is about 4 x faster than what it was, in relation to abs(array).

I made the following branch as a proof of concept. https://github.com/josdejong/mathjs/tree/deepMap-and-recurse-performance

Please review as I think there are cleaner ways to know if the callback is called with 1, 2 or 3 arguments.

Jos, now I agree with you that this should be fixed after #3256 due to the refactors of applyCallback and recurse and the different nature of the change.

Appendix

I did some tests in regular javascript and it seems that for regular functions (not typed) it doesn't affect that much in performance whether the callback is callback(val), callback(val, idx) or callback(val, idx, array).

dvd101x commented 2 months ago

After some tests, I think I found another opportunity to improve on performance besides the previously discussed.

Seeing how fast is abs(array) I think maybe recursion is not the biggest issue, at this point I'm not sure but I would like to work on this issue and make a PR.

Proof of concept

For typed functions it would be nice to save the previously found signature implementation and try to use that first.

name mean std
abs 0.349 0.1307
map typed 6.926 0.1606
map func 1.694 0.0679
newMap typed 0.269 0.0884
neMap func 0.163 0.063
// Generate a matrix with random values between -1 and 1
const matrix = math.random([100, 100], -1, 1)

// Define the tests to be performed
const testCases = [
    { name: 'abs', run: () => math.abs(matrix) },
    { name: 'map typed', run: () => math.map(matrix, math.abs) },
    { name: 'map func', run: () => math.map(matrix, Math.abs) },
    { name: 'newMap typed', run: () => newMap(matrix, math.abs) },
    { name: 'newMap func', run: () => newMap(matrix, Math.abs) }
]

// Execute the tests and store the results
const testResults = testCases.map(testCase => {
    const result = runTest(testCase.name, testCase.run)
    return {
        name: result.name,
        mean: Number(result.mean.toFixed(4)),
        std: Number(result.std.toFixed(4))
    }
})

// Display the results in a table
console.table(testResults)

/**
 * Applies a callback function to each element of a matrix.
 * @param {Array} matrix - The input matrix.
 * @param {Function} callback - The callback function to apply.
 * @returns {Array} - The resulting matrix.
 */
function newMap(matrix, callback) {
    if (math.typed.isTypedFunction(callback)) {
        let implementation = () => { throw new Error('null implementation') }
        return recurseTyped(matrix)
        function recurseTyped(array) {
            if (Array.isArray(array)) return array.map(element => recurseTyped(element))
            try {
                return implementation(array)
            } catch (error) {
                implementation = math.typed.resolve(callback, [array]).implementation
                return implementation(array)
            }
        }
    }
    else {
        return recurse(matrix)
        function recurse(array) {
            if (Array.isArray(array)) return array.map(element => recurse(element))
            return callback(array)
        }
    }
}

/**
 * Runs a performance test for a given function.
 * @param {string} testName - The name of the test.
 * @param {Function} testFunction - The function to test.
 * @returns {Object} - The test results.
 */
function runTest(testName, testFunction) {
    const totalTrials = 100
    const warmupTrials = 5
    let executionTimes = []

    for (let i = 0; i < totalTrials + warmupTrials; i++) {
        const startTime = performance.now()
        testFunction()
        const endTime = performance.now()
        executionTimes.push(endTime - startTime)
    }

    // Exclude warmup times
    const validTimes = executionTimes.slice(warmupTrials)
    return {
        name: testName,
        mean: math.mean(validTimes),
        std: math.std(validTimes)
    }
}
dvd101x commented 2 months ago

Hi,

I found some progress regarding speed.

name Master [ops/sec] Fix DeepMap [ops/sec]
abs(genericMatrix) 33,514 110,937
abs(array) 292,829 297,497
abs(numberMatrix) 31,285 88,499
genericMatrix.map(abs) 11,986 56,688
numberMatrix.map(abs) 11,601 50,054
map(genericMatrix, abs) 11,864 56,409
map(numberMatrix, abs) 11,844 50,208
map(array, abs) 12,278 85,195
map(array, abs.signatures.number) 13,237 85,204
genericMatrix.map(abs.signatures.number) 32,086 105,399
numberMatrix.map(abs.signatures.number) 25,089 85,757

I will try to make a PR with some improvements on the deepMap and other recursion functions in the following week.

I have this running benchmark test that I'm using as a proof of concept. Works best on chromium due to performance.now() with sub-millisecond resolution. https://dvd101x.github.io/mathjs-fast-map/

dvd101x commented 2 months ago

Hi, the main idea is mostly done at. https://github.com/josdejong/mathjs/tree/deepMap-2 https://github.com/josdejong/mathjs/tree/deepMap-and-recurse-performance

As a summary you were right from the beginning since using the right amount of arguments specially when no index is needed, speeds up not only deepMap with matrix, but everything.

When a second argument index is included, a faster method was found. This could be applied in a separate PR.

The current develop branch, includes only one recurse algorithm that resembles map, if a separate methods are applied for recursion like map and forEach is made then also there is a speed improvement in forEach and deepForEach. This also could be applied in a separate PR.

Theoretically map and forEach could be faster than the current abs(array) according to https://github.com/dvd101x/mathjs-fast-map

callback value callback valueIndex

I would like to know your opinions before making a PR.

dvd101x commented 2 months ago

Sorry for the frequent comments, I had some spare time and found the issue very intriguing.

I made PR #3266

I will chill for a bit.

josdejong commented 2 months ago

Awesome 😎

dvd101x commented 1 month ago

Thanks to #3251 this issue is reduced significantly. As you previously mentioned, now it's a matter of implementing it not only in matrices but also arrays and collections.

Even though the attempts to eliminate bottlenecks in recursion functions gave some nice theoretical results, by implementing them they didn't pan out as well as #3251.