you-dont-need / You-Dont-Need-Loops

Avoid The One-off Problem, Infinite Loops, Statefulness and Hidden Intent.

Updated unfold and range methods for significant performance improvement (see notes) #7

Open richardeschloss opened 4 years ago

richardeschloss commented 4 years ago

For the Corecursion section:


Performance test results on Chromium 73 (Linux):

const unfoldOld = (f, seed) => {
  const go = (f, seed, acc) => {
    const res = f(seed);
    return res ? go(f, res[1], acc.concat([res[0]])) : acc; 
  }
  return go(f, seed, [])
}

const unfold = (f, seed) => {
  const next = (f, val, acc) => {
    if (!f(val)) return acc
    const [currVal, nextVal] = f(val);
    acc.push(currVal)
    return next(f, nextVal, acc); 
  }
  return next(f, seed, [])
}

const rangeOld = (start, end) =>
  unfoldOld(val => (val <= end) 
    ? [val, val + 1]
    : null
, start);

const range = (start, end) =>
  unfold(val => (val <= end) 
    ? [val, val + 1]
    : null
, start);

console.time('rangeOld')
const range_test1 = rangeOld(0, 5000)
console.timeEnd('rangeOld')

console.time('rangeNew')
const range_test2 = range(0, 5000)
console.timeEnd('rangeNew')
// Results:
rangeOld: 500.856201171875ms
rangeNew: 28.38525390625ms
stevemao commented 4 years ago

You're sacrificing Correctness by construction and Ergonomics and maintainability for Runtime performance. I think the list showcases what you could do to improve the quality of code instead of showing the real implementation. In real-world projects, I'd still write functional-style code for business logic but put performance-critical code in a separate module (with detailed comments on why imperative code is used there). Similar to how we did with the workaround reduce.

This is definitely not perfect. If you're interested in a better solution, you can check out how Haskell does it.
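For reference, Haskell's unfoldr is lazy: it produces elements only as the consumer demands them, so no intermediate accumulator is built at all. The closest JavaScript analogue is probably a generator; a rough sketch (unfoldLazy is a hypothetical name, not code from the repo):

function* unfoldLazy(f, seed) {
  let res = f(seed)
  while (res) {      // imperative inside, but callers never see the loop
    yield res[0]
    res = f(res[1])
  }
}

// Consumers pull values on demand, so even an unbounded sequence is fine:
const naturals = unfoldLazy(val => [val, val + 1], 0)
console.log(naturals.next().value) // 0
console.log(naturals.next().value) // 1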

richardeschloss commented 4 years ago

Oh ok


stevemao commented 4 years ago

I think a better solution might be to write a Babel plugin. That way we can write maintainable code but generate performant code too. But you might want to just use Elm or PureScript then...
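To make that concrete: for this unfold, a tail-call-optimizing transform would essentially turn the recursive go into parameter rebinding inside a loop. A hand-written sketch of the kind of output such a plugin might generate (unfoldTco is a hypothetical name):

const unfoldTco = (f, seed) => {
  const acc = []
  let res = f(seed)
  while (res) {      // each loop iteration replaces one recursive call
    acc.push(res[0])
    res = f(res[1])  // "recurse" by rebinding the seed to res[1]
  }
  return acc
}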

richardeschloss commented 4 years ago

Perhaps. It's an interesting thought.


richardeschloss commented 4 years ago

Updated my PR #7 taking into account @auterium's analysis. The original code passed the values in by reference, and I didn't take into account the performance impact my array destructuring would have.

JSFiddle shows the test results and improvement.

auterium commented 4 years ago

The key element to understand is why it's more performant, not just the fact that it is more performant.

The key performance killer in the old implementation is the use of Array::concat(), as it creates a new array and copies every element of both source arrays into it. This is fine when you need to merge multiple arrays into a single one, but a terrible choice when merging an array with one that has only a single element, since the whole array must be copied just to add one more element. This was very well documented in this article.
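In big-O terms, building an n-element range with concat performs 1 + 2 + ... + n element copies, O(n^2) overall, while push appends each element in amortized constant time. A minimal illustration of concat's copy-on-call behavior:

const acc = [1, 2, 3]
const merged = acc.concat([4])
console.log(merged) // [1, 2, 3, 4] (a brand-new array)
console.log(acc)    // [1, 2, 3] (untouched; every element was copied into merged)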

Here's a comparison of the original code (rangeOld), @richardeschloss's proposal (rangeNew), and my hacky (let's be honest, acc.push(res[0]) && acc is an abuse of the language's features) approach (rangeAlt):

const unfoldOriginal = (f, seed) => {
  const go = (f, seed, acc) => {
    const res = f(seed);
    return res ? go(f, res[1], acc.concat([res[0]])) : acc; 
  }
  return go(f, seed, [])
}

const unfold = (f, seed) => {
  const next = (f, val, acc) => {
    if (!f(val)) return acc
    const [currVal, nextVal] = f(val);
    acc.push(currVal)
    return next(f, nextVal, acc); 
  }
  return next(f, seed, [])
}

const unfoldAlt = (f, seed) => {
  const go = (f, seed, acc) => {
    const res = f(seed);
    return res ? go(f, res[1], acc.push(res[0]) && acc) : acc; 
  }
  return go(f, seed, [])
}

const rangeOld = (start, end) =>
  unfoldOriginal(val => (val <= end) 
    ? [val, val + 1]
    : null
, start);

const range = (start, end) =>
  unfold(val => (val <= end) 
    ? [val, val + 1]
    : null
, start);

const rangeAlt = (start, end) =>
  unfoldAlt(val => (val <= end) 
    ? [val, val + 1]
    : null
, start);

console.time('rangeOld')
const range_test1 = rangeOld(0, 5000)
console.timeEnd('rangeOld')

console.time('rangeNew')
const range_test2 = range(0, 5000)
console.timeEnd('rangeNew')

console.time('rangeAlt')
const range_test3 = rangeAlt(0, 5000)
console.timeEnd('rangeAlt')

The results on my machine:

rangeOld: 114.69189453125ms
rangeNew: 5.88720703125ms
rangeAlt: 1.1201171875ms

Now, there are other ways to achieve better results without explicitly using loops (in some runs even outperforming them):

const rangeFillMap = (start, end) =>
  Array(end + 1 - start)
    .fill()                   // fill the holes so .map() visits every slot (map skips holes in sparse arrays)
    .map((v, i) => start + i)

const rangeLoop = (start, end) => {
  const acc = []
  for (let i = start; i <= end; i += 1) {
    acc.push(i)
  }
  return acc
}

console.time('rangeFillMap')
const range_test4 = rangeFillMap(0, 5000)
console.timeEnd('rangeFillMap')

console.time('rangeLoop')
const range_test5 = rangeLoop(0, 5000)
console.timeEnd('rangeLoop')

I had varying results, but in most cases rangeFillMap was faster than rangeLoop. I'm not sure why, but I think Array(size) pre-allocates the backing storage, which is much faster than growing it dynamically with Array::push(). Some results on my machine:

rangeFillMap: 0.222900390625ms
rangeLoop: 0.418701171875ms
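A closely related variant (not benchmarked in this thread) skips the .fill() step by using Array.from, whose mapping callback visits the holes directly; rangeFrom is a hypothetical name:

const rangeFrom = (start, end) =>
  Array.from({ length: end + 1 - start }, (_, i) => start + i)

console.log(rangeFrom(0, 4)) // [0, 1, 2, 3, 4]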
richardeschloss commented 4 years ago

How about writing the unfold method as you have unfoldAlt, but slightly changing the ternary operation to:

const unfold = (f, seed) => {
  const go = (f, val, acc) => {
    const res = f(val)
    return (res) 
      ? (acc.push(res[0]), go(f, res[1], acc)) // This may be easier for newcomers to understand, no hack
      : acc
  }
  return go(f, seed, [])
}
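The trick here is the comma operator: it evaluates its operands left to right and yields the rightmost one, so the truthy branch pushes first and then returns the recursive call. In isolation:

const x = (console.log('evaluated first'), 42)
console.log(x) // 42 (the value of a comma expression is its last operand)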

I got varied results with performance, but ~75% of my test runs are faster with this small change. Attached chart shows 20 test runs and the results. [chart: ternary_comparison]

stevemao commented 4 years ago

I think we could add a loop version as performant reference, similar to reduce

richardeschloss commented 4 years ago

I like that idea, and I think the Array fill-map idea (rangeFillMap) is pretty neat; perhaps it belongs in its own section.

auterium commented 4 years ago

That's an interesting change on the ternary; I've never done it that way (didn't know it was a possibility).

How about using https://jsperf.com/ to run the tests? This would be a more reliable, unbiased way to compare performance than doing it on our own machines.

richardeschloss commented 4 years ago

Ok, here is the jsPerf comparing RangeLoop vs. RangeFillMap. It appears that RangeLoop is still faster, but I like the readability of RangeFillMap more. (I think that still belongs in a separate PR.) I've been trying seriously hard for the past hour to make RangeFillMap faster, and I'm struggling to (and I wanted it to be faster more than anyone :)).

richardeschloss commented 4 years ago

And the jsPerf for the ternary-ops snippet.

auterium commented 4 years ago

I've updated the main snippet to include all the possibilities discussed here. The fill-map is the fastest option without a for-loop, but still ~60% slower than using for.

Personally, I try to balance array iterators and for-loops depending on the use case, not arbitrarily demonize one or the other out of personal preference or current hype. FP, like any form of programming, has its time and place, which depends on the requirements and the language. I'm a full-time backend dev shifting towards systems programming, and my day job requires the utmost performance possible with big arrays, so for-loops and custom iterators are much more common than regular Array iterators, although the latter are still used in scenarios where we know the size will never exceed 10 or 15 elements, as they do improve readability.

I'd say the title of the repo should include "(may not)" and the "Loops are bullshit" line should be removed. Honestly, that statement shows off cockiness instead of objectivity. This repo provides a good case for using alternatives to loops, but it intentionally leaves out when it's a good idea to use them.

stevemao commented 4 years ago

Hi @auterium, thanks so much for the feedback. I understand where you're coming from. However, this article takes a very different philosophy, and as I have stated before, we can't sacrifice Correctness by construction and Ergonomics and maintainability for Runtime performance. All three are important, and languages like Haskell solve most of these problems (hence I listed knowing Haskell as a prerequisite; in fact, it improves performance in many cases). I'm not gonna go into too much detail on the technical side, as there are already many writeups online that explain it far better than I could. I know it's hard to solve these problems with JavaScript, but at least we're trying to point out what's wrong with it.

auterium commented 4 years ago

Hi @stevemao, I'm happy to help grow consciousness of how things work and why things are done the way they are done. I understand and mostly agree with your points about not sacrificing Correctness by construction and Ergonomics and maintainability for Runtime performance, but where I disagree is in the way it's being presented. It can't be a hard rule that you must always sacrifice performance in favor of ergonomics, but rather that you'll normally do so.

Haskell is designed from the ground up to be highly performant at FP, while JS is not, so porting Haskell approaches to JS is not straightforward. You can't just copy-paste-then-translate Haskell code to JS. Even if it works, a lot of things work differently internally, and that's why you see such performance penalties.

My point is that this article should invite readers to understand when it's OK to sacrifice performance (with the minimum penalty possible; this should be 95%+ of the cases) and when it's OK to sacrifice ergonomics. Imperatively stating that "loops are the root of all evil; every time you use them a kitty dies" alienates people who have even a slight shade of doubt about your statement (you're already putting them in a defensive, closed position), while being more persuasive and appealing to their sense of curiosity will drive them to read more in depth and understand the reasoning behind the statement.

As you saw in the benchmarks, a very tiny change to your code yielded a 100-200x performance increase, and neither ergonomics, correctness, nor maintainability was affected. And even if you don't like (acc.push(res[0]), go(f, res[1], acc)) or acc.push(res[0]) && acc, you can always add a polyfill:

Array.prototype.pushAndReturn = function pushAndReturn(element) {
    this.push(element)
    return this
}

And then use it instead of Array::concat():

const unfoldOriginal = (f, seed) => {
  const go = (f, seed, acc) => {
    const res = f(seed);
    return res ? go(f, res[1], acc.pushAndReturn(res[0])) : acc; 
  }
  return go(f, seed, [])
}
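One caveat with extending Array.prototype directly: a method assigned this way is enumerable, so it will show up in for...in loops over any array. Defining it via Object.defineProperty (where enumerable defaults to false) avoids that; a sketch with the same behavior:

Object.defineProperty(Array.prototype, 'pushAndReturn', {
  // enumerable defaults to false, so for...in won't pick this up
  value: function pushAndReturn(element) {
    this.push(element)
    return this
  },
  writable: true,
  configurable: true,
})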
stevemao commented 4 years ago

That sounds good @auterium. Really appreciate the input :) PRs are always welcome. I think as long as we comment on why such a sacrifice is made, that should be good. I've also heard of people being confused about this, and about classes being used, so it would be good to add more comments.

richardeschloss commented 4 years ago

I just added two new tests here (#7) and came across some interesting results that surprised me. Extending the idea of using an Array polyfill, it seems there doesn't really need to be a sacrifice of anything. The best of both worlds, maintainability and performance, can be achieved with the following code snippet:

polyfills/array.js -- might be where such code would be stored

// Basic idea is: stuff the for loop in a range function and test it...and once it's been tested, 
// the one-off errors that are associated with loops become less of a concern...just re-use the 
// range function defined here: (no need to define an additional unfold method; no need to 
// keep track of other dependencies)
Array.prototype.range = function(cb) {
  // May want to warn users of the monkey patching:
  // console.warn('Monkey patching Array with "range" method in polyfills/array.js')
  const out = []
  if (this.length < 2 || this[0] > this[1]) return out;
  for (let i = this[0]; i <= this[1]; i++) { // inclusive of end, matching range(start, end) above
    out.push(cb ? cb(i) : i)
  }
  return out
}

// Example usage:
const arr = [0, 5000]
arr.range() // returns an array from 0 to 5000 (inclusive)
arr.range(i => i * 2) // if a callback is provided, each value is mapped through it

In my ideal world, the range function would look similar to the way I'd write a range in mathematics:

[ start, end ].range( handleItem )

And, if the polyfill becomes popular enough, it may just end up as a JS built-in someday.

The results: RangePolyFill is actually the fastest! Faster than the for loop on its own. As expected, RangePolyFillCb is a bit slower, but what I find interesting is that it doesn't encounter stack overflow, even when I extend the range beyond 1 million. I imagine the performance would improve if it were a built-in; my guess is it would be comparable to RangeFillMap.
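On the stack-overflow point: the polyfill iterates rather than recurses, so only the output array grows, while the recursive unfolds earlier in the thread add one call-stack frame per element and blow up once the engine's stack limit is hit. A quick check, assuming the range and Array.prototype.range definitions from this thread are in scope:

try {
  range(0, 1e6) // recursive unfold-based range from earlier in the thread
} catch (e) {
  console.log(e.message) // e.g. "Maximum call stack size exceeded" in V8
}

console.log([0, 1e6].range().length) // 1000001 (iterative, so no stack growth)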