jashkenas / underscore

JavaScript's utility _ belt
https://underscorejs.org
MIT License

Functional style #2908

Open jgonggrijp opened 3 years ago

jgonggrijp commented 3 years ago

This is a substantial refactoring of the Underscore source code, which I've been working on since shortly after the modularization (#2849). Before merging, I would like this to be reviewed extensively, by multiple people, in multiple ways and in multiple stages. Please see the final third of this post for review details.

Edit: I need help from people who can contribute benchmarks based on real-world applications. Please leave a comment if you think you might be able to help!

Goals

Principles

Approach

I started by asking myself which existing Underscore function should be the single source of truth about collection iteration. Initially, I considered _.reduce. It is a very general function and many other collection functions can be cleanly expressed in terms of it, for example _.map, _.filter and _.min/_.max. I also considered _.each, because it doesn't use an accumulator by default. _.each and _.reduce are equally powerful, in the sense that either can be cleanly expressed in terms of the other.
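To illustrate that equivalence, here is a minimal plain-array sketch (not Underscore's actual implementations) of each function expressed in terms of the other:

```javascript
// Sketch: _.each and _.reduce are interdefinable (plain-array versions).

// each in terms of reduce: thread a dummy accumulator through.
function eachViaReduce(list, iteratee) {
  list.reduce(function (acc, value, index) {
    iteratee(value, index, list);
    return acc;
  }, null);
}

// reduce in terms of each: close over the accumulator instead.
function reduceViaEach(list, iteratee, memo) {
  list.forEach(function (value, index) {
    memo = iteratee(memo, value, index, list);
  });
  return memo;
}
```

For example, `reduceViaEach([1, 2, 3], (m, v) => m + v, 0)` yields `6`, just as `_.reduce` would.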

I soon realized, however, that there is a function that can't be cleanly expressed in terms of _.each or _.reduce: _.find. Like the procedural for-in loop with a break option, _.find may stop iteration early. Otherwise, it basically does the same thing as _.each. All Underscore collection functions can be cleanly expressed using _.find, including not only _.each, _.reduce, _.map and _.min but also _.some, _.every and _.contains. For this reason, I have chosen _.find to be the one function that defines how to iterate a collection.
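A simplified sketch of this idea, using a `findIndex`-style primitive on plain arrays rather than Underscore's real code: once early termination is available, the other collection functions fall out naturally.

```javascript
// The one primitive: scan until the predicate is truthy, then stop.
function findIndex(list, predicate) {
  for (let i = 0; i < list.length; i++) {
    if (predicate(list[i], i, list)) return i;
  }
  return -1;
}

// some: did the scan locate anything?
function some(list, predicate) {
  return findIndex(list, predicate) !== -1;
}

// every: there is no counterexample.
function every(list, predicate) {
  return findIndex(list, (v, i, l) => !predicate(v, i, l)) === -1;
}

// contains: a special case of some.
function contains(list, target) {
  return findIndex(list, v => v === target) !== -1;
}

// each: a predicate that is never truthy forces a full traversal.
function each(list, iteratee) {
  findIndex(list, (v, i, l) => { iteratee(v, i, l); return false; });
  return list;
}
```

Note how `some` and `every` genuinely benefit from the early exit, while `each` simply declines to use it.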

Conveniently, _.find was already implemented by branching over the collection types: it calls _.findIndex on arrays and _.findKey on objects. These latter functions, in turn, are the single sources of truth about iterating arrays and objects, respectively (although the situation is a bit more nuanced with regard to arrays; more on this shortly). In Underscore 2.0, I plan to add _.findEntry for Map/Set and _.findIteration for iterators. By including these in _.find and implementing all other collection functions in terms of _.find, all collection functions would automatically support all five collection types in a consistent way.
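The dispatch described above can be sketched roughly as follows (simplified; Underscore's actual `_.find` also handles array-likes and a context argument):

```javascript
// Branch for objects: return the first key whose value satisfies the predicate.
function findKey(obj, predicate) {
  for (const key of Object.keys(obj)) {
    if (predicate(obj[key], key, obj)) return key;
  }
}

// Branch for arrays: return the first matching index, or -1.
function findIndex(list, predicate) {
  for (let i = 0; i < list.length; i++) {
    if (predicate(list[i], i, list)) return i;
  }
  return -1;
}

// find picks the right key finder per collection type, then looks up the value.
function find(collection, predicate) {
  const keyFinder = Array.isArray(collection) ? findIndex : findKey;
  const key = keyFinder(collection, predicate);
  if (key !== undefined && key !== -1) return collection[key];
}
```

Adding Map/Set and iterator support would then only mean adding branches to `find`, exactly as the plan for `_.findEntry` and `_.findIteration` suggests.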

Besides using _.find in all collection functions, one of the first things I did was to factor out very general linearSearch and binarySearch functions. These are the actual single sources of truth on how to iterate and search arrays; _.findIndex, _.findLastIndex, _.indexOf, _.lastIndexOf and _.sortedIndex are all implemented using linearSearch and/or binarySearch under the hood.
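The PR's exact signatures may differ, but the idea can be sketched like this (hypothetical parameter shapes): one generic linear scan parameterized by start index and step direction covers the forward and backward index finders, and one binary search covers sorted insertion.

```javascript
// Generic linear scan: direction and starting point are parameters.
function linearSearch(array, predicate, start, step) {
  for (let i = start; i >= 0 && i < array.length; i += step) {
    if (predicate(array[i], i, array)) return i;
  }
  return -1;
}

const findIndex = (a, p) => linearSearch(a, p, 0, 1);
const findLastIndex = (a, p) => linearSearch(a, p, a.length - 1, -1);
const indexOf = (a, target) => linearSearch(a, v => v === target, 0, 1);

// Binary search over a sorted array: smallest index at which `value`
// could be inserted while keeping the array sorted (as _.sortedIndex does).
function binarySearch(array, value) {
  let low = 0, high = array.length;
  while (low < high) {
    const mid = (low + high) >>> 1;
    if (array[mid] < value) low = mid + 1; else high = mid;
  }
  return low;
}
```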

linearSearch is so general that I was able to substitute it for nearly all hand-written for/while loops in the source code. This proved very effective at both producing cleaner code and reducing the bundle size. However, I have reverted this in many places, because function call overhead turned out to be costlier than I expected; it seems that even modern JS engines rarely perform inlining, if ever. After discovering this, I adopted the following "safe" rule for choosing between linearSearch and a hand-written loop: if the loop body previously involved no function call, keep the hand-written loop; otherwise, replace it with linearSearch or another Underscore function. This way, the extra function call introduced by using an Underscore function can never slow down the loop by more than a factor of two. I expect the slowdown to be smaller in real-world scenarios, because the factor of two is only reached when the loop body consists purely of the two function calls themselves (i.e., when the called functions do hardly anything). Moreover, loops that already involved a function call often involved multiple function calls.
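A hypothetical before/after pair (not taken from the PR) illustrating why the bound is a factor of two: the rewritten loop makes at most one extra function call per iteration on top of the call its body already made.

```javascript
// Before: hand-written loop whose body already calls `predicate` once
// per iteration, so it qualifies for rewriting under the "safe" rule.
function rejectBefore(array, predicate) {
  const result = [];
  for (let i = 0; i < array.length; i++) {
    if (!predicate(array[i])) result.push(array[i]);
  }
  return result;
}

// A generic helper standing in for an Underscore function.
function each(array, iteratee) {
  for (let i = 0; i < array.length; i++) iteratee(array[i], i, array);
}

// After: same behavior via the helper. Each iteration now makes two
// calls (the wrapper plus `predicate`) instead of one, so the
// per-iteration call overhead can at most double.
function rejectAfter(array, predicate) {
  const result = [];
  each(array, v => { if (!predicate(v)) result.push(v); });
  return result;
}
```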

I wrote an extensive microbenchmark suite to measure performance, in which each operation is repeated at least 3.6 million times and during at least a whole second. Please follow that link for details. All performance claims in the current post are informed by repeated measurements in Node.js 10, mobile Safari 14 and desktop Firefox 84 (on macOS) with those microbenchmarks. However, I believe that a final performance impact assessment should be made by comparing execution time in real-world applications and across a wider range of engines. More on this in the review section at the bottom of this post.
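The measurement discipline can be sketched as follows (a hypothetical harness, not the actual suite linked above): repeat the operation until both an iteration floor and a time floor are met, then report the rate.

```javascript
// Repeat `operation` at least `minIterations` times and for at least
// `minMillis` milliseconds, whichever takes longer, then report the rate.
// Defaults mirror the thresholds described in the text.
function benchmark(operation, minIterations = 3600000, minMillis = 1000) {
  let iterations = 0;
  const start = Date.now();
  let elapsed = 0;
  do {
    operation();
    iterations++;
    elapsed = Date.now() - start;
  } while (iterations < minIterations || elapsed < minMillis);
  return { iterations, elapsed, opsPerSecond: iterations / (elapsed / 1000) };
}
```

Insisting on both floors guards against two failure modes: too few repetitions (noisy timings) and too short a total duration (timer granularity dominating the result).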

With this overall approach in mind, I made several passes over all the source modules, looking for opportunities to make the code cleaner, DRYer and more functional. The code I'm presenting today is the result of this effort and the subject of the first round of reviews. Once I have received and processed the first round of reviews, I plan to make one more pass over the source code, in order to make function parameter names more consistent and to fix the linear order of the bundles. After that, I'll take one smaller, final review before finally merging the branch.

Results (stage 1)

Review

This PR is unusual because it may lead to changes that sacrifice performance in favor of other considerations. For the sake of accountability, I'd like the review to be very thorough and entirely public. This is going to be a lot of work and I don't want all that work to land on the shoulders of only one or two people. Instead, I hope to involve many people, each doing only light work. As a token of appreciation, everyone making a substantial contribution to this review will get a permanent honorable mention in the changelog.

I propose to divide the review work as follows. I'll be inviting specific people for specific parts, but please rest assured that anyone is welcome to participate in any part of the review. I'll keep track of review progress below.

Benchmark preparation

If you have a real-world JavaScript application (or library function) that meets all of the following criteria:

and you are willing to do the following things, with help as needed:

then please let us know by leaving a comment below!

Performance measurements

You can contribute performance measurements by doing the following.

  1. Pick an engine that you can run benchmarks on (such as Firefox) and a benchmark program that can run on this engine.
  2. Run the benchmark on the chosen engine, both with Underscore at commit c9b4b63 (master) and Underscore at commit eaba5b5 (functional), and compare the speed.
  3. If you find a small speed difference, please repeat your measurements as often as needed to determine whether the difference is significant.
  4. If you do find a significant difference, please profile the benchmark with both versions of Underscore to determine what change in Underscore is causing the difference.
  5. Report your findings by leaving a comment below. Please include details such as hardware specifications, exact versions of the operating system and engine used, number of repeated measurements, exact measured speed on each repetition, and exact numbers obtained from profiling.
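For step 3, here is a rough way to judge whether a small difference is more than noise (a crude two-standard-error heuristic, not a substitute for a proper statistical test):

```javascript
// Given repeated timings (in ms) for the two Underscore versions, check
// whether the difference in means exceeds twice the combined standard
// error. If not, more repetitions (or a real significance test) are needed.
function mean(xs) {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

function stderr(xs) {
  const m = mean(xs);
  const variance = xs.reduce((a, b) => a + (b - m) ** 2, 0) / (xs.length - 1);
  return Math.sqrt(variance / xs.length);
}

function looksSignificant(timingsA, timingsB) {
  const diff = Math.abs(mean(timingsA) - mean(timingsB));
  const se = Math.hypot(stderr(timingsA), stderr(timingsB));
  return diff > 2 * se;
}
```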

Final words

I make major contributions to Underscore, like the current one, because I believe it is an excellent library and because I want to help lift it to the next level. If you like what I'm doing, please consider supporting my open source work on Patreon. Shoutout to my existing backers, who encouraged me to do this work, and to @jashkenas, who created this awesome library.

coveralls commented 3 years ago

Coverage increased (+0.04%) to 95.238% when pulling eaba5b58fa8fd788a5be1cf3b66e81f8293f70f9 on jgonggrijp:functional into 745e9b7314064e66a7257f9b361030e6055980b8 on jashkenas:master.

jgonggrijp commented 3 years ago

High-level review by @jashkenas, quoted from private email with his permission:

I gave your PR/plan a quick skim, and am 100% in agreement with your plan and proposal. I really love what you're trying to do here! I agree that large performance considerations need to trump pure elegance, for the sake of pragmatism.

jgonggrijp commented 3 years ago

First benchmark is in, thanks to awesome work by @m90!

jgonggrijp commented 1 year ago

Update: I will be writing an additional real-world benchmark in week 46, which is November 13–17. I hope to finish and merge this branch soon afterwards.