kentcdodds / match-sorter

Simple, expected, and deterministic best-match sorting of an array in JavaScript
https://npm.im/match-sorter
MIT License
3.73k stars 129 forks source link

feat: handle nested arrays with wildcard keys #120

Closed mart-jansink closed 3 years ago

mart-jansink commented 3 years ago

What: Handle nested arrays with one or more wildcard keys. As an example

const nestedObjList = [
  {aliases: [{name: {first: 'Janice'}},{name: {first: 'Jen'}}]},
  {aliases: [{name: {first: 'Fred'}},{name: {first: 'Frederic'}}]},
  {aliases: [{name: {first: 'George'}},{name: {first: 'Georgie'}}]},
]
matchSorter(nestedObjList, 'jen', {keys: ['aliases.*.name.first']})
// [{aliases: [{name: {first: 'Janice'}},{name: {first: 'Jen'}}]}]
matchSorter(nestedObjList, 'jen', {keys: ['aliases.name.first']})
// [{aliases: [{name: {first: 'Janice'}},{name: {first: 'Jen'}}]}]
matchSorter(nestedObjList, 'jen', {keys: ['aliases.0.name.first']})
// []

Why: Arrays like the above are relatively common in some applications that I'm working on. I really like the intuitive sorting of match-sorter so I'd love to be able to use this. However, using property callbacks for such a common case isn't ideal. So I thought I'd try making a pull request, hoping that this is something you might consider for merging.

How: The .reduce callback inside the getNestedValue function has gotten a lot more logic to possibly handle arrays. The calling getItemValues function now always returns an array, which simplifies the logic that returns the value a bit. Both the explicit 'aliases.*.name.first' and implicit 'aliases.name.first' wildcards are supported with this implementation; it was simpler to implement the latter while the former is clearer in use.

Checklist:

I sadly had some trouble running npm run test:update: it complains that not all branches are covered by the test cases. However, the lines that it says are uncovered are actually called, as far as I can tell, so I couldn't fix this..

mart-jansink commented 3 years ago

Thanks for your feedback. I understand the concern, but to me it seems that this library is already doing quite a bit of "magic" when it comes to sorting the results, and supporting nested keys in the first place. Of course that "magic" is well documented but so can this.

Since the applications I'm working on use the pattern of nested objects in arrays relatively often, writing similar property callbacks all the time isn't really DRY nor as performant as this can be. But if I'm the outlier in this, I'm more than happy to keep using a fork.

kentcdodds commented 3 years ago

Fair points. Luckily the added code isn't all that complex which is nice. I think this will be ok, but I'd like to see what this looks like documented first. Also, could you make sure we hit 100% test coverage with your new changes. Thanks!

codecov[bot] commented 3 years ago

Codecov Report

Merging #120 (d91842f) into master (4c2e927) will not change coverage. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##            master      #120   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            1         1           
  Lines          137       163   +26     
  Branches        31        38    +7     
=========================================
+ Hits           137       163   +26     
Impacted Files Coverage Δ
src/index.ts 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 4c2e927...d91842f. Read the comment docs.

mart-jansink commented 3 years ago

Thanks. The coverage is now back up to 100% again. And I've documented the changes more clearly.

While doing this I also found a few simple performance improvements that end up to matching about 20% faster. The biggest improvements were

I can't directly share the data that I used for benchmarking this, but I can look for something similar if you want?

mart-jansink commented 3 years ago

Yeah, I'll have another go at it.

mart-jansink commented 3 years ago

How do you feel about the use of .flat()? That could solve the problem but it isn't supported by IE, see https://caniuse.com/mdn-javascript_builtins_array_flat.

kentcdodds commented 3 years ago

I'm fine with flat. Folks should have standard polyfills by now

mart-jansink commented 3 years ago

OK, it now supports 'favorite.iceCream' again. The validation fails because node v10 doesn't have .flat() yet. We can also use a small polyfill like [].concat.apply([],values) instead?

kentcdodds commented 3 years ago

Let's use the small polyfill idea 👍

mart-jansink commented 3 years ago

There you go. I'm kinda stumped by the typing of the result, so I had to expect a few more errors than I'd like. Also, I've added a check to see if flattening is needed because it's quite expensive for the performance.

mart-jansink commented 3 years ago

That's perfect, it's much more readable indeed. I've never been a big fan of functional programming because it typically comes with a large performance penalty. But in this case I just stuck with how you had coded it.

Looking at the performance there was some degradation due to the use of for-of loops which are transpiled to support arbitrary iterators besides plain arrays. So I've refactored that once more using old-fashioned for loops, also including those in the getAllValuesToRank function.

My basic benchmark now gives the following results, comparing the last variations this branch to the master:

master x 409 ops/sec ±0.64% (86 runs sampled)
reduce x 448 ops/sec ±1.75% (83 runs sampled)
for-of x 424 ops/sec ±0.48% (85 runs sampled)
for    x 475 ops/sec ±0.34% (87 runs sampled)

So I'm perfectly happy with merging it like this.

On a side note: I can make pull request for a benchmark like this if you'd want? And, looking at my use-case of this module, I see room for more performance improvements by doing some memoization. My apps show 1000+ rows that are recursively processed multiple times while users type, or when just a few of those rows change due to server updates. Memoizing getAllValuesToRank, getHighestRanking and / or getMatchRanking would greatly reduce the amount of duplicate work that's done.

kentcdodds commented 3 years ago

@All-contributors please add @mart-jansink for code, tests, and docs

allcontributors[bot] commented 3 years ago

@kentcdodds

I've put up a pull request to add @mart-jansink! :tada:

kentcdodds commented 3 years ago

Thanks so much for your help! I've added you as a collaborator on the project. Please make sure that you review the other/MAINTAINING.md and CONTRIBUTING.md files (specifically the bit about the commit messages and the git hooks) and familiarize yourself with the code of conduct (we're using the contributor covenant). You might also want to watch the repo to be notified when someone files an issue/PR. Please continue to make PRs as you feel the need (you can make your branches directly on the repo rather than your fork if you want). Thanks! And welcome to the team :)

github-actions[bot] commented 3 years ago

:tada: This PR is included in version 6.1.0 :tada:

The release is available on:

Your semantic-release bot :package::rocket: