YuriGor / deepdash

eachDeep, filterDeep, findDeep, someDeep, omitDeep, pickDeep, keysDeep etc.. Tree traversal library written in Underscore/Lodash fashion
https://deepdash.io/
MIT License
275 stars 12 forks source link

getIterate.js performance issues #33

Closed raz-sinay closed 5 years ago

raz-sinay commented 5 years ago

I've encountered performance issues in my code (which uses filterDeep) and after some research using Chrome DevTools performance tab, it seems that the getIterate.js file causes performance issues.

Here are 2 images displaying the results of the performance test recordings: image

image

I understand that there must be a deep scan to search for a path which might be relevant for the user (was specified in childrenPath). but maybe one of these suggestions can help:

  1. Add a maxScanDepth (number) option, which will limit the depth of the search for cases where the user knows his deep object's structure and doesn't want to scan all the paths in the scanned object.
  2. Add a isExactChildrenPath (boolean) for cases where only the paths specified by the user in the childrenPath parameter are relevant and there is no need to search other paths (Ofcourse he must supply the full path).
  3. As mentioned in this stackoverflow question and was tested here it is possible to use spread instead of concat for better performance. like this: currentParents = [...parents, currentObj]; instead of currentParents = parents.concat(currentObj); or childPath = [...path, childKey]; instead of childPath = (path || []).concat([childKey]);
YuriGor commented 5 years ago

Hi! I'll think about 1. - sounds good

  1. is already implemented, take a look at childrenPath option
  2. I'll take a look, thank you for idea. Hope transpiler will generate clean enough code for spread.
raz-sinay commented 5 years ago

Thanks for the quick response! About suggestion 2, the childrenPath specifies what paths will be sent to the user's callbacks, but the iteration process scans all paths, while it can just scan by the paths the user specified (as far as I've seen, if I'm wrong please tell me how to use it properly to reduce scan time)

YuriGor commented 5 years ago

If childrenPath is specified - deepdash doesn't visit other paths. If you see it's not so in some cases - let me know, it's a bug. At getIterate.js look at :97 (children only case) and :105 (full scan case)

raz-sinay commented 5 years ago

the thing is, filterDeep uses condenseDeep (if the condense flag wasn't set to false) , which uses eachDeep. (filterDeep ==> condenseDeep ==> eachDeep) when condenseDeep calls eachDeep, it doesn't supply the entire options object that the user specified in the filterDeep call, but only part of it : var eachDeepOptions = { checkCircular: options.checkCircular, }; this causes a full scan.

moreover, I've replaced the concat with spread, which did result in a big performance improvement. as can be seen here: image I forked the repository to submit a pull request, but unfortunately, as you said, the transpiler turned it back to concat. Are there configurations that can be made? I've never used this transpiler before.

YuriGor commented 5 years ago

Maybe it's a good idea to extend options to let user collapse only holes caused by filtration, and keep other empty slots untouched.

It's a pretty complicated way - I will need to remember paths rejected by a filtration process, check each level in the path if it's the beginning or middle of some array and, instead of total deep condensing, surgically collapse specific holes, considering each array could have other holes originally or left by another rejected path.

In case some array has multiple holes left by several rejected paths - it will be a pain to recalculate shifted indexes on each collapsed hole, to keep rest of remembered rejected paths correct.

As a dirty alternative - I can remember not a paths rejected by a filter, but only arrays affected by such rejection. And after filtering process is done - just condense all these arrays. In this case, some originally existing holes will be collapsed too, but some - will not. So this case is good only for an initially condensed object. Yeah, I think it will be good enough to give the user a choice:

About spread / concat - there is a separate deepdash-es module - it's not transpiled. Please don't hurry with a PR for now - there are upcoming changes in the lib build process I want to finalize first.

raz-sinay commented 5 years ago

Actually, after I've seen the performance results of using spread instead of concat, maybe adding the other 2 options that we have discussed are overhead, because the improvement is quite big by itself. I do think the change in deepdash-es is important and it's really quick with no side effects. I made the PR there , under es/private/getIterate.js. (hope that's what you meant). it's really important for my usage. This is the PR #34 .

Thanks for the quick response and discussion!

YuriGor commented 5 years ago

Ok, I will review it tonight. Thank you for contribution!

YuriGor commented 5 years ago

Merged and published as v4.2.14 Please test and put a star if everything still works :wink: