mobxjs / mobx

Simple, scalable state management.
http://mobx.js.org
MIT License

Uncaught RangeError: Maximum call stack size exceeded #237

Closed jefffriesen closed 8 years ago

jefffriesen commented 8 years ago

@mweststrate suggested I post this as a ticket after discussing the problem on Gitter. I also saw issue #236 but decided to not post it on there in case it's a different problem.

On Gitter, I mentioned maxing out the call stack on a filter. I realized it was also happening on a simpler function so I'll post that here.

Context: After importing lots of records, I'm assigning one or more classifications to each record based on its description field. So for 1000 records, I will have:

  1. 1000 records (array of objects, each object having a description key)
  2. Classification array: Each element in the array is an array of classification tokens (strings, such as 'EL', 'PB'). Index of each array matches up with the original records.
classifications = [
  ['EL'],
  ['RF', 'PB'],
  [],
  ...
]

Everything works great when I load 1000 records. When I load 88,000 records I get the max call stack error.

Here are the relevant bits in the observable store:

  @observable records = []
  @observable classifications = []
  @observable includeKeywords = {}
  @observable excludeKeywords = {}

  updateClassifications() {
    console.log('includeKeywords: ', this.includeKeywords)
    console.log('records: ', this.records)
    // blows call stack here.....
    this.classifications = updateClassifications(this.includeKeywords, this.excludeKeywords, this.records)
  }

  updateIncludeKeywords(classification, words) {
    this.includeKeywords[classification] = words
  }

  updateExcludeKeywords(classification, words) {
    this.excludeKeywords[classification] = words
  }

Here is updateClassifications(). Apologies for all of the code, but I don't know how to cut this down and still keep it functional.

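// Assumed import: the helpers used below all match lodash functions of the same name
import { chain, compact, includes, map, reduce, reject, some, toLower } from 'lodash'

// Iterate through each permit row, lowercasing and seeing if keyword is included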
// For every description, we return an array of classification codes
// [ [], ['EL'], ['EL', 'PB'] ]
export default function updateClassifications(includeKeywords, excludeKeywords, records) {
  const includeKeywordMap = keywordTransform(includeKeywords)
  // result of keywordTransform:  Object {AC: Array[1], EL: Array[3], PB: Array[2], RF: Array[2]}
  const excludeKeywordMap = keywordTransform(excludeKeywords)
  return chain(records)
    .map(record => ' ' + record.description.toLowerCase() + ' ')
    .map(description => descriptionToClasses(description, includeKeywordMap, excludeKeywordMap))
    .value()
}

function keywordTransform(wordMap) {
  return reduce(wordMap, (acc, val, key) => {
    const wordArray = map(val.split(','), word => word ? toLower(word) : false)
    acc[key] = compact(wordArray)
    return acc
  }, {})
}

function descriptionToClasses(description, includeKeywordMap, excludeKeywordMap) {
  const includeMatched = compact(map(includeKeywordMap, (words, key) => {
    return includesAnyStrings(description, words) ? key : false
  }))
  const excludeMatched = compact(map(excludeKeywordMap, (words, key) => {
    return includesAnyStrings(description, words) ? key : false
  }))
  return reject(includeMatched, word => includes(excludeMatched, word))
}

function includesAnyStrings(description, keywordArray) {
  return some(keywordArray, word => includes(description, word))
}

Here is the error:

[Screenshot: screen shot 2016-05-04 at 12 50 07 pm]

and console.log of the keywords and records (records are the ones that I have 88,000+ rows):

[Screenshot: screen shot 2016-05-04 at 12 47 30 pm]

Let me know if I can provide any more info. Thanks

mweststrate commented 8 years ago

Just pushed version 2.1.6, which should address this issue. It might be the case that the call stack issue has disappeared but the operation is now slow; just let me know if that is the case.

jefffriesen commented 8 years ago

Yep, that fixed it. Thank you. I'm curious how you knew that splatting the arrays was causing the problem.

mweststrate commented 8 years ago

As usual, Stack Overflow: http://stackoverflow.com/questions/22123769/rangeerror-maximum-call-stack-size-exceeded-why :)
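
For readers hitting the same error: a minimal sketch, assuming the overflow comes from passing a very large array as individual call arguments (via spread or `apply`). JavaScript engines cap how many arguments a single call can take, and exceeding the cap surfaces as this RangeError. The helper name `appendInChunks` and the chunk size are hypothetical, and this is not necessarily how the 2.1.6 fix was implemented.

```javascript
// Hypothetical helper (not MobX's actual fix): append `items` to `target`
// in fixed-size chunks so no single call receives too many arguments.
function appendInChunks(target, items, chunkSize = 10000) {
  for (let start = 0; start < items.length; start += chunkSize) {
    // Each push call spreads at most `chunkSize` arguments.
    target.push(...items.slice(start, start + chunkSize));
  }
  return target;
}

// Risky on large inputs, because every element becomes a call argument:
//   target.push(...items)
//   Array.prototype.push.apply(target, items)

// Made-up data sized like the report above.
const records = Array.from({ length: 88000 }, (_, i) => ({ description: 'record ' + i }));
const copied = appendInChunks([], records);
console.log(copied.length); // 88000
```

A plain loop that pushes one element at a time avoids the limit as well; chunking just keeps the number of calls lower.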

jefffriesen commented 8 years ago

Really interesting thread. Thanks for sharing. The call stack isn't being maxed out, but classifying and filtering are pretty slow (5-7 seconds to perform something). At this point I think this is probably on me to re-evaluate and optimize how I'm doing the classifying and filtering. Or break the data set up into smaller pieces.

thanks

mweststrate commented 8 years ago

Yes, reactive sorting is pretty expensive on large collections, because there are many values and function calls to be tracked. After all, the result of your sort depends on every value in the array that is used in the comparison function. (Not only can the array itself change, but the values inside the objects can also influence the result of the sort, and hence they need to be tracked.) There are some optimizations that can be applied, but they are not generically applicable.
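
To get a feel for how many reads are involved, here is a rough illustration in plain JavaScript. It assumes a `Proxy` can stand in for an observable record and simply counts property reads made by the comparison function. It does not use MobX and does not reflect its internals; `countingRecord` and the sample data are made up for the example.

```javascript
// Illustration only: a Proxy stands in for an observable record and counts
// property reads. This is not how MobX is implemented.
let reads = 0;

function countingRecord(value) {
  return new Proxy({ value }, {
    get(target, key) {
      if (key === 'value') reads += 1; // one read a reactive system would have to track
      return target[key];
    },
  });
}

const N = 1000;
// Deterministic shuffled values: 7919 is coprime to 1000, so this is a permutation of 0..999.
const items = Array.from({ length: N }, (_, i) => countingRecord((i * 7919) % N));

// Every comparison reads `value` on two records.
const sorted = items.slice().sort((a, b) => a.value - b.value);
console.log('records:', N, 'tracked reads during sort:', reads);
```

The read count grows faster than the number of records, which is in line with the initial sort being the expensive part on large collections.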

mweststrate commented 8 years ago

See also #166, which will greatly reduce the time needed to update a derived sorted array after a change. The initial sort will remain expensive, though.