isaacs / node-lru-cache

A fast cache that automatically deletes the least recently used items
http://isaacs.github.io/node-lru-cache/
ISC License
5.38k stars 353 forks source link

calculatedSize can go out of sync with sum(sizes) when using fetch() #269

Closed nornagon closed 1 year ago

nornagon commented 1 year ago

I was observing some strange OOM behavior, which I think turned out to be an unboundedly-large free list similar to #227. I added some diagnostics and it looked like evict was being called when there was nothing left in the cache to evict, because the calculatedSize was not correct.

I added the following check:

diff --git a/index.js b/index.js
index 718c567..0c0835e 100644
--- a/index.js
+++ b/index.js
@@ -372,6 +372,13 @@ class LRUCache {
     return false
   }

+  checkSizeConsistency() {
+    if (!this.sizes) return
+    let sizeSum = 0
+    for (const i of this.indexes()) sizeSum += this.sizes[i]
+    if (sizeSum !== this.calculatedSize)
+      throw new Error('calculated size inconsistent')
+  }
   initializeSizeTracking() {
     this.calculatedSize = 0
     this.sizes = new ZeroArray(this.max)
@@ -645,6 +652,7 @@ class LRUCache {
         this.disposeAfter(...this.disposed.shift())
       }
     }
+    this.checkSizeConsistency()
     return this
   }

and discovered that while this passes all tests, this check trips in my usage:

Error: calculated size inconsistent
    at LRUCache.checkSizeConsistency (.../node_modules/lru-cache/index.js:380:13)
    at LRUCache.set (.../node_modules/lru-cache/index.js:655:10)
    at cb (.../node_modules/lru-cache/index.js:742:14)

i.e. when the cb comes back from a fetch() promise, and set is called, it results in an inconsistency between the sizes in sizes and the total calculatedSize.

I haven't yet tracked down the root cause of this, but in case you have some ideas I thought I'd throw a bug up here!

nornagon commented 1 year ago

Ah, I think I might have a lead as to what's going on... let me know if this sounds like it could plausibly be the cause of the behavior above:

  1. fetch() is called
  2. a "placeholder" entry is put into the cache with size 0
  3. sometime later, the fetchMethod promise returns
  4. the cache is full, so some entries must be evicted in order to make room for the new value. this happens here.
  5. the "placeholder" entry is one of the entries evicted
  6. index now points to an invalid entry, because it was removed from the data structure
  7. ... but set() carries on and calls moveToTail() on an invalid index, corrupting the data structure
nornagon commented 1 year ago

Added a failing test over in https://github.com/isaacs/node-lru-cache/pull/270 that illustrates this behavior. I'm pretty confident now that the above scenario is the cause of this behavior.