fastify / fastify-rate-limit

A low overhead rate limiter for your routes
MIT License
504 stars 71 forks source link

Bug in LRU cache can cause persistent server failure #344

Closed tobiasdcl closed 12 months ago

tobiasdcl commented 12 months ago

Prerequisites

Fastify version

4.24.3

Plugin version

9.0.0

Node.js version

20.9.0

Operating system

Linux

Operating system version (i.e. 20.04, 11.3, 10)

Ubuntu 22.04

Description

Hey folks, first of all thanks for building and maintaining this library ❤️

I found a bug in the most recent version of fastify-rate-limit (9.0.0 at the time of writing) when using the default LocalStore.

Summary

Based on my current understanding the root cause is a bug in the underlying LRU implementation toad-cache - I will create a dedicated issue there and link it here (update: done).

Impact

If triggered, all routes protected by fastify-rate-limit will be persitently error out

Steps to Reproduce

create the following file asrepro.mjs and then run node repro.mjs

import Fastify from 'fastify'

(async () => {
  const fastify = Fastify({
    logger: true
  })
  await fastify.register(import('@fastify/rate-limit'), {
    global: true,
    cache: 5,
  })

  fastify.get('/', (_request, reply) => {
    reply.send({ hello: 'fastify' })
  })

  fastify.listen({ port: 8000 }, (err, address) => {
    if (err) throw err
    console.log('Server running on port 80')
  })

  await fastify.inject({
    method: 'GET',
    url: '/',
    remoteAddress: '10.0.0.1'
  })

  /**
   * >>> This is triggering the bug in the LRU cache <<<
   */
  await fastify.inject({
    method: 'GET',
    url: '/',
    remoteAddress: '10.0.0.1'
  })
  await fastify.inject({
    method: 'GET',
    url: '/',
    remoteAddress: '10.0.0.2'
  })
  await fastify.inject({
    method: 'GET',
    url: '/',
    remoteAddress: '10.0.0.3'
  })
  await fastify.inject({
    method: 'GET',
    url: '/',
    remoteAddress: '10.0.0.4'
  })
  await fastify.inject({
    method: 'GET',
    url: '/',
    remoteAddress: '10.0.0.5'
  })

  /**
   * This request will cause the first cache evict because it is the 6th ip while the cache size is set to 5
   * Normally this is not a problem but in this case it causes the `evict` function of the LRU cache to be called - causing a failure. Due to the nature of the bug all following requests will fail and the application can not recover on its own
   */
  await fastify.inject({
    method: 'GET',
    url: '/',
    remoteAddress: '10.0.0.6'
  })
})()

raw output from node repro.mjs:

{"level":30,"time":1701280599225,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-1","req":{"method":"GET","url":"/","hostname":"localhost:80","remoteAddress":"10.0.0.1"},"msg":"incoming request"}
{"level":30,"time":1701280599227,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-1","res":{"statusCode":200},"responseTime":1.9484590291976929,"msg":"request completed"}
{"level":30,"time":1701280599227,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-2","req":{"method":"GET","url":"/","hostname":"localhost:80","remoteAddress":"10.0.0.1"},"msg":"incoming request"}
{"level":30,"time":1701280599227,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-2","res":{"statusCode":200},"responseTime":0.15683400630950928,"msg":"request completed"}
{"level":30,"time":1701280599228,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-3","req":{"method":"GET","url":"/","hostname":"localhost:80","remoteAddress":"10.0.0.2"},"msg":"incoming request"}
{"level":30,"time":1701280599228,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-3","res":{"statusCode":200},"responseTime":0.140083909034729,"msg":"request completed"}
{"level":30,"time":1701280599228,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-4","req":{"method":"GET","url":"/","hostname":"localhost:80","remoteAddress":"10.0.0.3"},"msg":"incoming request"}
{"level":30,"time":1701280599228,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-4","res":{"statusCode":200},"responseTime":0.09191691875457764,"msg":"request completed"}
{"level":30,"time":1701280599228,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-5","req":{"method":"GET","url":"/","hostname":"localhost:80","remoteAddress":"10.0.0.4"},"msg":"incoming request"}
{"level":30,"time":1701280599228,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-5","res":{"statusCode":200},"responseTime":0.12541699409484863,"msg":"request completed"}
{"level":30,"time":1701280599228,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-6","req":{"method":"GET","url":"/","hostname":"localhost:80","remoteAddress":"10.0.0.5"},"msg":"incoming request"}
{"level":30,"time":1701280599228,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-6","res":{"statusCode":200},"responseTime":0.0599590539932251,"msg":"request completed"}
{"level":30,"time":1701280599228,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-7","req":{"method":"GET","url":"/","hostname":"localhost:80","remoteAddress":"10.0.0.6"},"msg":"incoming request"}
{"level":50,"time":1701280599229,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-7","req":{"method":"GET","url":"/","hostname":"localhost:80","remoteAddress":"10.0.0.6"},"res":{"statusCode":500},"err":{"type":"TypeError","message":"Cannot read properties of null (reading 'key')","stack":"TypeError: Cannot read properties of null (reading 'key')\n    at LruObject.evict (/workspace/node_modules/toad-cache/dist/toad-cache.cjs:406:30)\n    at LruObject.set (/workspace/node_modules/toad-cache/dist/toad-cache.cjs:470:12)\n    at LocalStore.incr (/workspace/node_modules/@fastify/rate-limit/store/LocalStore.js:41:12)\n    at res (/workspace/node_modules/@fastify/rate-limit/index.js:214:15)\n    at new Promise (<anonymous>)\n    at Object.<anonymous> (/workspace/node_modules/@fastify/rate-limit/index.js:213:25)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)"},"msg":"Cannot read properties of null (reading 'key')"}
{"level":30,"time":1701280599229,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-7","res":{"statusCode":500},"responseTime":0.6233330965042114,"msg":"request completed"

prettified stacktrace:

[17:42:51.262] ERROR (63715): Cannot read properties of null (reading 'key')
    reqId: "req-7"
    req: {
      "method": "GET",
      "url": "/",
      "hostname": "localhost:80",
      "remoteAddress": "10.0.0.6"
    }
    res: {
      "statusCode": 500
    }
    err: {
      "type": "TypeError",
      "message": "Cannot read properties of null (reading 'key')",
      "stack":
          TypeError: Cannot read properties of null (reading 'key')
              at LruObject.evict (/workspace/node_modules/toad-cache/dist/toad-cache.cjs:406:30)
              at LruObject.set (/workspace/node_modules/toad-cache/dist/toad-cache.cjs:470:12)
              at LocalStore.incr (/workspace/node_modules/@fastify/rate-limit/store/LocalStore.js:41:12)
              at res (/workspace/node_modules/@fastify/rate-limit/index.js:214:15)
              at new Promise (<anonymous>)
              at Object.<anonymous> (/workspace/node_modules/@fastify/rate-limit/index.js:213:25)
              at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    }

Expected Behavior

Expected Behavior:

Actual Behavior:

Uzlopak commented 12 months ago

@gurgunday @kibertoad

tobiasdcl commented 12 months ago

I can submit a PR but would like some guidance about the approach. Given that the issue is in an underlying library (if my understanding is correct) I am not sure if you want to revert to the prev. LRU cache or wait for a fix in toad-cache

tobiasdcl commented 12 months ago

Here are also some more details. The problem is that the lru.first is set to null if a value is first set and then retrieved via get

Bildschirmfoto 2023-11-29 um 18 39 29


Bildschirmfoto 2023-11-29 um 18 40 28

and once lru.first is set to null it is just a matter of time before reaching the cache limit causing the first evict call - from that point on it is game over 😅

Uzlopak commented 12 months ago

probably has to be fixed in toad-cache and not here.

tobiasdcl commented 12 months ago

probably has to be fixed in toad-cache and not here.

I created a PR 🙌

Uzlopak commented 12 months ago

Nice job!

gurgunday commented 12 months ago

Thank you so much for the PR!

kibertoad commented 12 months ago

@tobiasdcl 9.0.1 with the fix was released! thank you for the report and the PR

tobiasdcl commented 12 months ago

@tobiasdcl 9.0.1 with the fix was released! thank you for the report and the PR

Awesome, thank you for the super fast fix 🥳