nodeshift / opossum

Node.js circuit breaker - fails fast ⚡️
https://nodeshift.dev/opossum/
Apache License 2.0
1.3k stars 107 forks source link

Max stack depth exceeded when benchmarking #827

Closed jdmarshall closed 12 months ago

jdmarshall commented 1 year ago

Node.js Version:

Symptoms are worst on Node 18

Operating System:

OS X

Steps to Produce Error:

I'm still working on this, but in CI I saw these errors reproducibly, while everything seemed fine locally, until I increased the duration of the tests.

RangeError: Maximum call stack size exceeded
    at /Users/jason.marshall/Projects/hydra-dev/src/node_modules/agent-breaker/node_modules/opossum/lib/status.js:124:31
    at Array.reduce (<anonymous>)
    at get stats [as stats] (/Users/jason.marshall/Projects/hydra-dev/src/node_modules/agent-breaker/node_modules/opossum/lib/status.js:115:33)
    at get stats [as stats] (/Users/jason.marshall/Projects/hydra-dev/src/node_modules/agent-breaker/node_modules/opossum/lib/circuit.js:477:25)
    at fail (/Users/jason.marshall/Projects/hydra-dev/src/node_modules/agent-breaker/node_modules/opossum/lib/circuit.js:864:25)
    at handleError (/Users/jason.marshall/Projects/hydra-dev/src/node_modules/agent-breaker/node_modules/opossum/lib/circuit.js:823:5)
    at /Users/jason.marshall/Projects/hydra-dev/src/node_modules/agent-breaker/node_modules/opossum/lib/circuit.js:709:17
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
Emitted 'error' event on Domain instance at:
    at emit (node:internal/process/promises:147:33)
    at processPromiseRejections (node:internal/process/promises:283:27)
    at process.processTicksAndRejections (node:internal/process/task_queues:96:32)

I will attempt to come up with a repro case but I'm wondering if this is a configuration issue/documentation problem.

lholmquist commented 1 year ago

are you able to share any bits of the code you are running?

jdmarshall commented 1 year ago

I'm working on that. I recently switched from benchmark.js to tinybench (benchmark does not handle async properly) and that's when this error started happening. So it could be 2 problems I'm looking at here. I'm currently trying to unroll our wrapper code from the stack so it's just tinybench and opossum.

First attempt hit OOM before generating the error, so I think there's some more digging there.

jdmarshall commented 1 year ago

I suspect the OOM is coming mostly from creating opossum instances and then not shutting them down, at a rate of about 12k/s. I need to work on that separately. Turning off What I'm concerned about is that the Max stack error feels like some sort of chaining, where a new listener or a callback is added but it chains to the old one if it's present.

I haven't seen anything in the code like that so far. From the stack trace it seems to be in the error handling code so I've mostly been focusing on that.

jdmarshall commented 1 year ago

Still trying to isolate this. The only information I have so far is that my original tinybench benchmark fails in Node 18 and 20, but not in Node 16, and not when I run opossum by itself. I don't think any of these modules are using global state or monkey patching so I'm partly stumped.

lholmquist commented 1 year ago

what version are you using? if you are using the latest, which i think is 8.1.2, could you try 7.x to see if it is something that was added?

jdmarshall commented 1 year ago

I believe I also tested it with 7.x first, and then upgraded when that didn't work.

One thing I found is that tinybench is keeping an exhorbitant amount of memory for fast little calls. I've fiddled with that and stopped the bleeding. Still no idea what the failure mode is. I'm gonna have to step through a bit.

jdmarshall commented 1 year ago

Oh, damn.

node_modules/opossum/lib/status.js:124

        acc.latencyTimes.push.apply(acc.latencyTimes, val.latencyTimes || []);

Yeah so that's not going to perform well, and it will in fact create a stack overflow exception if you manage to stuff enough elements into the array for a single update.

If you're supporting node 16 the apply() can go, but you might just want a concat here.

lholmquist commented 12 months ago

closed via #830

lholmquist commented 12 months ago

Just released 8.1.3 on npm with this fix