Closed avatarneil closed 2 years ago
Latest commit: 5c019642b86093e773292e92a570143a92aa432b
The changes in this PR will be included in the next version bump.
Wow, I rarely see issues related to the node gc. Were the K8s pods running out of memory?
If the problem is associated with ingesting new policies, maybe the policy objects are trying to load objects that are too large into RAM at the same time. Would it be possible to do this processing in a "streaming"/JIT fashion?
@macsj200 Yeah, we were seeing pods run out of memory. The problem, I suspect, was that we were pseudo-parallelizing the computation of all the policy compliance by wrapping the `processPolicy` calls in a `Promise.all()`. Hypothetically, let's assume that `processPolicy` looks something like this in terms of sync/async-ness:
1. Async call to load stuff into memory
2. Synchronous call to do some compute on it
3. Some async call(s)
By wrapping those in a `Promise.all`, we could've gotten to a state where many of the promises were stuck on an `await` for some I/O in step 3, while steps 1 & 2 were allowed to kick off for other promises. This could have resulted in what appeared as memory leakage in the logs. Unfortunately, to add my manual-GC workaround, I had to switch this `Promise.all` to sequential promise invocation (awaiting each call in turn), which means that, technically, that change could've been the fix in and of itself. I need to do some more manual testing, but by the time I cut this PR I was just trying to ensure immediate stability!
📚 Purpose
This PR fixes serious performance regressions and unusual memory behavior observed following the previous release. It's unclear why these started manifesting so strongly in v13, since the underlying performance issues have likely been around for a while, but these patches resolve them.
👌 Resolves:
Strange performance issues
📦 Impacts:
See changesets