gabotechs opened this issue 1 year ago
Thanks for the great report, sorry it's taken me a week to get to this. I'll have a closer look tomorrow!
@gabotechs have you had a chance to look at my PR and try out the built packages?
Any progress on this issue? We have a memory leak and we are trying to identify where it is coming from too, but we have disabled the reporting and still see a memory spike.
@juancarlosjr97 are you able to create a minimal reproduction? That's certainly interesting that you see the issue with reporting disabled, they may be unrelated.
I will create one @trevor-scheer and share it. The issue has been narrowed down to many unique operations for the same query, for example:
```
query GetBestSellers($category1: ProductCategory) {
  bestSellers(category: $category1) {
    title
  }
}
```
```
query GetBestSellers($category2: ProductCategory) {
  bestSellers(category: $category2) {
    title
  }
}
```
The `$category1` and `$category2` variables could have the same value, but the two queries are identified as two different operations. We reverted yesterday to using the same operation and the memory increase issue has been resolved. However, it reveals that the memory already used has not been released (nearly 12 hours after the change of the operations), which might point to another issue with garbage collection.
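As an illustration of why the two documents above are treated as distinct, here is a minimal sketch (assuming only graphql-js is installed; the operation text is copied from the example above): the variable name is part of the document, so the parsed and re-printed forms are not equal, and anything keyed on the normalized query text sees two different operations.

```ts
// Minimal sketch: renaming a variable yields a different normalized document,
// so systems that key on the printed/normalized query treat it as a new operation.
import { parse, print } from "graphql";

const opA = `query GetBestSellers($category1: ProductCategory) {
  bestSellers(category: $category1) { title }
}`;

const opB = `query GetBestSellers($category2: ProductCategory) {
  bestSellers(category: $category2) { title }
}`;

// Both parse fine, but the printed (whitespace-normalized) forms still differ,
// because the variable name is part of the document itself.
console.log(print(parse(opA)) === print(parse(opB))); // false
```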
We have tested this behaviour on Apollo Server v3 and v4, with all plugins disabled, and we still see the same behaviour.
Also, profiling revealed that what is taking the most memory, and continuously growing, is the graphql library, specifically `node_modules/graphql/language`. This might be another issue, and if it is, I will raise a separate issue for it if it has not already been raised internally by Apollo by then.
We have raised this internally with Apollo and the reference number is 9623.
@juancarlosjr97 given what you've narrowed it down to so far this does seem unrelated, so a separate issue would be preferred.
How sure are you that "for the same query" is relevant? Have you ruled out just "many unique operations" by itself? The "same query excluding variables" part would be a surprising twist to the issue.
The `graphql/language` hint leads us to parsing and printing from `graphql-js`, so it would be interesting to run the set of queries that make up your reproduction against just the `graphql-js` parser directly and see if you still see the same issue with GC / memory usage increasing over time.
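A rough standalone harness for that experiment could look like the sketch below; the operation shape and iteration count are made up for illustration, and this is only an assumption about how one might exercise the graphql-js parser in isolation rather than part of the actual reproduction.

```ts
// Parse many unique operations with graphql-js alone and watch heap usage.
// If graphql-js itself is not retaining memory, heapUsed should stay roughly flat.
import { parse } from "graphql";

function uniqueOperation(i: number): string {
  return `query GetBestSellers($category${i}: ProductCategory) {
    bestSellers(category: $category${i}) { title }
  }`;
}

for (let i = 0; i < 1_000_000; i++) {
  parse(uniqueOperation(i));
  if (i % 100_000 === 0) {
    console.log(i, process.memoryUsage().heapUsed);
  }
}
```

Running it with `node --expose-gc` and calling `global.gc()` before each sample makes the readings easier to interpret.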
@trevor-scheer If they're saying that they change the name of the variable (`$category1` vs `$category2`), then I think most of our systems (including usage reporting) will consider them as distinct operations.
@glasser right, I'm asking for clarification on exactly that. Like I don't think that it matters that the operation is entirely the same minus the variable name. It sounds more like a "many different operations" problem, more generally. I would be surprised if the issue was limited to that specific nuance.
Right, I agree that the problem is likely "many different operations"; it's just that it might not be obvious that we treat those operations as distinct.
Thank you for the replies @trevor-scheer and @glasser. We can confirm at this point that the memory has been steady since the consumer changed the queries so that they are identified as the same operation.
I will work on reproducing the issue, and once I have done so I will raise another issue with all the details and a repository to clone and replicate the bug, as the memory not being released is the core problem.
@trevor-scheer and @glasser I created a project with instructions that demonstrates the memory leak issue: https://github.com/juancarlosjr97/apollo-graphql-federation-memory-leak. I will be raising another issue with all the details.
Issue Description
We've been running Apollo Server for a while in a couple of APIs, and we have always noticed a memory leak in both, which appears to be linearly proportional to the number of requests handled by each API.
While investigating the memory leak, V8 heap snapshots were taken from the running servers at two timestamps six hours apart. The later heap snapshot was compared to the earlier one in order to track which new objects were present in the JS heap that were not there six hours before: there are thousands of newly retained Request-like objects that reference the "usage-reporting.api.apollographql.com" host, and hundreds of new TLSSocket objects that reference the same host. Some of the objects that are leaking in the JS memory:
Request-like object
```
body::Object@13534193
cache::"default"@729
client::Object@13537293
credentials::"same-origin"@54437
cryptoGraphicsNonceMetadata::""@77
destination::""@77
done::system / Oddball@73
headersList::HeadersList@13537317
historyNavigation::system / Oddball@75
initiator::""@77
integrity::""@77
keepalive::system / Oddball@75
localURLsOnly::system / Oddball@75
map::system / Map@130579
method::"POST"@49427
mode::"cors"@84517
origin::system / Oddball@67
parserMetadata::""@77
policyContainer::Object@13537295
preventNoCacheCacheControlHeaderModification::system / Oddball@75
priority::system / Oddball@71
properties::system / PropertyArray@13537319
redirect::"follow"@53093
referrer::"no-referrer"@85507
referrerPolicy::system / Oddball@67
reloadNavigation::system / Oddball@75
replacesClientId::""@77
reservedClient::system / Oddball@71
responseTainting::"basic"@102749
serviceWorkers::"none"@519
taintedOrigin::system / Oddball@75
timingAllowFailed::system / Oddball@75
unsafeRequest::system / Oddball@75
url::URL@13537301
```
TLSSocket object
Here is a chart showing the memory usage of the last two days for one of the APIs:
During the left half of the chart (the first day), the Apollo Server was running with the `ApolloServerPluginUsageReporting` plugin enabled, and the memory kept increasing linearly. During the last half (the second day), exactly the same code was running but with `ApolloServerPluginUsageReportingDisabled` passed to the plugins, so that usage reporting is disabled. In this last case no memory was being leaked. We are using `@apollo/server` version `4.3.0`.
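For reference, the two configurations compared in the chart differ only in the plugin list. Below is a minimal sketch of both setups using the import paths documented for `@apollo/server` v4 and a trivial placeholder schema (the schema and variable names are invented for illustration); double-check the paths against your installed version.

```ts
// Sketch of the two configurations compared in the chart (Apollo Server 4 style).
import { ApolloServer } from "@apollo/server";
import { ApolloServerPluginUsageReportingDisabled } from "@apollo/server/plugin/disabled";

// Placeholder schema, just so the servers can be constructed.
const typeDefs = `#graphql
  type Query {
    hello: String
  }
`;
const resolvers = { Query: { hello: () => "world" } };

// Day 1: usage reporting is on by default when APOLLO_KEY / APOLLO_GRAPH_REF are
// configured; memory grew linearly with traffic in this configuration.
const serverWithReporting = new ApolloServer({ typeDefs, resolvers });

// Day 2: identical code, but usage reporting explicitly disabled; memory stayed flat.
const serverWithoutReporting = new ApolloServer({
  typeDefs,
  resolvers,
  plugins: [ApolloServerPluginUsageReportingDisabled()],
});
```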
Link to Reproduction
https://github.com/GabrielMusatMestre/apollo-server-memory-leak-repro
Reproduction Steps
Steps are described in the README.md of the reproduction repo.
This is not a reliable reproduction, as the memory leak might only become noticeable after running the server under heavy load for hours or days, and it needs a properly configured `APOLLO_KEY` and `APOLLO_GRAPH_REF` so that usage reports are actually published to Apollo.
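Since the leak only becomes visible after sustained load, one way to capture comparable snapshots from the running process is to write a V8 heap snapshot on an interval with Node's built-in `v8` module and diff consecutive files in the Chrome DevTools Memory panel. This is an assumption about tooling rather than a step from the reproduction repo; the interval and file naming are arbitrary.

```ts
// Periodically write .heapsnapshot files that can be loaded and compared
// in the Chrome DevTools Memory panel.
import { writeHeapSnapshot } from "v8";

const SIX_HOURS_MS = 6 * 60 * 60 * 1000;

setInterval(() => {
  const file = writeHeapSnapshot(`heap-${Date.now()}.heapsnapshot`);
  console.log(`wrote ${file}`);
}, SIX_HOURS_MS);
```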