balderdashy / sails

Realtime MVC Framework for Node.js
https://sailsjs.com
MIT License
22.83k stars 1.95k forks source link

Difficulty debugging a memory Leak #7253

Open ericsj opened 2 years ago

ericsj commented 2 years ago

Node version: v12.22.9 Sails version (sails): 0.12.14 ORM hook version (sails-hook-orm): 1.0.9 Sockets hook version (sails-hook-sockets): 0.13.11 Organics hook version (sails-hook-organics): -- Grunt hook version (sails-hook-grunt): -- Uploads hook version (sails-hook-uploads): -- DB adapter & version (e.g. sails-mysql@5.55.5): sails-postgresql@0.11.4 Skipper adapter & version (e.g. skipper-s3@5.55.5): skipper@0.7.6


I manage a Node.js server that tipically occupies ~65MB of RAM. After opening a certain page, the memory usage increases drastically, until it freezes the PC or throws a FATAL ERROR with a heap related error. In the second screenshot we can note an array emerging and occupying a large portion of memory. I found an object that is really present in the code (that purple providerinformation in the inferior part). But it doesn't tell much, it's a just a model name and page name. In an older version of the code that is running on prd server it doesn't happen so I don't think it is caused by Sails, since I didn't update sails (at least not in package.json) and I didn't find memory leak issues in my node version (the server runs an older version). How can I proceed? Notes:

Screenshot from 2022-08-30 15-30-45 Screenshot from 2022-08-30 15-34-19

Error message:

<--- Last few GCs --->

[11733:0x563472f0af10] 182853 ms: Scavenge 2037.3 (2046.9) -> 2034.8 (2047.6) MB, 8.2 / 0.0 ms (average mu = 0.137, current mu = 0.087) allocation failure [11733:0x563472f0af10] 182873 ms: Scavenge 2038.1 (2047.6) -> 2035.5 (2048.1) MB, 8.6 / 0.0 ms (average mu = 0.137, current mu = 0.087) allocation failure [11733:0x563472f0af10] 182894 ms: Scavenge 2038.7 (2048.1) -> 2036.2 (2056.9) MB, 9.6 / 0.0 ms (average mu = 0.137, current mu = 0.087) allocation failure

<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0x27d5da31b9a1 0: builtin exit frame: stringify(this=0x27d5da30b729 ,2,0x29315ef004b1 ,0x19341927f741 <JSArray[124555]>,0x27d5da30b729 )

1: json [0xa7e9a32a209] [/home/user/Desktop/tog/node_modules/@sailshq/express/lib/response.js:231] [bytecode=0x1d029d7bc129 offset=156](this=0x21ddafea9a29 <ServerResponse map = 0x196796078bf9>,0x19...

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory 1: 0x7fa5b026c3cc node::Abort() [/lib/x86_64-linux-gnu/libnode.so.72] 2: 0x7fa5b019c67c [/lib/x86_64-linux-gnu/libnode.so.72] 3: 0x7fa5b0641e2a v8::Utils::ReportOOMFailure(v8::internal::Isolate, char const, bool) [/lib/x86_64-linux-gnu/libnode.so.72] 4: 0x7fa5b06421e4 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate, char const, bool) [/lib/x86_64-linux-gnu/libnode.so.72] 5: 0x7fa5b07f4509 [/lib/x86_64-linux-gnu/libnode.so.72] 6: 0x7fa5b0804dc6 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [/lib/x86_64-linux-gnu/libnode.so.72] 7: 0x7fa5b0805c12 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/lib/x86_64-linux-gnu/libnode.so.72] 8: 0x7fa5b0806070 v8::internal::Heap::CollectAllGarbage(int, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/lib/x86_64-linux-gnu/libnode.so.72] 9: 0x7fa5b07b5ee8 v8::internal::StackGuard::HandleInterrupts() [/lib/x8664-linux-gnu/libnode.so.72] 10: 0x7fa5b08fb74b v8::internal::JsonStringifier::Result v8::internal::JsonStringifier::Serialize(v8::internal::Handle, bool, v8::internal::Handle) [/lib/x8664-linux-gnu/libnode.so.72] 11: 0x7fa5b0900464 v8::internal::JsonStringifier::Result v8::internal::JsonStringifier::Serialize(v8::internal::Handle, bool, v8::internal::Handle) [/lib/x8664-linux-gnu/libnode.so.72] 12: 0x7fa5b0900c9f v8::internal::JsonStringifier::Result v8::internal::JsonStringifier::Serialize(v8::internal::Handle, bool, v8::internal::Handle) [/lib/x86_64-linux-gnu/libnode.so.72] 13: 0x7fa5b0901e2c v8::internal::JsonStringifier::Stringify(v8::internal::Handle, v8::internal::Handle, v8::internal::Handle) [/lib/x86_64-linux-gnu/libnode.so.72] 14: 0x7fa5b0901ebf v8::internal::JsonStringify(v8::internal::Isolate, v8::internal::Handle, v8::internal::Handle, v8::internal::Handle) [/lib/x86_64-linux-gnu/libnode.so.72] 15: 0x7fa5b06ef2c6 v8::internal::Builtin_JsonStringify(int, unsigned long, v8::internal::Isolate*) [/lib/x86_64-linux-gnu/libnode.so.72] 16: 0x7fa5b112ec79 [/lib/x86_64-linux-gnu/libnode.so.72] [1] 11733 IOT instruction (core dumped) node --inspect app

sailsbot commented 2 years ago

@ericsj Thanks for posting! We'll take a look as soon as possible.

In the mean time, there are a few ways you can help speed things along:

  • look for a workaround. (Even if it's just temporary, sharing your solution can save someone else a lot of time and effort.)
  • tell us why this issue is important to you and your team. What are you trying to accomplish? (Submissions with a little bit of human context tend to be easier to understand and faster to resolve.)
  • make sure you've provided clear instructions on how to reproduce the bug from a clean install.
  • double-check that you've provided all of the requested version and dependency information. (Some of this info might seem irrelevant at first, like which database adapter you're using, but we ask that you include it anyway. Oftentimes an issue is caused by a confluence of unexpected factors, and it can save everybody a ton of time to know all the details up front.)
  • read the code of conduct.
  • if appropriate, ask your business to sponsor your issue. (Open source is our passion, and our core maintainers volunteer many of their nights and weekends working on Sails. But you only get so many nights and weekends in life, and stuff gets done a lot faster when you can work on it during normal daylight hours.)
  • let us know if you are using a 3rd party plugin; whether that's a database adapter, a non-standard view engine, or any other dependency maintained by someone other than our core team. (Besides the name of the 3rd party package, it helps to include the exact version you're using. If you're unsure, check out this list of all the core packages we maintain.)

Please remember: never post in a public forum if you believe you've found a genuine security vulnerability. Instead, disclose it responsibly.

For help with questions about Sails, click here.

mikermcneil commented 2 years ago

@ericsj Without looking at the code for this app, I think the most likely scenario here is your app has an action that is fetching too much of the database into memory at once. i.e. needs to be paginated (limit). This can sometimes come up indirectly, for example when trying to respond with a list of red cars with any current passengers who are wearing shoes. An antipattern is to look up all red cars first. But the problem is, there might be too many red cars to fit into memory. Another, simpler possibility is that a primary piece of data intended to be used by the action is just too big for your server's available RAM-- for example an action that returns an unpaginated list of millions of records.

ericsj commented 2 years ago

@ericsj Without looking at the code for this app, I think the most likely scenario here is your app has an action that is fetching too much of the database into memory at once. i.e. needs to be paginated (limit). This can sometimes come up indirectly, for example when trying to respond with a list of red cars with any current passengers who are wearing shoes. An antipattern is to look up all red cars first. But the problem is, there might be too many red cars to fit into memory. Another, simpler possibility is that a primary piece of data intended to be used by the action is just too big for your server's available RAM-- for example an action that returns an unpaginated list of millions of records.

I searched the bug source and found it by moving through the commit history while supervising the chrome inspector. The commit added a simple endpoint with a model.find({}, ...) operation without limit or pagination and called it twice. I solved it adding limit. Surprisingly the table of that model has only 22mb (see print attached), so I still don't understand the bug completely. Thank you by your help anyway. Old endpoint:

        ProviderInformation.find({}, function(err, records) {...

New:

      ProviderInformation.find({ where: { networkBuild: buildProvider.networkBuild }, limit: 1000 })
        .exec(function(err, providerInformationInSameBuild) { ...

Screenshot from 2022-09-06 23-14-29

zignuts-techno commented 1 year ago

@ericsj thanks for posting this, we are having a similar problem and we suspect a memory leak somewhere in the code or the sails framework. Our projects are running inside a docker containers. We are noticing that the sails process i.e. the docker container process, is keep on getting the memory gradually. We have observed this for quite some time and now we are sure that there is some memory leak somewhere. When we start the container afresh, it starts with less memory, and everyday it increases by a few MBs of memory, even without getting it freed by GC. Therefore after every 3-4 weeks we have to manually restart the process by restarting the container and then thing goes back to normal, and this process is repeated. If we do not do this activity then the entire system performance slows down. Note that these are just REST APIs without any frontend, gulp or grunt tasks.

I need help with - 1) how do we find memory leaks in sails project? 2) How can we be sure if memory leak is due to the source code or the sails framework itself?

@mikermcneil any help/suggestion would be highly appreciated.

ericsj commented 1 year ago

You can find memory leaks using the inspector: node --inspect index.js Open in the browser the link: chrome://inspect To find the source of the leak, navigated through the commit history and noted that the code above introduced the memory leak, by the memory usage graphs, and the only way to discover if the memory leak is caused by sails framework is changing the source code, but I guess 99% of the cases the memory leak is caused by the source code, so I suggest trying to find which commit introduced the leak and testing it