balderdashy / sails

Realtime MVC Framework for Node.js
https://sailsjs.com
MIT License
22.84k stars · 1.95k forks

Memory leak when lifting and lowering Sails repeatedly in automated tests #6804

Open ath88 opened 5 years ago

ath88 commented 5 years ago

Node version: v10.16.0
Sails version (sails): 1.2.3
ORM hook version (sails-hook-orm): Not using this.
Sockets hook version (sails-hook-sockets): Not using this.
Organics hook version (sails-hook-organics): Not using this.
Grunt hook version (sails-hook-grunt): Not using this.
Uploads hook version (sails-hook-uploads): Not using this.
DB adapter & version (e.g. sails-mysql@5.55.5): Not using this.
Skipper adapter & version (e.g. skipper-s3@5.55.5): Not using this.


I've looked through all issues related to memory leaks, but none of them describes my case.

When testing a sails application, we 'lift' and 'lower' for each test. We have an extensive test suite, and we now experience OOM errors when running tests. We use the supported constructor as mentioned here: https://sailsjs.com/documentation/concepts/programmatic-usage

I managed to make a tiny script that clearly shows the increasing memory footprint.

const Sails = require("sails").Sails;

(async () => {
  for(let i = 0; i < 500000; i++){
    await startStop();
    if(i % 100 === 0) printUsage();
  }
})();

async function startStop() {
  const sails = new Sails();
  await sails.lift({ log: { noShip: true }, session: { secret: 'x' } });
  await sails.lower();
}

function printUsage() {
  const used = process.memoryUsage().heapUsed / 1024 / 1024;
  console.log(`Process uses ${Math.round(used * 100) / 100} MB`);
}

It takes a while, but eventually it runs out of memory. Some memory is clearly being freed by the garbage collector, but some is never freed. Output from my console is:

ath88@lipton:~/test/sails-test$ node index.js 
Process uses 23.84 MB
Process uses 29.99 MB
Process uses 34.82 MB
...
Process uses 1271.33 MB
Process uses 1286.41 MB
Process uses 1287.27 MB

<--- Last few GCs --->

[11916:0x348d020]   500144 ms: Mark-sweep 1239.6 (1418.2) -> 1239.6 (1395.2) MB, 713.7 / 0.1 ms  (average mu = 0.830, current mu = 0.000) last resort GC in old space requested
[11916:0x348d020]   500840 ms: Mark-sweep 1239.6 (1395.2) -> 1239.6 (1395.2) MB, 695.9 / 0.1 ms  (average mu = 0.711, current mu = 0.000) last resort GC in old space requested

<--- JS stacktrace --->

==== JS stack trace =========================================

    0: ExitFrame [pc: 0xf994af5be1d]
    1: StubFrame [pc: 0xf994af5a1bb]
Security context: 0x345970a9e6e9 <JSObject>
    2: set [0x345970a937b9](this=0x221dd5702859 <JSWeakMap>,0x300fe1705e19 <JSFunction wrapper (sfi = 0x16c7dcce5979)>,0x300fe1705da9 <JSArray[10]>)
    3: /* anonymous */ [0x221dd57069d1] [/home/ath88/test/sails-test/node_modules/@sailshq/lodash/lib/index.js:~7645] [pc=0xf994af707d4](this=0x1e819d409ad1 <JSFunction lodash...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
 1: 0x8f9d10 node::Abort() [node]
 2: 0x8f9d5c  [node]
 3: 0xaffd0e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
 4: 0xafff44 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
 5: 0xef4152  [node]
 6: 0xf0396f v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) [node]
 7: 0xecc9c6 v8::internal::Factory::AllocateRawArray(int, v8::internal::PretenureFlag) [node]
 8: 0xecd24a v8::internal::Factory::NewFixedArrayWithFiller(v8::internal::Heap::RootListIndex, int, v8::internal::Object*, v8::internal::PretenureFlag) [node]
 9: 0xecd2f0 v8::internal::Handle<v8::internal::FixedArray> v8::internal::Factory::NewFixedArrayWithMap<v8::internal::FixedArray>(v8::internal::Heap::RootListIndex, int, v8::internal::PretenureFlag) [node]
10: 0x10089b1 v8::internal::HashTable<v8::internal::ObjectHashTable, v8::internal::ObjectHashTableShape>::NewInternal(v8::internal::Isolate*, int, v8::internal::PretenureFlag) [node]
11: 0x1008a0e v8::internal::HashTable<v8::internal::ObjectHashTable, v8::internal::ObjectHashTableShape>::New(v8::internal::Isolate*, int, v8::internal::PretenureFlag, v8::internal::MinimumCapacity) [node]
12: 0x101926f v8::internal::HashTable<v8::internal::ObjectHashTable, v8::internal::ObjectHashTableShape>::EnsureCapacity(v8::internal::Handle<v8::internal::ObjectHashTable>, int, v8::internal::PretenureFlag) [node]
13: 0x102b618 v8::internal::ObjectHashTable::Put(v8::internal::Handle<v8::internal::ObjectHashTable>, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int) [node]
14: 0x102b836 v8::internal::JSWeakCollection::Set(v8::internal::Handle<v8::internal::JSWeakCollection>, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int) [node]
15: 0x1142060 v8::internal::Runtime_WeakCollectionSet(int, v8::internal::Object**, v8::internal::Isolate*) [node]
16: 0xf994af5be1d 
Aborted (core dumped)

As far as I can see from snapshots with Chrome DevTools, the `async` module is getting re-required and never removed from memory.

It would be awesome to eliminate this memory leak, so we can continue our testing practices with the supported 'programmatic usage'.

sailsbot commented 5 years ago

@ath88 Thanks for posting! We'll take a look as soon as possible.

In the meantime, there are a few ways you can help speed things along:

Please remember: never post in a public forum if you believe you've found a genuine security vulnerability. Instead, disclose it responsibly.

For help with questions about Sails, click here.

whichking commented 5 years ago

Hey, @ath88!

Memory leaks are no joke, so we appreciate the community helping us keep an eye out for them. The standard procedure for identifying and reporting memory leaks is as follows:

  1. Check that the app you're testing on uses the recommended production settings, as described in the deployment and scaling sections of the docs.

  2. Review your app-level code, and try to narrow down the leak to a single endpoint. It isn't unusual for Node apps to end up with memory leaks resulting from unhandled errors.

  3. If you're unable to locate the specific endpoint or bit of app code responsible for your increasing memory usage, try replicating the memory leak in a fresh Sails app with no bells or whistles and using the production settings recommended in the deployment docs. If you are able to replicate the memory leak in the fresh Sails app, please create an example repo reproducing the problem. Include standard version info and replication instructions.

ath88 commented 5 years ago

Hello madisonhicks, and thanks for the reply. :)

  1. We are not running tests with NODE_ENV=production. If sails leaks memory during tests, I would still consider that an error - no? That said, I just tried running the sample I supplied in my original post with NODE_ENV=production, and the process maintains the same growing memory footprint. The rest of the production configuration is mostly related to connectivity.

  2. There are literally only 16 lines of app-level code in my sample (unless you mean something else), so it would be difficult for me to narrow it down further, without looking into the Sails source.

  3. See original post, run it with NODE_ENV=production.

There is a curious console.log when running it with NODE_ENV=production, warning me that "connect.session() MemoryStore is not designed for a production environment, as it will leak memory, and will not scale past a single process". My initial assumption is that it doesn't so much leak memory as fail to invalidate sessions, which then stay in memory for the duration of the process. In the example posted, no sessions are being made, so my assumption is that this isn't the culprit.
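To rule sessions out entirely, the session hook could be disabled for the repro. Disabling hooks via the `hooks` setting is a documented Sails feature; this is a sketch of hypothetical lift options, not something tested in this thread:

```javascript
// Sketch: lift options that disable the session hook entirely, taking
// MemoryStore out of the picture. Sails supports turning off individual
// hooks with `hooks: { <hookName>: false }`.
const liftOptions = {
  hooks: { session: false }, // no session hook, so no MemoryStore at all
  log: { noShip: true },
};
```

With the session hook disabled, the `session: { secret: ... }` setting from the original script is no longer needed.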

johnabrams7 commented 5 years ago

@ath88 Thanks for all the detailed info and examples so far. If you don't mind, can you make us a repro of this with a new sails app with minimal content to analyze & further understand these OOM errors? We're curious to see greater detail on our end.

ath88 commented 5 years ago

@johnabrams7 What do you mean by 'a repro in a new sails app'? :)

johnabrams7 commented 5 years ago

@ath88 I mean a repo that reproduces this issue in a new sails app. šŸ‘

ath88 commented 5 years ago

So - you want a GitHub repo with just that one script? I can do that.

johnabrams7 commented 5 years ago

Yes, that should work. Essentially a minimal app that has the test you're using to see this, so we can try it out ourselves.

ath88 commented 5 years ago

Here you go: https://github.com/ath88/sails-oom-test. Just do `npm i` and `npm start`.

johnabrams7 commented 5 years ago

Awesome, many thanks!

johnabrams7 commented 5 years ago

@ath88 I was able to reproduce the memory leak on our end and will have the team investigate further to find what's causing it, along with a resolution. Will get back to you with further updates. šŸ‘

*(Screenshots: "Screen Shot 2019-07-12 at 5 11 39 PM", "Screen Shot 2019-07-12 at 5 12 18 PM")*
mikermcneil commented 5 years ago

@johnabrams7 would you try again using manual gc() or let it run more times with the hopes of reproducing the out of memory error?

johnabrams7 commented 5 years ago

@mikermcneil For sure, I was able to reproduce the memory error twice by letting it run until the crash. This occurs for both the original repo and the PR (screens from the PR):

*(Screenshot: "Screen Shot 2019-07-18 at 12 59 45 PM")*

Last few memory usage logs before the crash.

*(Screenshot: "Screen Shot 2019-07-18 at 1 00 12 PM")*

I verified that lifting and lowering Sails programmatically 50,000 times causes the process to run out of memory.

ath88 commented 5 years ago

For the record - I just inserted a manual gc() after every lift and lower, using --expose_gc, and the same pattern shows.
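A minimal sketch of that change (hypothetical helper name; `startStop` is the function from the repro script in the original post):

```javascript
// Sketch: force a full GC after each lift/lower cycle to rule out
// "the collector just hasn't run yet" as an explanation for the growth.
// global.gc is only defined when node is started with --expose_gc.
async function startStopWithGc(startStop) {
  await startStop();          // lift + lower one Sails instance
  if (global.gc) global.gc(); // no-op unless --expose_gc was passed
}
```

Run the script with `node --expose_gc index.js`; since the growth pattern is unchanged, the growth points at retained references rather than lazy collection.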

johnabrams7 commented 5 years ago

@ath88 Thanks for the follow-up, was that with Mike's PR changes or the original version?

ath88 commented 5 years ago

@johnabrams7 With both, now.

johnabrams7 commented 5 years ago

@ath88 At the moment, I've documented this in our development plans to improve memory handling for programmatically lifting and lowering Sails (e.g. stress testing the process's memory handling and similar). We welcome additional input and experiences from related applications and scenarios that programmatically lift and lower Sails in a similar fashion.

ath88 commented 5 years ago

My personal use case for lifting and lowering sails in this fashion is the setup and teardown of tests. Each test should have a predictable state, unrelated to any previous tests, so a clean sails application is preferred. I hope that makes sense to you. I can't think of other use cases right now.

johnabrams7 commented 5 years ago

@ath88 Makes sense, a perfectly valid use case. More advanced & powerful memory handling could ultimately help out with predictability and stability later down the line with more advancements in the framework. šŸ‘

eteaguewebs commented 4 years ago

Hello!

Long story short, but my place of work is upgrading a sails application from 0.12 to 1.4 (old, I know). We are encountering the exact issue @ath88 described in this thread, summarized here https://github.com/balderdashy/sails/issues/6804#issuecomment-513448842.

Have there been any updates/developments or workaround discovered to address this issue since 2019?

Thanks in advance!

mikermcneil commented 4 years ago

In the repo that seeks to reproduce this, a couple of things (it's archived, so I'll comment here):

So given that, it makes sense that 500,000 Sails instances running at once would overwhelm RAM.

Side note: that said, these programmatic methods should absolutely support `await`; I'd be happy to merge a PR to Sails that implements that using https://npmjs.com/package/parley

If anyone gets a chance to update @ath88's proof to make the lifting/lowering run in parallel, that would make this simulation... Scratch that! @eashaw just noticed that apparently I already updated @ath88's repo at some point to do just this. @eashaw or I will take a peek at that and report back.

ath88 commented 4 years ago

I think I remember back when I opened the issue, that I found something in sails that messes with require.cache. I know that probably isn't much to go on, but I thought I might as well mention it.

alxndrsn commented 4 years ago

I think I remember back when I opened the issue, that I found something in sails that messes with require.cache. I know that probably isn't much to go on, but I thought I might as well mention it.

@ath88 this happens in the include-all module, which is called at least once during sails startup.

@mikermcneil you've given some background to why this happens at https://github.com/balderdashy/sails/issues/4316#issuecomment-376001225; is there currently an option to disable it?

alxndrsn commented 4 years ago

I think I remember back when I opened the issue, that I found something in sails that messes with require.cache.

@ath88 Great pointer, thanks - this looks like it's spot on, at least for most of the memory leak.

I've adapted the examples above at https://github.com/alxndrsn/sails-memory-leak to demonstrate the memory leak, and to test the effects of disabling include-all's cache deletion for all or some of its calls.

Results I'm getting:

| include-all deletes require.cache? | heap change after 500 iterations |
| --- | --- |
| always | 633.65 MB |
| never | 1.28 MB |
| only when loading /config | 633.60 MB |

This require.cache-clearing seems like surprising behaviour from sails, given that node's docs say:

> Modules are cached after the first time they are loaded. This means (among other things) that every call to require('foo') will get exactly the same object returned, if it would resolve to the same file. This is an important feature.

(Source: https://nodejs.org/api/modules.html#modules_caching)

Apart from the memory leak, it also means that any expectation you had that your source files will be parsed/run exactly once may be violated.

ath88 commented 4 years ago

I stumbled upon that very problem when using mongoose, since it utilizes a singleton pattern stored in the module's scope. We did some database interactions before starting sails, which resulted in two connections instead of just one.