jestjs / jest

Delightful JavaScript Testing.
https://jestjs.io
MIT License
44.21k stars 6.46k forks source link

Jest leaks memory from required modules with closures over imports #6814

Open pk-nb opened 6 years ago

pk-nb commented 6 years ago

🐛 Bug Report

We are running into a issue in Jest 23.4.2 where Jest leaks and runs out of memory when running a moderately hefty test suite (~500 to 1000 test files). I believe I have isolated this to the require system in Jest and it is not the fault of other packages. Even with the most minimal recreation it leaks around 2-6MB per test.

This is very similar to https://github.com/facebook/jest/issues/6399 but I opted to make a new issue as I think it's not specific to packages or the node environment. I think it's the also the source or related to the following issues as well but didn't want to sidetrack potentially different issues and conversations.

https://github.com/facebook/jest/issues/6738 https://github.com/facebook/jest/issues/6751 https://github.com/facebook/jest/issues/5837 https://github.com/facebook/jest/issues/2179

This is my first time digging into memory issues, so please forgive me if I am focusing on the wrong things!

Link to repl or repo

I have created a very minimal reproduction here: https://github.com/pk-nb/jest-memory-leak-repro. You should be able to run and see heap grow and also debug it with the chrome node devtools. With the reproduction, we can see this happens in both JSDOM and node environments in Jest.

screen shot 2018-08-07 at 7 43 02 pm screen shot 2018-08-07 at 6 49 44 pm

To Reproduce

Simply run a test suite with tests that require in a file that creates a closure over an imported variable:

// sourceThatLeaks.js

const https = require('https');

let originalHttpsRequest = https.request;

https.request = (options, cb) => {
  return originalHttpsRequest.call(https, options, cb);
};

// If this is uncommented, the leak goes away!
// originalHttpsRequest = null;
// 1.test.js, 2.test.js, ...

require("./sourceThatLeaks");

it("leaks memory", () => {});

While every worker leaks memory and will eventually run out, it is easiest to see with --runInBand.

Note that we are not doing anything with require to force a reimport—this is a vanilla require in each test.

When run with jasmine, we can see the issue go away as there is no custom require implementation for mocking code. We also see the issue disappear if we release the variable reference for GC by setting to null.

I believe the closure is capturing the entire test context (which also includes other imports like jest-snapshots) which quickly adds up.

Expected behavior

Hoping to fix so there is no leak. This unfortunately is preventing us from moving to Jest as we cannot run the suite on our memory bound CI (even with multiple workers to try to spread the leak).

I'm hoping the reproduction is useful—I spent some time trying to fix with some basic guesses at closures but ultimately am in over my head with the codebase.

You can see the huge closure in the memory analysis so I'm inclined to think it's some closure capture over the require implementation and/or the jasmine async function (promise).

screen shot 2018-08-07 at 6 52 50 pm screen shot 2018-08-07 at 4 45 32 pm screen shot 2018-08-07 at 7 13 23 pm

Some leak suspects:

These are educated guesses, but there are quite a few closures within the runtime / runner / jasmine packages though so it's very difficult (as least for me being new to the codebase) to pinpoint where the capture lies. I'm hoping that there's a specific point and that each closure in the runtime would not present the same issue.

Our suite

I have ensured the issue stems from Jest and not our suite—I ran the old suite (mocha) and saw a healthy sawtooth usage of heap.

Run npx envinfo --preset jest

▲ npx envinfo --preset jest
npx: installed 1 in 2.206s

  System:
    OS: macOS High Sierra 10.13.6
    CPU: x64 Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
  Binaries:
    Node: 8.11.3 - ~/.nodenv/versions/8.11.3/bin/node
    Yarn: 1.9.4 - ~/.nodenv/versions/8.11.3/bin/yarn
    npm: 5.6.0 - ~/.nodenv/versions/8.11.3/bin/npm
  npmPackages:
    jest: ^23.4.2 => 23.4.2

Please let me know if I can help in any way! I'd really love to get our company on Jest and am happy to help where I can. Thanks @lev-kazakov for the original isolation repro.

aaronabramov commented 6 years ago

@mjesun i think that's it. i ran into the same issue when updating jest internally at facebook

mjesun commented 6 years ago

Wow, that's an awesome report, @pk-nb! @aaronabramov can we use --detectLeaks against Jest and bisect to the commit?

arcanis commented 6 years ago

I'm not sure I follow this repro. You're doing this:

// sourceThatLeaks.js

const https = require('https');

let originalHttpsRequest = https.request;

https.request = (options, cb) => {
  return originalHttpsRequest.call(https, options, cb);
};

// If this is uncommented, the leak goes away!
// originalHttpsRequest = null;

It seems expected to have a leak, since you patch https.request, but never reset it. When I look from within Jest, the https module returned seems to be the actual one, not a mock, so each time this code is executed, the previous patch is repatched, causing the memory to grow more and more, the patches never getting released.

When I look with Jasmine, the memory does go up too, just much slower for two reasons:

I also tried your code on 23.0.0, and it seems to also reproduce the behavior, suggesting it always worked this way. Am I missing something?

pk-nb commented 6 years ago

Hi @arcanis —thanks for taking the time to look at this.

I will certainly admit the example given is not great code, but is an example of dependencies found in the wild that aim to mock or extend, including graceful-fs, node-agent, and supertest. I assume there are quite a few packages like this in the wild. We are just porting to Jest now and not comparing this to previous versions of Jest—I'm not sure what version @aaronabramov is updating from.

What's interesting is that I could not repro with the buffer suggestion you posted. Even though it is in the closure, it is being released in a sawtooth.

const https = require('https');

let buffer = Buffer.alloc(1000000000)

https.request = (options, cb) => {
  console.log(buffer.byteOffset, nodeBuffer.length);
  return cb(buffer.byteOffset);
};
screen shot 2018-08-13 at 12 54 02 pm

I updated suite 01 and 03 in this repro branch if you want to check. https://github.com/pk-nb/jest-memory-leak-repro/tree/external-buffer. I'm not sure why this would happen—maybe v8 can be more intelligent with optimizing out the call or something.

Ignoring the buffer—with respect, I think the Jest require system should aim to support all of the node require ecosystem, including these "singleton" packages that assume a require cache. I imaging this is tricky with the separate Jest environments but hopefully could be supported by cleaning / flushing the requires in Jest. Otherwise the above node packages end up unusable in Jest.

mjesun commented 6 years ago

We tried getting rid of the leak of modifying global objects in #4970, but that caused massive failures. Node has a very particular process of booting up, which includes a stateful ping-pong between the JS and the Native side, which made this impossible.

That said, as per @aaronabramov's comment, we know we've recently introduced a leak somewhere else too.

pk-nb commented 6 years ago

Thanks for the context @mjesun, didn't realize this had been attempted before. For anyone checking on the issue, here's the revert PR https://github.com/facebook/jest/pull/5009.

Maybe an obvious / bad idea, but could we simply replace the core module objects following each test to release any global modifications?

Something like:

coreModules.forEach(() => delete require.cache[key])

This in addition to setting all custom requires to null after a test. Not sure if the core modules have special handling that makes the cache invalidation not work. It's probably more complicated judging from that PR though.

arcanis commented 6 years ago

What's interesting is that I could not repro with the buffer suggestion you posted. Even though it is in the closure, it is being released in a sawtooth.

Because in this new code you're always replacing the previous hook:

https.request = (options, cb) => {
  console.log(buffer.byteOffset, nodeBuffer.length);
  return cb(buffer.byteOffset);
};

Since you overwrite https.request, the previous function gets garbage collected, and since the previous function gets garbage collected, the buffer it used gets released. To validate my theory I used a code similar to this:

let originalHttpsRequest = https.request;
let buffer = new Buffer(1024 ** 3);

https.request = (... args) => {
  return originalHttpsRequest.call(... args, buffer);
};
siluri commented 6 years ago

Thank you for taking care of the problem. Here's my repo on bug #6738 , I hope it helps to recreate the problem.

pk-nb commented 6 years ago

Wanted to leave an update—we found our accidental capture of http from the nock package. We accidentally were only calling cleanAll, which does not release the http modifications like nock.restore does. Calling this allowed our test suite to release the full set of requires in each suite (a lot of memory each suite).

I can close this if you'd like as this was a user issue (I hope this is a good google reference for others who made the same mistake we did). It's still a bummer that libraries that don't have a mechanism for releasing global modifications will be a blocker for others. I'm not sure what is actionable by Jest outside of moving to a environment reset per test suite (maybe by resetting / reimporting all built in modules?) or running in separate threads so that leaks can't continue.

SimenB commented 6 years ago

We have the same issue with graceful-fs not releasing fs. It would indeed be awesome tom somehow force drop or at least flag it when stuff like this happens

siluri commented 6 years ago

@pk-nb do u have a workaround for that?

kirillgroshkov commented 6 years ago

Facing the same issue with fs-extra, waiting for a good workaround/fix. Meanwhile downgrading to native fs module in problematic places.

markonyango commented 5 years ago

Seeing that graceful-fs fixed this in 4.1.12, is it somehow possible to roll out a hotfix for jest < 24.x as this is preventing my company from transitioning our test suite to jest from jasmine.

thymikee commented 5 years ago

We use graceful-fs@^4.1.11. 4.1.12 is in semver range of that, so you just need to clean install jest.

thymikee commented 5 years ago

BTW, can anybody confirm if this indeed fixes the issue and we can close it?

markonyango commented 5 years ago

@thymikee Thank you for the quick answer. Sadly it appears to not fix the issue. After ~1000 tests I start maxing out my remaining RAM which is about 10 - 12 GB. The trend of ever-increasing memory-usage is strictly linear which should be further proof that this is not a memory leak on the tests end.

SimenB commented 5 years ago

I'm pretty sure jest itself doesn't leak. Do you have some setup/teardown you run for every single test? Or are you able to put together a reproduction?

markonyango commented 5 years ago

It is an Angular project, so in most of the test files I use the TestBed to configure my testing module. Other than that we don't use any setup/teardown. Am I potentially not respecting any special rules for Angular unit tests with Jest?

describe('TestComponent', () => {
  let component: TestComponent;
  let fixture: ComponentFixture<TestComponent>;

  beforeEach(() => {
    const testHandlerStub = {};
    TestBed.configureTestingModule({
      imports: [TestingModule, StoreModule.forRoot(store)],
      declarations: [TestComponent],
      schemas: [NO_ERRORS_SCHEMA],
      providers: [{ provide: TestHandler, useValue: testHandlerStub }],
    });
    fixture = TestBed.createComponent(TestComponent);
    component = fixture.componentInstance;
  });

  it('can load instance', () => {
    expect(component).toBeDefined();
    expect(component).toBeTruthy();
  });

  describe('ngOnInit()', () => {
    it('triggers initializing', () => {
      const spy: any = spyOn(component.handler, 'init');
      component.ngOnInit();

      expect(spy).toHaveBeenCalled();
    });
  });
});

Most (70%) of our tests look somewhat like this.

lev-kazakov commented 5 years ago

@markonyango any non-native node module which decorates a native module will leak in jest. see https://github.com/facebook/jest/issues/6399#issuecomment-415083479 for details. graceful-fs and agent-base are 2 common libraries that do such things. i've created jest-leak-fixer some while ago in order to mitigate this issue. it is a hack. it patches graceful-fs and agent-base so not to leak. since graceful-fs have introduced a hack of their own you can fork jest-leak-fixer (or PR), remove the code that handles graceful-fs, and try running your test suite again. if this does not help, you can help us mapping other leaking modules by running jest-leak-fixer tests against your dependencies graph, one dependency at a time.

markonyango commented 5 years ago

@lev-kazakov Sadly, implementing the jest-leak-fixer according to your README did not solve the problem. I would help you running your tests against our dependencies graph if you could instruct me how to do it. If you could send me a message with instructions, I could help you there.

avaly commented 5 years ago

Possibly related to this: https://github.com/nock/nock/issues/1448

kirillgroshkov commented 5 years ago

any updates? We still face memory leaks from time to time, especially in a memory-constrained CI environment (leaks add up and crash at some point)

lev-kazakov commented 5 years ago

@mjesun can you help finding someone relevant to review https://github.com/facebook/jest/pull/8331? it's a kind of a hack that allows mitigating these kind of memory leaks, till we come up with something smarter to do.

stuart-clark-45 commented 4 years ago

For anyone else getting hit by this: I used @B4nan's pre test script ahead of running my tests and this seemed to stop the memory leak https://github.com/mikro-orm/mikro-orm/commit/5e848473cffafb5ae981344cbe1868b987b4871a

lifeiscontent commented 4 years ago

this issue has also been surfacing in GitHub Actions in this project here: https://github.com/lifeiscontent/realworld/pull/229/checks?check_run_id=826716910

Chnapy commented 4 years ago

In my team we have the same issue, causing our tests failing with a 'out or memory' error. After investigations it seems lib aws-xray-sdk to be the cause, comment any references to it fix the issue.

fazouane-marouane commented 4 years ago

We run our tests using plain old unix tools to avoid running into these memory leak issues. In case it helps someone:

It's a combination of find, sort and xargs. It finds all test files, sorts them randomly, then runs 4 instances of jest in parallel, each will test chunks of 5 files.

find ./tests -type f \( -name \*.test.ts -o -name \*.test.tsx \) | sort -R | xargs -L 5 -P 4 jest --runInBand --logHeapUsage
stuart-clark-45 commented 4 years ago

EDIT

ignore the below the issue is back :(


TLDR: using a NodeEnvironment was the problem.

My two cents on this, I was using a NodeEnvironment for my tests:

https://jestjs.io/docs/en/configuration#testenvironment-string

In here I was setting up the database and exposing through globals, dropping it etc. before each of my tests ran.

In the end I ditched this in favour of having a helper class which I just instantiate in the set up of each of my test files. This instance gets garbage collected without any issues.

Realise this probably isn't the most explicit comment but can't post any code sadly.

stuart-clark-45 commented 3 years ago

Ok so got a work around which I hope may be useful to others.

You can use this solution to run your tests in batches, one jest process per batch. The processes are run sequentially so should keep the memory usage down. Not ideal but functional and I needed a quick fix as this was issue was preventing us merging any PRs.

// run-tests-ci.ts

import {execSync} from "child_process";
import glob from "glob";

const shell = (cmd) => execSync(cmd, {stdio: "inherit"});

const run = async () => {
  logger.info(
    "Running tests in batches to avoid memory leak issue... this means you will see jest start up multiple times.",
  );

  const files = await glob.sync("./test/**/*.test.ts");

  const batchSize = 20;
  let batch: string[] = [];
  const runBatch = () => {
    if (batch.length) {
      shell(`jest ${batch.join(" ")}`);
      batch = [];
    }
  };

  for (const file of files) {
    batch.push(file);
    if (batch.length === batchSize) {
      runBatch();
    }
  }

  runBatch();
};

run().catch((err) => {
  logger.error(err);
  process.exit(1);
});

package.json

{
  "scripts": {
    "test-ci": "ts-node test/run-tests-ci.ts",
  },
// etc.
}

then run with npm run test-ci

elijahiuu commented 3 years ago

@stuart-clark-45 how do you combine the output from mulitiple --coverage's?

github-actions[bot] commented 2 years ago

This issue is stale because it has been open for 1 year with no activity. Remove stale label or comment or this will be closed in 14 days.

SimenB commented 2 years ago

I think as mentioned a few times, this is not a bug in Jest per say, but the way Jest works (its own module cache, different globals per test file etc.) can sometimes violate assumptions made by libraries when they patch/override/add to globals (either core modules or globalThis).

Not sure what we can do in Jest to detect this and warn, instead of just leak. One thing is #8331 for modules and potentially trying to add a Proxy around the global inside the scripts, but I'm not sure it's a particularly good idea. Much (probably by far most) modifications are perfectly safe to make, so there'd be a lot of false positives.

oskarols commented 2 years ago

@SimenB If it's innate to the implementation of Jest that a lot of larger code bases might run into these problems then I think it would be very helpful to actually have more details in the docs on why this would happen and common ways of debugging it (beyond the existing CLI args).

As it currently stands there's not a lot of detail on how the module loading works in the docs. And that's understandable in the sense that users "shouldn't have to know how it works". However if it is the case that having somewhat of an understanding of at least the general working of Jest can help users write more performant tests and understand why memory leaks can easy become an issue, then it's perhaps worth reconsidering.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 1 year with no activity. Remove stale label or comment or this will be closed in 30 days.

gentoo90 commented 1 year ago

This issue is stale because it has been open for 1 year with no activity. Remove stale label or comment or this will be closed in 30 days.

Oh no, you don't.

linhx commented 1 year ago

for the agent-base, I'm using jest.mock to mock the patch-core module. It's safer than replacing content since the original file may not be replaced back.

// package.json

    "setupFilesAfterEnv": [
      "./test/setup-after-env.ts"
    ],
// test/setup-after-env.ts
jest.mock('agent-base/patch-core', () => ({}));