brianc / node-pg-pool

A connection pool for node-postgres
MIT License
180 stars 64 forks source link

idleListener no longer grabs references to things it doesn't need #83

Closed lykkin closed 5 years ago

lykkin commented 6 years ago

Hello! I'm one of the maintainers of New Relic's Node.js agent, and I ran into an issue with our promise instrumentation using async_hooks along with this library. The promise instrumentation has to wait for some promises to be garbage collected before it can clear them as finished and drop the context it is holding onto for that promise. The issue we were observing was the idleListener closure was holding onto a reference to the callback passed into connect, which was holding onto a reference to a promise. This change only passes in the necessary data into the idleListener.

brianc commented 6 years ago

Thanks a lot for the PR as well as the description! I'll push a new patch version out right after CI passes.

lykkin commented 6 years ago

@brianc Thank you!

brianc commented 6 years ago

Looks like CI is failing for some reason on node v4.x

lykkin commented 6 years ago

Interesting, this passes locally for me.

sehrope commented 6 years ago

@lykkin said:

Interesting, this passes locally for me.

Haven't looked through this piece of the code but sounds like a race condition.

Here's the error from Travis:

TypeError: Cannot read property 'release' of undefined
      at Context.<anonymous> (test/bring-your-own-promise.js:20:12)
      at next (native)
      at onFulfilled (node_modules/co/index.js:65:19)

May want to check if the pool is removing the release() property on cleanup and it's being called later somewhere else.

lykkin commented 6 years ago

The issue was coming from connecting throwing an error, which rejected the promise and returned undefined here: https://github.com/brianc/node-pg-pool/blob/master/test/bring-your-own-promise.js#L13

It appears to be failing due to the different versions of postgres used. If I swap the version being used on node v4 to postgres 9.4 it passes. Though it should be noted that the first bring-your-own-promise test isn't connecting properly to the db. Is that intended? Also is there a reason the v4 tests are pinned to using postgres 9.1 instead of 9.4 like the node v6 tests?

Here is the test run on v4 using postgres 9.4: https://travis-ci.org/lykkin/node-pg-pool/jobs/302095878

lykkin commented 6 years ago

It appears that postgres 9.1 is no longer installed on the travis boxes, so I bumped the v4 tests to use 9.2 instead.

lykkin commented 6 years ago

@brianc Just checking in, is the postgres bump a good enough fix for the test failures, or would you rather the v4 tests install postgres 9.1?

Neamar commented 6 years ago

@brianc I've been seeing some leaks related to the combination of NewRelic and PG, any chance this can be merged and published to npm? Thanks!

zackarychapple commented 6 years ago

Hope this can get published, this is critical for resolving a memory leak with newrelic.

jsgithub1 commented 6 years ago

@brianc any update on when this may get merged? per the comments above there are a number of folks anxiously awaiting it...

danconnell commented 6 years ago

@brianc Can this please get merged?

greatjapa commented 6 years ago

Any news? We've facing this problem as well

yocontra commented 6 years ago

@brianc If you need additional collaborators for this repo I'm happy to help.

brianc commented 6 years ago

Sorry for the delay. I'm going to release a new version today w/ this fix. I've also added @charmander as a collaborator on this repo and made him a co-owner of the pg-pool npm module so hopefully I wont be a blocker in the future.

gabegorelick commented 6 years ago

Any reason this hasn't been merged yet? Let me know if there's anything the community can do to help.

danconnell commented 6 years ago

Can this please be merged? @brianc @charmander

charmander commented 6 years ago

There’s not much point in having a merge without a publish, so you don’t need to mention me =)

Dakkers commented 6 years ago

@charmander can you not get publish permissions? what's the point of being a collaborator without publish permissions?

charmander commented 6 years ago

@Dakkers repo maintenance

Nevon commented 6 years ago

@brianc: Any chance this is getting merged and released? Apparently charmander doesn't have the rights to publish new versions.

hartzis commented 6 years ago

@brianc it does look like you meant to give @charmander npm publish rights here -> https://github.com/brianc/node-pg-pool/pull/83#issuecomment-386686129

vikasj commented 6 years ago

eagerly waiting for this

ax850 commented 6 years ago

also eagerly awaiting for this.

danconnell commented 5 years ago

@brianc Since you're active on GitHub today, any chance this can get merged and released? Pretty please...

yocontra commented 5 years ago

@brianc Would you be open to putting node-postgres and related repos into a foundation? Would give you some extra support maintaining things.

charmander commented 5 years ago

Is it possible to write a test for this?

lykkin commented 5 years ago

@charmander Unfortunately I don't think that's possible. The only way I can imagine writing a test for this would be to use a WeakSet (since this won't stop the value that is being leaked from getting garbage collected), then to check if that set has the value you expect to be GC'd in it; however, this implies you have to hold onto a reference to the value to check against the set, stopping the value from being GC'd anyways.

If there is need I could add a native component to test this at the v8 level, but that sounds like more headache than it is worth to me.

charmander commented 5 years ago

@lykkin So async_hooks doesn’t need to be involved?

lykkin commented 5 years ago

@charmander Oh boy, sorry for the spam. Github was giving me no feedback except an availability error when posting.

To your question: No, this will leak anything that is captured in the lexical scope of cb in connect. The symptom we were seeing was a promise being captured by the closure passed into cb which was captured by the idle listener. The async_hooks part was specific to our agent.

lykkin commented 5 years ago

I would expect the test to go something like:

Capture a weak reference to an object declared in the same scope as the callback to pool.connect, and define a callback to flip a boolean that tracks the lifetime of the potentially leaked object (this is all in a native component). Call pool.connect and schedule a timer to expire to drop any local references to the potentially leaked object. In the callback to the timer, run a manual GC then check that the boolean on the native component has been flipped.

My only fear is it might not be worth the pain to deal with maintenance of a native module just for testing something that small.

Dakkers commented 5 years ago

ah dang, we missed the 1-year anniversary! 🎉

gabegorelick commented 5 years ago

A lot of people are eagerly awaiting this fix. I, along with many others in this thread, can confirm that this patch fixes the memory leak issue. While I sympathize with the desire to have a test case to prevent this regressing in the future, it doesn't seem like that should block the release of this fairly minor change.

brianc commented 5 years ago

Long overdue. Merging & releasing. Today.

gabegorelick commented 5 years ago

Thank you @brianc. I know dealing with these types of issues can be a pain.

brianc commented 5 years ago

You had a good point - there's diminishing returns with testing some things, particularly memory or performance related, in a CI way! Maybe in the glorious future it'll be easier. Sorry for taking so long to release this one....I pushed out a new version w/ this fix: pg-pool@2.0.6