brianc / node-libpq

Simple, low level native bindings to PostgreSQL's libpq from node.js
111 stars 41 forks source link

Fix Node.js 10, 11 and 12 compatibility #67

Closed darkgl0w closed 5 years ago

darkgl0w commented 5 years ago

Hello ^^

This PR should at least :

Feel free to ping me if changes needs to be made in order to merge this PR. ^^

Paxa commented 5 years ago

Any idea why it give Segmentation fault for node 12? It works on mac

Equals182 commented 5 years ago

I think I got it working for me and I don't have any errors building libpq on node:12-alpine.

When you look at Travis' job 184.5 for Node 12, you may see Segmentation fault there, but I think it's not the root of the problem. You may also notice that 'MakeCallback' is marked as depricated and I think this is the problem.

So i replaced deprecated method on line 804 in connection.cc with this code:

Nan::AsyncResource *async_resource = new Nan::AsyncResource("libpq:connection:emit");
async_resource->runInAsyncScope(handle(), emit_f, 1, info);

And error's gone.

Now keep in mind, that I'm in no way a C++ developer, I don't have any experience in nodejs/nan, but I just thought maybe you'll find my 'solution' useful 🤷‍♂

darkgl0w commented 5 years ago

@Paxa > I need to find the time to investigate this. @Equals182 > MakeCallback is marked as deprecated but it works, that's not the problem it just emits a warning.

Currently I got the branch to compile and pass all the tests successfully in my local machines (in all node versions starting from v4 to v12 on Windows, Mac OS and Linux), the segmentation fault occurs only in Travis and I can't reproduce it in all my environments.

I opened a Github repo in order to try to start a Travis debug build environment. Maybe I should be able to find something useful with lldb.

If you want to give it a try you can clone this one https://github.com/darkgl0w/experiment-node-libpq-fork

Paxa commented 5 years ago

I got some progress using https://github.com/ddopson/node-segfault-handler

It fails when using openssl, with PGSSLMODE=disable works fine. Updating to newer linux & openssl & libpq doesn't help https://travis-ci.org/Paxa/node-libpq/builds/542443882

darkgl0w commented 5 years ago

Good work @Paxa !

This is interesting, so I guess it's somewhat related to Node.js upgrade of openssl sources to 1.1.1b (as you can see, the CI didn't fail when using Node.js 10.15.3 and it began to fail when running tests on Node.js 10.16.0).

By default PostgreSQL first try an SSL connection and if that fails, try a non-SSL connection. Setting PGSSLMODE=allow will first try a non-SSL connection and if that fails, try an SSL connection.

There must be something else we are missing seen that I can't reproduce this fail on my dev environments even with the default PGSSLMODE=prefer.

@brianc > WDYT is it ok to you ?

Paxa commented 5 years ago

I added console to print error message, may be we should change assert to see error message

// from
assert(!err);
// to
assert.equal(err, null);

Strange thing for me is that between libpq and libssl there is node/v10.16.0/bin/node(SSL_new...) I though it should call libssl directly but it find function from nodejs

/home/travis/build/Paxa/node-libpq/node_modules/segfault-handler/build/Release/segfault-handler.node(+0x299c)[0x7f98e821e99c]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f98e97fe390]
/lib/x86_64-linux-gnu/libssl.so.1.0.0(+0x243ea)[0x7f98e236f3ea]
/lib/x86_64-linux-gnu/libssl.so.1.0.0(+0x21f26)[0x7f98e236cf26]
/lib/x86_64-linux-gnu/libssl.so.1.0.0(+0x2c619)[0x7f98e2377619]
/home/travis/.nvm/versions/node/v10.16.0/bin/node(SSL_new+0x49f)[0x138590f]  <<<<<
/usr/lib/x86_64-linux-gnu/libpq.so.5(+0x250bc)[0x7f98e25d90bc]
/usr/lib/x86_64-linux-gnu/libpq.so.5(+0x26915)[0x7f98e25da915]
/usr/lib/x86_64-linux-gnu/libpq.so.5(PQconnectPoll+0xcb0)[0x7f98e25c2650]
/usr/lib/x86_64-linux-gnu/libpq.so.5(+0xf4fc)[0x7f98e25c34fc]
/usr/lib/x86_64-linux-gnu/libpq.so.5(PQconnectdb+0x27)[0x7f98e25c6267]
/home/travis/build/Paxa/node-libpq/build/Release/addon.node(_ZN10Connection9ConnectDBEPKc+0x12)[0x7f98e8013fa2]

As I see nodejs uses openssl 1.1.1b, but libpq https://travis-ci.org/Paxa/node-libpq/jobs/542717943libssl.so.1.0.0

After digging deeper, turns our nodejs comes with statically linked openssl (for node 10+ it's openssl 1.1), and libpq is linked to shared openssl 1.0.0 (in case of ubuntu and debian in travis) and then they collide on same method name.

I see few possible solutions:

  1. Compile node-libpq with static linking (not sure if possible)
  2. Check at compile time or run time that both node and node-libpq uses openssl 1.1 or both uses 1.0
  3. Any ideas?
Paxa commented 5 years ago

Btw I prefer to have proper solution, so other people can use node-libpq and on their servers with different versions of libs (without disabling openssl)

Flarna commented 5 years ago

As far as I know node has it's own ssl version included and exports it's ssl (and also zlib) symbols. If the ssl version needed/expected by this addon differs in ABI from that one exported by node its likely a problem. Maybe statically linking ssl and not exporting symbols ('cflags': ['-fvisibility=hidden']) could help.

Paxa commented 5 years ago

Related https://github.com/nodegit/nodegit/issues/1490 and https://github.com/nodejs/node-gyp/issues/1559

Paxa commented 5 years ago

Other option - build libpq from source linking to openssl 1.1 and may be ship precompiled binaries with https://github.com/mapbox/node-pre-gyp Any idea how to make it detect if existing libpq.so is linked to openssl 1.1 and recompile only when necessary?

Paxa commented 5 years ago

I managed to install local version of openssl and libpq in CI and it works https://travis-ci.org/Paxa/node-libpq/builds/542945099 but ldd still show it uses system library, still not sure how to make it always choose custom built libs... (without setting LD_LIBRARY_PATH)

Overall for this problem, need to make decision how to handle this openssl version conflict, I hope we can do better then segfault. Not sure what will be better for users, may be always disable libpq ssl if openssl version mismatch or throw error when trying to connect with ssl enabled, or refuse to compile and suggest users how to solve it (install libpq with libssl 1.1) or make it use precompiled binaries when mismatch detected (then it may be different then local pg_config settings)

mcollina commented 5 years ago

Any updates on this?

darkgl0w commented 5 years ago

Hello ^^

Awesome work @Paxa ! I took a look at your repo and so far it helped me narrow down the problem to this specific test and this specific piece of code where the lib struggles when connection params are hard-coded.

Regarding your solution it appears that it's a placebo effect triggered by setting PGHOST=/var/run/postgresql and it's the same when setting PGSSLMODE=allow. you can see a similar result here. It can work as a workaround to pass Travis CI tests but the source of this segfault will remain :/

TBH for now I don't have a proper solution to propose even after more than a hundred attempts :/

So the work still goes on this and tomorrow I will try to spot the diff between my environments and Travis, maybe I will figure out what's the difference. ^^

NB: there is also a strange difference between the CI running here and in ours forked repos: here is mine.

Paxa commented 5 years ago

As I understand - the problem not with specific function or env variable but with conflict of native functions. Libpq uses SSL_new function from openssl, but nodejs already have SSL_new function inside itself. So libpq mistakenly call SSL_new function inside node. (builtin openssl)

Even if we make it somehow call right function, other extensions may get error because we have openssl 1.0 and openssl 1.1 in memory at same time. The proper solution would be to use libpq linked to same major openssl version as nodejs (e.g. both 1.0.x or both 1.1.x) On some linux distros it will require to compile libpq from source (e.g. ubuntu in travis-ci)

darkgl0w commented 5 years ago

Hello,

@Paxa > I finally made it work, it's a bit heavy on the CI due to the fact that it needs to compile OpenSSL 1.1.1b and libpq from sources on Node.js >=10.16.0 but now it works as it should despite the aging environment of Travis ^^'.

@brianc > I think it's all good now, can you take a look at this PR in order to merge it as fast as possible ?

Feel free to ping me if you find something that needs to be changed. ^^

Paxa commented 5 years ago

Do you think we should detect if openssl versions mismatch and show warning in run time or compile time? IMO people don't read docs until it's broken

darkgl0w commented 5 years ago

IMO people don't read docs until it's broken

Yeah that's not a bad idea :rofl: Maybe we can implement a binding.gyp action WDYT ? Do you have a suggestion in mind @Paxa ? If it can prevent the build to start and show a big warning it will be better IMO. :d

brianc commented 5 years ago

This is incredible! I'll review this over the weekend as it will likely take some time.

brianc commented 5 years ago

Update: was my b-day this weekend (🍰) and had less time than I thought. Tomorrow morning I'll come into the office early & take a look at this. ❤️

Paxa commented 5 years ago

Happy birthday @brianc!

brianc commented 5 years ago

Thanks @Paxa! 😄

@darkgl0w - this is so awesome...thanks for doing this! I honestly touch C/C++ these days so little that every time I have to I get a bit scared...so this was like a little birthday present from you to me! 🎁 I really appreciate the help here! If you have any interest in becoming a maintainer of the native bindings for node-postgres please don't hesitate at any time to get in touch...would love for someone w/ more up to date & polished C/C++ to help do some housekeeping like this here! ❤️

brianc commented 5 years ago

@darkgl0w - question...pg-native (and subsequently node-postgres) are failing travis w/ a segfault now...

https://travis-ci.org/brianc/node-pg-native/jobs/550326804#L524

Do you think it's related to needing to make sure the proper version of libssl is installed there as well?

Paxa commented 5 years ago

Definitely. If we push code like this it will break all depending projects on travis-ci (travis-ci uses SSL by default) and probably some in production. (on dev computer we usually don't use SSL for db connection but in production we may)

Paxa commented 5 years ago

How can we verify if '-fvisibility=hidden' works?

Flarna commented 5 years ago

How can we verify if '-fvisibility=hidden' works?

I think you can use nm to check for exported symbols, e.g. nm -D <XXX.node> | grep " [^U] <pattern of symbols which should not be exported, e.g. OPENSSL.*>".

darkgl0w commented 5 years ago

Hello ^^

@brianc > yes that's definitely related.

From what I saw when I was working on this the real culprit here is libpq that absolutely must be compiled with an OpenSSL version compatible with the one bundled by Node.js.

It's kind of annoying and heavy on the CI as the only "workaround" I found for Travis is those extra step to compile OpenSSL first and then libpq.

DimaNazdratenko commented 4 years ago

I still have the problem. I used pg-native in the pg-promise options. It worked well on node <10.15.3, but when I upgraded node to >=10.16.0 I have error segmentation fault. Please help me guys @Paxa @brianc @darkgl0w https://github.com/brianc/node-pg-native/issues/87#issue-574079207