aerospike / aerospike-client-nodejs

Node.js client for the Aerospike database
https://www.aerospike.com/
Apache License 2.0
201 stars 78 forks source link

Server error in Connect command: [9] #267

Closed industral closed 6 years ago

industral commented 6 years ago

Aerospike Community Edition build 4.3.0.2 (Docker)

$ docker exec -it db aql
Seed:         127.0.0.1
User:         None
Config File:  /etc/aerospike/astools.conf /root/.aerospike/astools.conf
Aerospike Query Client
Version 3.15.3.14
C Client Version 4.3.12
Copyright 2012-2017 Aerospike. All rights reserved.
aql> show namespaces;
+------------+
| namespaces |
+------------+
| "test"     |
+------------+
[127.0.0.1:3000] 1 row in set (0.002 secs)

OK

aql>
$ ncat localhost 3000 -v
Ncat: Version 7.70 ( https://nmap.org/ncat )
Ncat: Connected to ::1:3000.
^C
let client = Aerospike.client({
  hosts: [
    { addr: "127.0.0.1", port: 3000 }
  ],
  log: {
    level: Aerospike.log.TRACE
  }
})

client.connect((error) => {
  if (error) {
    console.error('Connection to Aerospike cluster failed!', error);
    process.exit(-1);
  }
});
npm run start

> api@1.0.0 start /path/to/my/code/api
> node --experimental-modules index.mjs

(node:24368) ExperimentalWarning: The ESM module loader is experimental.
Aug 06 2018 16:56:16 UTC: TRACE(24368) [conversions.cc:181] [get_optional_uint32_property] - port => (uint32) 3000
Aug 06 2018 16:56:16 UTC: TRACE(24368) [conversions.cc:184] [get_optional_uint32_property] - authMode => undefined
Aug 06 2018 16:56:16 UTC: TRACE(24368) [config.cc:82] [config_from_jsobject] - adding host, addr="127.0.0.1", port=3000
Aug 06 2018 16:56:16 UTC: DEBUG(24368) [config.cc:161] [config_from_jsobject] - Parsing global policies: success
Aug 06 2018 16:56:16 UTC: TRACE(24368) [conversions.cc:85] [get_optional_string_property] - systemPath => "/path/to/my/code/api/node_modules/aerospike/modules/lua-core/src"
Aug 06 2018 16:56:16 UTC: DEBUG(24368) [config.cc:185] [config_from_jsobject] - Using default Lua user path: /usr/local/aerospike/client/usr/udf/lua
Aug 06 2018 16:56:16 UTC: TRACE(24368) [conversions.cc:184] [get_optional_uint32_property] - connTimeoutMs => undefined
Aug 06 2018 16:56:16 UTC: TRACE(24368) [conversions.cc:184] [get_optional_uint32_property] - loginTimeoutMs => undefined
Aug 06 2018 16:56:16 UTC: TRACE(24368) [conversions.cc:184] [get_optional_uint32_property] - tendInterval => undefined
Aug 06 2018 16:56:16 UTC: TRACE(24368) [conversions.cc:184] [get_optional_uint32_property] - maxConnsPerNode => undefined
Aug 06 2018 16:56:16 UTC: TRACE(24368) [conversions.cc:184] [get_optional_uint32_property] - maxConnsPerNodeSync => undefined
Aug 06 2018 16:56:16 UTC: DEBUG(24368) [client.cc:82] [New] - Aerospike client initialized successfully
Aug 06 2018 16:56:16 UTC: DEBUG(24368) [events.cc:52] [push] - Cluster event 0 triggered by node "BB9020015AC4202" (172.21.0.2:3000)
Aug 06 2018 16:56:17 UTC: DEBUG(24368) [command.cc:86] [ErrorCallback] - Server error in Connect command:  [9]
Aug 06 2018 16:56:17 UTC: DEBUG(24368) [command.cc:66] [Callback] - Executing JS callback for Connect command
Aug 06 2018 16:56:17 UTC: DEBUG(24368) [client.cc:116] [Close] - Closing the connection to aerospike cluster
Connection to Aerospike cluster failed! { AerospikeError
    at Function.fromASError (/path/to/my/code/api/node_modules/aerospike/lib/error.js:113:19)
    at as_client.connect (/path/to/my/code/api/node_modules/aerospike/lib/client.js:508:28)
    at Client.connect (/path/to/my/code/api/node_modules/aerospike/lib/client.js:507:20)
    at file:///path/to/my/code/api/db/index.mjs:32:8
    at ModuleJob.run (internal/modules/esm/module_job.js:96:12)
  name: 'AerospikeError',
  code: 9,
  command: null,
  func: '',
  file: '',
  line: 0,
  inDoubt: false }
npm ERR! code ELIFECYCLE
npm ERR! errno 255
npm ERR! api@1.0.0 start: `node --experimental-modules index.mjs`
npm ERR! Exit status 255
npm ERR!
npm ERR! Failed at the api@1.0.0 start script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR!     ~/.npm/_logs/2018-08-06T16_56_17_447Z-debug.log

package.json -> "aerospike": "^3.5.0"

Connection is fine, but it has some issue with Server error in Connect command: [9].

P.S. Just checked with v3.4.0 - works fine. Looks like regression with nodejs-client v.3.5.0

jhecking commented 6 years ago

Status code 9 means timeout.

One thing you can do is to use the Aerospike.setDefaultLogging() function to enable trace logging, because that also enables the logs within the Aerospike C client library that's used within the Node.js client's native add-on. This might give us some further insight.

Another minor thing I noticed is, that netcat connects to the server on an IPv6 address ("::1") but the client attempts to connect on an IPv4 address ("127.0.0.1"). Can you try running ncat 127.0.0.1 3000 -v just to double check?

industral commented 6 years ago

Didn't find any changed with Aerospike.setDefaultLogging(Aerospike.log.TRACE);. Same output as above.

ncat 127.0.0.1 3000 -v
Ncat: Version 7.70 ( https://nmap.org/ncat )
Ncat: Connected to 127.0.0.1:3000.
^C
jhecking commented 6 years ago

Didn't find any changed with Aerospike.setDefaultLogging(Aerospike.log.TRACE);. Same output as above.

The function takes the same kind of log parameter object, as you would pass to Aerospike.client(), i.e.:

Aerospike.setDefaultLogging({
  level: Aerospike.log.TRACE
})
industral commented 6 years ago

My bad. Yeah, works now. Not much output though:

Aug 07 2018 07:10:25 UTC: TRACE(42124) [conversions.cc:181] [get_optional_uint32_property] - port => (uint32) 3000
Aug 07 2018 07:10:25 UTC: TRACE(42124) [conversions.cc:184] [get_optional_uint32_property] - authMode => undefined
Aug 07 2018 07:10:25 UTC: TRACE(42124) [config.cc:82] [config_from_jsobject] - adding host, addr="127.0.0.1", port=3000
Aug 07 2018 07:10:25 UTC: DEBUG(42124) [config.cc:161] [config_from_jsobject] - Parsing global policies: success
Aug 07 2018 07:10:25 UTC: TRACE(42124) [conversions.cc:85] [get_optional_string_property] - systemPath => "/path/to/my/code/api/node_modules/aerospike/modules/lua-core/src"
Aug 07 2018 07:10:25 UTC: DEBUG(42124) [config.cc:185] [config_from_jsobject] - Using default Lua user path: /usr/local/aerospike/client/usr/udf/lua
Aug 07 2018 07:10:25 UTC: TRACE(42124) [conversions.cc:184] [get_optional_uint32_property] - connTimeoutMs => undefined
Aug 07 2018 07:10:25 UTC: TRACE(42124) [conversions.cc:184] [get_optional_uint32_property] - loginTimeoutMs => undefined
Aug 07 2018 07:10:25 UTC: TRACE(42124) [conversions.cc:184] [get_optional_uint32_property] - tendInterval => undefined
Aug 07 2018 07:10:25 UTC: TRACE(42124) [conversions.cc:184] [get_optional_uint32_property] - maxConnsPerNode => undefined
Aug 07 2018 07:10:25 UTC: TRACE(42124) [conversions.cc:184] [get_optional_uint32_property] - maxConnsPerNodeSync => undefined
Aug 07 2018 07:10:25 UTC: DEBUG(42124) [client.cc:82] [New] - Aerospike client initialized successfully
Aug 07 2018 07:10:25 UTC: TRACE(42124) [conversions.cc:166] [get_optional_int32_property] - maxCommandsInProcess => undefined
Aug 07 2018 07:10:25 UTC: TRACE(42124) [conversions.cc:184] [get_optional_uint32_property] - maxCommandsInQueue => undefined
Aug 07 2018 07:10:25 UTC: TRACE(42124) [conversions.cc:184] [get_optional_uint32_property] - queueInitialCapacity => undefined
Aug 07 2018 07:10:25 UTC: TRACE(42124) [policy.cc:46] [eventpolicy_from_jsobject] - Parsing event policy: success
Aug 07 2018 07:10:25 UTC: INFO (42124) [as_cluster.c:124] [as_cluster_add_nodes_copy] - Add node BB9020015AC4202 172.21.0.2:3000
Aug 07 2018 07:10:25 UTC: DEBUG(42124) [events.cc:52] [push] - Cluster event 0 triggered by node "BB9020015AC4202" (172.21.0.2:3000)
Aug 07 2018 07:10:26 UTC: DEBUG(42124) [command.cc:86] [ErrorCallback] - Server error in Connect command:  [9]
Aug 07 2018 07:10:26 UTC: DEBUG(42124) [command.cc:66] [Callback] - Executing JS callback for Connect command
Aug 07 2018 07:10:26 UTC: DEBUG(42124) [client.cc:116] [Close] - Closing the connection to aerospike cluster
jhecking commented 6 years ago

Just saw your update that the issue only affects v3.5.0 of the client. This helps narrow it down!

The issue seems similar to another issue we found in the Aerospike C client library earlier this year. I have reached out to our C client developer for help.

jhecking commented 6 years ago

As was pointed out by my colleague, the server seems to advertise a different access address, than the seed address you use on the client side:

Aug 07 2018 07:10:25 UTC: DEBUG(42124) [events.cc:52] [push] - Cluster event 0 triggered by node "BB9020015AC4202" (172.21.0.2:3000)

The server is advertising the address "172.21.0.2". Can you try using that address as the seed address when connecting to the cluster? If that does not work, you will have to change the access-address configuration on the server to use an address that's accessible to the client. (See General Network Configuration docs for reference.)

But you also pointed out that you do not see this issue with client v3.4.0. We suspect that this might be due to a bad interaction with a new feature introduced in C client v4.3.14 (included in Node.js client v3.5.0), which is to try to detect if the given seed node address points to a load balancer. We are still looking into this possibility.

jhecking commented 6 years ago

@industral, have you been able to resolve this issue?

jhecking commented 6 years ago

I'm going to close this issue at this point. Feel free to reopen, if you are still encountering issues.

ratheo commented 6 years ago

I'm also experiencing this issue after upgrading from 3.4. It happens when running the node.js app on a network that does not have access to the 'access-address'. On the aerospike server we've set the 'alternate-access-address' which is accessible by the node.js app, but it does not seem to be able to use that anymore?

jhecking commented 6 years ago

@ratheo, what server version and client version are you using?

Could you please also enable debug logs by calling

Aerospike.setDefaultLogging({ level: Aerospike.log.DEBUG })

before you connect the client. What address do you see in the "Cluster event triggered" message? Does it correspond to your access-address or alternate-access-address that you configure on the server side?

ratheo commented 6 years ago

@jhecking Using server: 'Aerospike Community Edition build 3.14.1.2' and client 3.7.0.

This is the log:

Oct 02 2018 10:39:19 UTC: TRACE(99938) [conversions.cc:181] [get_optional_uint32_property] - port => (uint32) 3000
Oct 02 2018 10:39:19 UTC: TRACE(99938) [conversions.cc:184] [get_optional_uint32_property] - authMode => undefined
Oct 02 2018 10:39:19 UTC: TRACE(99938) [config.cc:62] [config_from_jsobject] - setting seed hosts: "10.7.30.145:3000"
Oct 02 2018 10:39:19 UTC: DEBUG(99938) [config.cc:169] [config_from_jsobject] - Parsing global policies: success
Oct 02 2018 10:39:19 UTC: TRACE(99938) [conversions.cc:184] [get_optional_uint32_property] - connTimeoutMs => undefined
Oct 02 2018 10:39:19 UTC: TRACE(99938) [conversions.cc:184] [get_optional_uint32_property] - loginTimeoutMs => undefined
Oct 02 2018 10:39:19 UTC: TRACE(99938) [conversions.cc:184] [get_optional_uint32_property] - tendInterval => undefined
Oct 02 2018 10:39:19 UTC: TRACE(99938) [conversions.cc:184] [get_optional_uint32_property] - maxConnsPerNode => undefined
Oct 02 2018 10:39:19 UTC: TRACE(99938) [conversions.cc:184] [get_optional_uint32_property] - maxConnsPerNodeSync => undefined
Oct 02 2018 10:39:19 UTC: DEBUG(99938) [client.cc:82] [New] - Aerospike client initialized successfully
Oct 02 2018 10:39:19 UTC: INFO (99938) [as_cluster.c:125] [as_cluster_add_nodes_copy] - Add node BB93FCA2778E702 10.42.45.61:3000
Oct 02 2018 10:39:19 UTC: DEBUG(99938) [events.cc:57] [push] - Cluster event 0 triggered by node "BB93FCA2778E702" (10.42.45.61:3000)

But 10.42.45.61 (the access-address) is an ip inside a docker overlay network and not accessible by the client, it should be using the seed host: 10.7.30.145 (alternate-access-address)

jhecking commented 6 years ago

And just to double check: The only change you did in your environment was to upgrade the Node.js client from v3.4.0 (or v3.4.1) to v3.7.0, correct? The server version nor config did not change.

ratheo commented 6 years ago

That is correct, and if I downgrade back to 3.4 it works again.

jhecking commented 6 years ago

@industral, @ratheo, I've tracked down the cause of this issue. It's indeed due to the new load balancer detection logic introduced in C client v4.3.14 / Node.js client v3.5.0. The short version is that the Node.js client never fully supported the use of alternate-access-address. (Which is why I was confused when you said this worked in v3.4.0 for you.) But if you specified the servers alternate-access-address as the seed address, this kind of worked anyway - as long as your seed address list included the addresses of all your cluster nodes and you never added new nodes to your cluster dynamically.

I have implemented support for using the alternate access address in #283. If possible, can you try to build a client from that branch and then set the new config flag useAlternateAccessAddress to true and see if this works for you?

ratheo commented 6 years ago

That does indeed work 👍

jhecking commented 6 years ago

I just released Node.js client v3.7.1 which includes the fix for this issue.