Closed chandrashekar-s closed 7 years ago
We will provide a fix soon.
Fixed in 4.0.3
@BrianNichols Can you elaborate a little bit about the actual issue?
I was facing the same issue. I am asynchronously writing a lot of data to Aerospike. If, for some reason, I restart my app and earlier data comes in along with the current stream of data, it throws the same error, Error Code -7. I think this has caused data loss (I did not retry after the first failure).
Also, I used to see a lot of client connections in the AMC dashboard for the AS cluster node, even at times when those connections were not required (say, the app had no data to put), i.e. an idle scenario, for a significant time. That's why I am interested in knowing the actual issue.
Still getting the same error in version 4.0.6:
Error Code -7: Node BAA1963FFA06 172.1.2.1 3000 event loop 2 max connections 3000 would be exceeded.
    at com.aerospike.client.cluster.Node.getAsyncConnection(Node.java:572)
    at com.aerospike.client.async.NioCommand.executeCommand(NioCommand.java:127)
    at com.aerospike.client.async.NioCommand$1.run(NioCommand.java:647)
    at com.aerospike.client.async.NioEventLoop.registerCommands(NioEventLoop.java:219)
    at com.aerospike.client.async.NioEventLoop.runCommands(NioEventLoop.java:182)
    at com.aerospike.client.async.NioEventLoop.run(NioEventLoop.java:165)
I checked the code in the getAsyncConnection method; the problem stated by @chandrashekar-s is fixed there.
Is there any internal config that retries automatically when a connection is not available, or should I just retry myself? Retrying myself will not be efficient, as the retry can hit the same issue when it tries to get a connection.
All I can do is write to a flat file (using log4j) and schedule a job to retry at some later point. Do you know of any other method that would solve this better?
(The flat file suggestion was for my app, not for the AS client.)
Your problem is not related. The error means you are asking for too many connections on an event loop. One connection is needed for each async command. You are setting a max of 3000 connections for a single event loop, which is unusually high. Even so, you are exceeding that limit.
I suspect you are issuing a large number of commands to the event loop without any throttle. See http://www.aerospike.com/docs/client/java/usage/async for an example of how to limit the max number of concurrent commands to an event loop.
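To illustrate the throttling idea in general terms, here is a minimal sketch that caps the number of async commands in flight with a `java.util.concurrent.Semaphore`. It is not the example from the linked page and does not use the Aerospike API: `InFlightLimiter`, `submit`, and `peakInFlight` are hypothetical names, and the command is modeled as any supplier of a `CompletableFuture`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical helper (not part of the Aerospike client): caps async commands in flight. */
final class InFlightLimiter {
    private final Semaphore permits;
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    InFlightLimiter(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    /** Blocks the calling thread until a slot is free, then starts the command. */
    <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> command) {
        permits.acquireUninterruptibly();
        int now = inFlight.incrementAndGet();
        peak.accumulateAndGet(now, Math::max);
        CompletableFuture<T> future;
        try {
            future = command.get();
        } catch (RuntimeException e) {
            // The command never started, so give the slot back immediately.
            inFlight.decrementAndGet();
            permits.release();
            throw e;
        }
        // Free the slot when the command completes, whether it succeeded or failed.
        return future.whenComplete((result, error) -> {
            inFlight.decrementAndGet();
            permits.release();
        });
    }

    /** Highest number of commands observed in flight at once. */
    int peakInFlight() {
        return peak.get();
    }
}
```

Because `submit` blocks when the limit is reached, this sketch assumes it is called from the producer thread that receives new data, not from inside an event loop callback. The limit would be chosen to stay below the per-event-loop connection maximum.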
This way I can limit the number of commands, true. But my app is always receiving new things to write.
Anyway, I don't think it will be an issue now, since the earlier problem was that pool.total was not reduced when a connection was closed.
In case of failures I am writing to a file and will schedule the puts to be retried with a limit. Thanks!
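For reference, a bounded retry with a growing delay could look roughly like the sketch below. It is only an illustration under assumptions: `BoundedRetry`, `backoffMillis`, and `call` are hypothetical names, the operation is modeled as a plain `Supplier`, and the sleep is injected so the delay policy can be swapped out. Nothing here is part of the Aerospike client.

```java
import java.util.function.LongConsumer;
import java.util.function.Supplier;

/** Hypothetical helper: retries an operation a limited number of times with capped backoff. */
final class BoundedRetry {
    /** Delay before the next try: base, 2*base, 4*base, ... never more than capMillis. */
    static long backoffMillis(int attempt, long baseMillis, long capMillis) {
        long delay = baseMillis;
        for (int i = 0; i < attempt && delay < capMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, capMillis);
    }

    /** Runs op up to maxAttempts times; rethrows the last failure if every attempt fails. */
    static <T> T call(Supplier<T> op, int maxAttempts, long baseMillis, long capMillis,
                      LongConsumer sleeper) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (RuntimeException e) {
                last = e;
                // Wait before the next attempt, but not after the final one.
                if (attempt + 1 < maxAttempts) {
                    sleeper.accept(backoffMillis(attempt, baseMillis, capMillis));
                }
            }
        }
        throw last;
    }
}
```

Spacing the retries out gives open connections time to be returned to the pool, so a retry is less likely to hit the same connection limit that caused the first failure.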
Hi, I am using the Aerospike Java Client version 4.0.1 and calling the async APIs of AerospikeClient. In the situation below I see an error: "Caused by: com.aerospike.client.AerospikeException$Connection: Error Code -7: Node BB9F73E0B270008 172.28.128.3 3000 event loop 2 max connections 75 would be exceeded"
Situation: I created 4 event loops and did not change the default 'maxConnsPerNode' value, which is 300. Connection timeouts are left at their default of 1 minute.
Initially, when I submit around 200 async operations, 200 connections are created and the operations execute successfully. After a while, say 5 minutes, by which time the previous connections will have expired (the default expiry time of 1 minute is unchanged), if 200 more async commands are issued, the Aerospike client throws the error reported above.
Ideally, since the old connections have expired, 200 new connections should be created and the commands executed. Instead, this results in the above error.
I had a look at the class com.aerospike.client.cluster.Node:
public final AsyncConnection getAsyncConnection(int index, ByteBuffer byteBuffer) {
    AsyncPool pool = asyncConnectionPools[index];
    ArrayDeque queue = pool.queue;
    AsyncConnection conn;
}

Here I see that before returning a connection, it checks whether any connection in the pool is valid and returns it; otherwise it creates a new connection and returns that. But the counter is not decremented correctly when connections are closed. Hence the connections are not tracked correctly, which results in the situation I have described.
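The effect described above can be reproduced with a toy model of a per-event-loop pool. This is only a sketch under assumptions, not the client's actual code: `CountedPool`, `Conn`, and the `decrementOnClose` switch are hypothetical, and time is passed in explicitly so no real waiting is needed. The numbers follow the report: 75 connections per event loop (300 / 4) and a 1 minute idle limit.

```java
import java.util.ArrayDeque;

/** Toy model of a connection pool with an open-connection counter (illustrative only). */
final class CountedPool {
    static final class Conn {
        long lastUsedMillis;
        Conn(long lastUsedMillis) { this.lastUsedMillis = lastUsedMillis; }
    }

    private final ArrayDeque<Conn> idle = new ArrayDeque<>();
    private final int maxConns;
    private final long maxIdleMillis;
    private final boolean decrementOnClose; // false reproduces the reported leak
    private int total;                      // connections counted as open, idle or in use

    CountedPool(int maxConns, long maxIdleMillis, boolean decrementOnClose) {
        this.maxConns = maxConns;
        this.maxIdleMillis = maxIdleMillis;
        this.decrementOnClose = decrementOnClose;
    }

    /** Reuses a valid idle connection, or opens a new one if the limit allows. */
    Conn acquire(long nowMillis) {
        Conn conn;
        while ((conn = idle.pollFirst()) != null) {
            if (nowMillis - conn.lastUsedMillis <= maxIdleMillis) {
                return conn; // still valid
            }
            close(conn);     // expired, discard it
        }
        if (total >= maxConns) {
            throw new IllegalStateException("max connections " + maxConns + " would be exceeded");
        }
        total++;
        return new Conn(nowMillis);
    }

    /** Returns a connection to the idle queue. */
    void release(Conn conn, long nowMillis) {
        conn.lastUsedMillis = nowMillis;
        idle.addLast(conn);
    }

    private void close(Conn conn) {
        if (decrementOnClose) {
            total--; // without this, expired connections keep counting against the limit
        }
    }

    int total() {
        return total;
    }
}
```

With `decrementOnClose` set to false, 50 connections opened and released at time 0 still count against the limit of 75 five minutes later, so only 25 new connections can be opened before the pool reports that the maximum would be exceeded. With it set to true, the second batch of 50 succeeds.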
Can you please fix this? (If it is already fixed, let me know the fixed version.) Let me know if any further information is needed.
Thanks, Chandrashekar