Closed by MachinesAreUs 5 years ago
Environment:
Hmm. Thank you for the detailed report, and I'm sorry for the trouble. I'll have to find the time to dedicate to this issue, but as a quick step forward, can you please try the driver code from the master branch?
For some reason, when I tried to update to 0.5 (and failed, because it isn't on hex.pm and I hadn't read about the dependency on the db_connection/master issue), I assumed we were already using the db_connection pooling. Wrong. I've just switched to master and it appears to work fine on a first round.
I'll keep you informed.
Thanks for the hint!
This is good news, thanks a lot for confirming the master branch! Please keep me posted, as we'll refactor the driver and add the bolt+routing protocol in addition to the bolt-only one.
I’ll still look at the issue you reported!
Well, we ran some tests and deployed our application using the version in master. It's been more than 12 hours since then and everything appears to work alright 👍
W⦿‿⦿t!
My team and I found that `bolt_sips` has two apparently wrong behaviors regarding connection pooling. The `timeout` parameter only applies to the connection between the driver and the Neo4j server; it doesn't apply to the client application's requests. You can try it yourself by starting this minimal application. Just change the query/queries you want to execute in `BoltSipsLoad`. After cloning and compiling the deps, try:
This will repeat 5 times, launching 1 process to execute a query and then waiting 500ms before the next iteration. In another terminal you can check the number of open sockets to the Neo4j server (substitute pid with whatever the process id of your beam is):
5 sockets to the bolt port, as expected.
Now something more interesting. Let's run 10 iterations, launching 100 processes in each one.
What?!
This is confirmed in the Observer application. Look at all those `tcp_inet` ports. And each one of them is a connection to the Neo4j bolt port.
Unfortunately, this is causing trouble in a production system that was just handed to me. Increasing the limit of file descriptors the process can open just moves the problem somewhere else, because the Neo4j server can't handle thousands of connections without running into performance problems. In any case, there shouldn't be that many open connections; that's what the pool is intended for, isn't it?
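For reference, the expected behavior can be sketched with a toy checkout/checkin pool. This is a generic illustration of what pooling is "intended for", not `bolt_sips`' or `db_connection`'s actual implementation: at most `size` connections ever exist, and a caller that can't get one within `timeout` seconds fails instead of the driver silently opening a fresh socket.

```python
import queue
import threading
import time

class Pool:
    """Toy connection pool (a sketch, NOT bolt_sips' code).

    A fixed set of `size` connections circulates between callers; a caller
    that cannot check one out within `timeout` seconds gets queue.Empty
    rather than triggering a brand-new connection.
    """

    def __init__(self, size, connect):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())  # pre-open the fixed set of connections

    def run(self, fun, timeout=None):
        conn = self._idle.get(timeout=timeout)  # blocks while pool is exhausted
        try:
            return fun(conn)
        finally:
            self._idle.put(conn)  # check the connection back in; never open more
```

With a pool of 5, even 100 concurrent callers should never push the number of live connections past 5; the behavior reported above instead pushes it toward one socket per caller.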
It would be great to get your confirmation/feedback about this issue.