cloudamqp / amqproxy

An intelligent AMQP proxy, with connection and channel pooling/reusing
https://www.cloudamqp.com
MIT License

AMQProxy keeps opening new connections #24

Closed Radiergummi closed 5 years ago

Radiergummi commented 5 years ago

For the past two weeks, we have been experiencing a problem where the connections opened by our amqproxy instance are never closed. More and more connections accumulate until our CloudAMQP instance runs out of memory, effectively shutting our application down:

[Screenshot: CloudAMQP connection count climbing over time]

The sharp drops are us deploying a new amqproxy version to Kubernetes, which kills all connections during deployment (we're now on 0.4.0).

At this point, I'm not sure what exactly is happening. I suspect it's an issue with one of our services, but I've gone through every single commit in the relevant time frame and couldn't find anything related to the lower-level queue components. Our setup looks like the following:

CloudAMQP <----> Google Kubernetes
                        |
                     amqproxy
                        |
            +-----+-----+-----+-----+
          srv1   srv2  srv3  srv4  srvN

The services all use the PHP AMQP Library with a super lightweight abstraction on top. After we've noticed the problem, we even implemented a shutdown handler to explicitly close the connection under all circumstances, but to no avail.
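For reference, the shutdown handler we added follows roughly this pattern (a minimal Python sketch with a stand-in `Connection` class; our real code uses php-amqplib, so names here are illustrative only):

```python
import atexit

class Connection:
    """Stand-in for a real AMQP connection (e.g. one from php-amqplib);
    only the close() semantics matter for this sketch."""
    def __init__(self):
        self.is_open = True

    def close(self):
        # A clean AMQP close, as opposed to simply dropping the TCP socket.
        self.is_open = False

_open_connections = []

def connect():
    conn = Connection()
    _open_connections.append(conn)
    return conn

def close_all_connections():
    # Shutdown handler: explicitly close every connection that is still
    # open, so the proxy is never left holding a half-dead client socket.
    for conn in _open_connections:
        if conn.is_open:
            conn.close()

# Run the handler on interpreter exit, whatever the exit path.
atexit.register(close_all_connections)
```

The idea is the same in PHP with `register_shutdown_function`: close under all circumstances, not just on the happy path.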

The connections all look like the following in the RabbitMQ manager:

| Virtual host | Name | Node | State | SSL/TLS | Protocol | Channels | Channel max | Client | From client | To client | Heartbeat | Connected at |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| XXX | 35.xxx.xxx.xxx:2113 | rabbit@XXX-01 | running | | AMQP 0-9-1 | 0 | 0 | AMQProxy 0.4.0 | 0 iB/s | 0 iB/s | | 2019-09-03 22:06:37 |
| XXX | 35.xxx.xxx.xxx:32902 | rabbit@XXX-01 | running | | AMQP 0-9-1 | 0 | 0 | AMQProxy 0.4.0 | 0 iB/s | 0 iB/s | | 2019-09-04 08:37:02 |
| XXX | 35.xxx.xxx.xxx:33072 | rabbit@XXX-01 | running | | AMQP 0-9-1 | 0 | 0 | AMQProxy 0.4.0 | 0 iB/s | 0 iB/s | | 2019-09-04 02:28:20 |
| XXX | 35.xxx.xxx.xxx:33196 | rabbit@XXX-02 | running | | AMQP 0-9-1 | 0 | 0 | AMQProxy 0.4.0 | 0 iB/s | 0 iB/s | | 2019-09-04 00:38:14 |
| XXX | 35.xxx.xxx.xxx:33248 | rabbit@XXX-01 | running | | AMQP 0-9-1 | 0 | 0 | AMQProxy 0.4.0 | 0 iB/s | 0 iB/s | | 2019-09-04 03:26:46 |
| XXX | 35.xxx.xxx.xxx:33528 | rabbit@XXX-01 | running | | AMQP 0-9-1 | 0 | 0 | AMQProxy 0.4.0 | 0 iB/s | 0 iB/s | | 2019-09-03 17:20:15 |

The AMQProxy logs contain lots and lots of entries like the following:

Error reading from upstream: #<Errno:Error reading socket: Bad file descriptor>

Is there anything we can do to prevent this excessive connection spawning?
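For context on that log line: "Bad file descriptor" (EBADF) is what a read returns when the socket's file descriptor has already been closed out from under it. A minimal Python reproduction of the same OS-level condition (AMQProxy itself is written in Crystal, so this is only an illustration):

```python
import errno
import os
import socket

# Create a connected socket pair, then close one end so that its file
# descriptor becomes invalid while we still hold the raw fd number.
a, b = socket.socketpair()
fd = a.fileno()
a.close()

try:
    os.read(fd, 1)  # reading from the stale descriptor
except OSError as e:
    # errno 9: EBADF, "Bad file descriptor" -- the same condition the
    # proxy reports when a pooled upstream socket has gone away.
    assert e.errno == errno.EBADF

b.close()
```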

carlhoerberg commented 5 years ago

can replicate, working on a fix

carlhoerberg commented 5 years ago

please try 0.4.1

Radiergummi commented 5 years ago

Hey Carl, thank you for the response. Sadly, this does not appear to fix the problem. We still see the bad file descriptor messages in the GCP logs, and the connection count seems to be rising again.
Please let me know if I can help you debug this issue.

carlhoerberg commented 5 years ago

ok! new 0.4.2 is out

Radiergummi commented 5 years ago

[Screenshot: CloudAMQP connection count, flat after the update]

As the graph so nicely demonstrates, this has solved the problem! Awesome, thank you.
The only thing I notice is that there are still lots of log messages like these:

2019-09-11 16:17:37.370 CEST Error reading from upstream: #<Errno:Error reading socket: Bad file descriptor>
2019-09-11 16:17:42.434 CEST Connected to upstream x.x.x.x:5672
2019-09-11 16:18:09.296 CEST Proxy upstream: x.rmq.cloudamqp.com:5672 
2019-09-11 16:18:09.296 CEST Proxy listening on 0.0.0.0:5673

carlhoerberg commented 5 years ago

The log lines you're showing seem to suggest that the whole server crashed and then restarted. A new 0.4.3 is out that handles connection disruptions in a different way. Please open a new issue if it still reports abruptly closed connections.
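For anyone else following along: a common way to handle connection disruptions on the client side is a reconnect loop with exponential backoff. A generic sketch (this is NOT AMQProxy's actual 0.4.3 logic, just the general pattern):

```python
import time

def connect_with_backoff(open_connection, attempts=5, base_delay=0.5):
    """Retry a flaky connect callable with exponential backoff.

    `open_connection` is any zero-argument callable that either returns a
    live connection or raises OSError (e.g. on a dropped upstream socket).
    """
    delay = base_delay
    for attempt in range(attempts):
        try:
            return open_connection()
        except OSError:
            if attempt == attempts - 1:
                raise  # give up after the final attempt
            time.sleep(delay)
            delay *= 2  # back off: 0.5s, 1s, 2s, ...
```

With `attempts=5` and `base_delay=0.5`, a persistent outage is retried for roughly 7.5 seconds total before the error is re-raised to the caller.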