Closed jdanilson closed 5 months ago
This is exactly what should happen. The error log should show records of all the actions performed, including killing all connections to the failed node.
to clean all connections from the failing node when it returns to service
I am not sure why you ask to clean the connections when the node returns to service. The connections are killed as soon as the server is detected as faulty.
Rene, Thanks for your comments.
We are enthusiastic users of ProxySQL, with over 150 hosts running the software. For the most part it does exactly what we want. Perhaps due to our implementation method, we face a continuing problem.
We started using ProxySQL over two years ago to front our Percona Galera cluster nodes. The first two attempts at deployment failed, as we could never get the applications (Java or WildFly based) to run without errors. Our dev teams and Percona were unable to track down the reasons for the errors, and we gave up. After some further trials we developed a deployment that works 99% of the time. Our deployment for our PXC Galera clusters consists of a BIG-IP VIP for reads and a separate VIP for writes. The application only connects using the VIPs.
This configuration allowed us to deploy ProxySQL with zero changes to the application. The only change made after deploying ProxySQL on each DB node was to point the write VIP at port 3307 rather than port 3306; the PXC server continues to listen on 3306. It works and has worked well. No one we've talked to likes the co-location, but we are forced to do this because we cannot get other hosts to dedicate to ProxySQL, nor can we locate ProxySQL on the application hosts. (These constraints are not technical; they are bureaucratic and due to the nature of how the government works. I cannot change this.)
Our problem occurs when our OS teams patch the hosts, which they do every 90 days in the middle of the night. (This interval is about to decrease to 30 days.) Here is a progression of our issue:
pre-patch state:
host 1 + proxysql1 + db1 <— writes are here; reads too
host 2 + proxysql2 + db2 <— reads are here
host 3 + proxysql3 + db3 <— reads are here
patching commences
Host 1 is shut down for patching. This works well, and our failover has been consistent.
host 1 + proxysql1 + db1 <— down; no traffic
host 2 + proxysql2 + db2 <— writes are here; reads too
host 3 + proxysql3 + db3 <— reads are here
Host 1 restarts. Both ProxySQL and PXC are enabled for automatic restart.
host 1 + proxysql1 + db1 <— identifies itself as primary writer; new connections begin
host 2 + proxysql2 + db2 <— existing write connections remain
host 3 + proxysql3 + db3 <— reads are here
At this point write connections exist on both db1 and db2.
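The difference between the observed behavior and the requested one can be sketched as two writer-election policies. This is a hypothetical illustration, not ProxySQL's actual election code: the node names and weights are invented, and the real mechanism involves hostgroups and monitor state.

```python
# Hypothetical sketch of the two writer-election policies discussed in this
# thread; names and structure are illustrative, not ProxySQL internals.

def elect_writer_by_weight(healthy_nodes, weights, current_writer=None):
    """Observed behavior: the highest-weight healthy node always wins,
    so a restarted host 1 reclaims the writer role from host 2."""
    return max(healthy_nodes, key=lambda n: weights[n])

def elect_writer_sticky(healthy_nodes, weights, current_writer=None):
    """Requested behavior: keep the current writer as long as it is healthy;
    only re-elect (by weight) when the writer is actually down."""
    if current_writer in healthy_nodes:
        return current_writer
    return max(healthy_nodes, key=lambda n: weights[n])

weights = {"db1": 1000, "db2": 900, "db3": 800}

# Host 1 goes down: both policies fail over to db2.
assert elect_writer_by_weight({"db2", "db3"}, weights, "db1") == "db2"
assert elect_writer_sticky({"db2", "db3"}, weights, "db1") == "db2"

# Host 1 returns: weight-based election moves writes back to db1, which
# briefly leaves live write connections on both db1 and db2.
assert elect_writer_by_weight({"db1", "db2", "db3"}, weights, "db2") == "db1"
# Sticky election would leave the writer on db2.
assert elect_writer_sticky({"db1", "db2", "db3"}, weights, "db2") == "db2"
```

The window in which both policies diverge is exactly the window described above, where writes can land on two Galera nodes at once.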
We have found the way out of this is to
Neither solution is optimal.
Hence our request: if you fail over, stick there! Or alternatively, since you know connections are now on the primary writer db1, kill the ones on db2.
Perhaps we are doing something wrong and you will show us the error of our ways. Percona has written a new monitoring script, which they have suggested and we are investigating.
Hi @jdanilson,
thanks for sharing your scenario here too. As I mentioned in my comment on issue #3670, the reason for the current behavior, and why 'stickiness' after a failover isn't a configuration option, is documented as part of the FAQ. The FAQ also now states that if custom monitoring like this is required, users can always resort to the scheduler to implement their own monitoring solution.
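To make the scheduler route concrete, the cleanup decision itself can be kept as a small pure function that a scheduler-invoked script drives. The sketch below is a hypothetical helper: the row shape loosely follows ProxySQL's `stats_mysql_processlist` (a `SessionID` and a `srv_host` column), but fetching the rows from the admin interface and actually terminating sessions are deliberately left out.

```python
# Hypothetical helper for a scheduler-driven cleanup script: given processlist
# rows (dicts with "SessionID" and "srv_host", roughly the shape of ProxySQL's
# stats_mysql_processlist filtered to the writer hostgroup) and the host that
# should be the only writer, return the sessions still attached to any other
# backend.
def sessions_to_kill(processlist_rows, writer_host):
    return [row["SessionID"]
            for row in processlist_rows
            if row["srv_host"] != writer_host]

rows = [
    {"SessionID": 101, "srv_host": "db1"},  # stale: opened before failover
    {"SessionID": 102, "srv_host": "db2"},  # attached to the current writer
    {"SessionID": 103, "srv_host": "db1"},  # stale
]
assert sessions_to_kill(rows, "db2") == [101, 103]
```

A real script run from the scheduler would fetch these rows over the admin interface and kill the stale sessions there; isolating the decision in a pure function just keeps that logic testable.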
It's also mentioned as part of the Galera Cluster monitoring documentation:
Since this is documented, and there are reasons for this option not to exist, I'm proceeding to close the issue.
Thanks again for sharing your scenario, and for being part of the community. Regards, Javier.
When ProxySQL detects an issue on a Percona Galera node and fails over to another node, is it possible to clean all connections from the failing node when it returns to service? We want to prevent any possibility of writes on more than one Galera node.