tigase / tigase-server

Highly optimized, extremely modular and very flexible XMPP/Jabber server
https://tigase.net
GNU Affero General Public License v3.0

problem with HA on AWS aurora mysql for DNS cache #196

Closed davidemarrone closed 3 months ago

davidemarrone commented 1 year ago

Issue migrated to: https://tigase.dev/tigase/_server/server-core/~issues/1496


Problem with DNS cache

I was testing the AWS Aurora MySQL failover procedure with Tigase. While reading the AWS documentation at https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.MultiAZSingleStandby.html I found the section "Setting the JVM TTL for DNS name lookups", which explains that it is important to set a TTL for the Java DNS cache, otherwise the failover does not work properly.

During my test on a Tigase instance, I found that Tigase does not recover automatically after I trigger a failover manually. The logs show:

[2023-04-21 16:39:01:728] [WARNING ] [         cluster-nodes ] ClConSQLRepository.removeItem()  : Problem removing elements from DB: 
java.sql.SQLException: The MySQL server is running with the --read-only option so it cannot execute this statement
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:965)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3978)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3914)
    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2530)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2683)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2495)
    at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1903)
    at com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2124)
    at com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2058)
    at com.mysql.jdbc.PreparedStatement.executeLargeUpdate(PreparedStatement.java:5158)
    at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2043)
    at jdk.internal.reflect.GeneratedMethodAccessor59.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at tigase.db.jdbc.PreparedStatementInvocationHandler.invoke(PreparedStatementInvocationHandler.java:38)
    at com.sun.proxy.$Proxy35.executeUpdate(Unknown Source)
    at tigase.cluster.repo.ClConSQLRepository.removeItem(ClConSQLRepository.java:137)
    at tigase.cluster.repo.ClConConfigRepository.itemLoaded(ClConConfigRepository.java:178)
    at tigase.cluster.repo.ClConSQLRepository.reload(ClConSQLRepository.java:221)
    at tigase.db.comp.ConfigRepository$1.run(ConfigRepository.java:81)
    at java.base/java.util.TimerThread.mainLoop(Timer.java:556)
    at java.base/java.util.TimerThread.run(Timer.java:506)

This means that it is still using the old endpoint from DNS and is not switching to the new one. I left it running for over 10 minutes and it kept printing the same message.

Restarting the server resolves the problem but I need an automatic recovery.
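For illustration only, here is a minimal sketch of how code holding a pooled connection could recognise the error shown in the log above and treat that connection as stale. This is not how Tigase handles it; the class and method names are hypothetical. The only facts relied on are that the message in the log corresponds to MySQL error code 1290 (ER_OPTION_PREVENTS_STATEMENT) and that `java.sql.SQLException` exposes the vendor code through `getErrorCode()`. What to do once the error is detected (for example, closing the connection so that the next one is opened against a freshly resolved endpoint) is left as an assumption.

```java
import java.sql.SQLException;

public class ReadOnlyFailoverCheck {

    // MySQL vendor error code for "The MySQL server is running with the
    // ... option so it cannot execute this statement".
    static final int ER_OPTION_PREVENTS_STATEMENT = 1290;

    // Hypothetical helper: returns true when the exception (or one of its
    // causes) looks like a write that was sent to a read-only server.
    static boolean looksLikeReadOnlyServer(SQLException e) {
        for (Throwable t = e; t != null; t = t.getCause()) {
            if (t instanceof SQLException) {
                SQLException se = (SQLException) t;
                String msg = se.getMessage();
                if (se.getErrorCode() == ER_OPTION_PREVENTS_STATEMENT
                        || (msg != null && msg.contains("--read-only"))) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        SQLException fromLog = new SQLException(
                "The MySQL server is running with the --read-only option "
                        + "so it cannot execute this statement",
                "HY000", ER_OPTION_PREVENTS_STATEMENT);
        // A caller could use this result to discard the connection
        // instead of retrying on it.
        System.out.println(looksLikeReadOnlyServer(fromLog));
    }
}
```

The check walks the cause chain because drivers and proxies may wrap the original exception.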

I have changed the value of networkaddress.cache.ttl in /etc/java-11-openjdk/security/java.security on the system, as suggested in the AWS doc, but I get the same results. Is there any other way to set this parameter? Do you know why Tigase does not take the system configuration into account?
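As a rough way to check whether the java.security edit is actually picked up, and as one alternative way to set the value, the standard `java.security.Security` API can read and override `networkaddress.cache.ttl` from code. The sketch below is a standalone program, not something Tigase does. The class name, the helper name and the 60-second value are assumptions for illustration. Note that the JVM reads this property once, when its address cache policy is initialised, so an override like this only takes effect if it runs before the first host name lookup.

```java
import java.security.Security;

public class DnsCacheTtl {

    // Hypothetical helper: sets the JVM-wide TTL (in seconds) for successful
    // DNS lookups and returns the value the JVM now reports.
    static String setTtlSeconds(int seconds) {
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(seconds));
        return Security.getProperty("networkaddress.cache.ttl");
    }

    public static void main(String[] args) {
        // Value coming from the java.security file; null when the property
        // is not set (or is commented out) in that file.
        System.out.println("from java.security: "
                + Security.getProperty("networkaddress.cache.ttl"));

        // Programmatic override; 60 is an assumed example value.
        System.out.println("after override: " + setTtlSeconds(60));
    }
}
```

If the first line prints null when run with the same JVM that runs the server, the edited java.security file is probably not the one that JVM is reading.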

System info:

davidemarrone commented 1 year ago

Just FYI, I have discovered that the AWS aws-mysql-jdbc driver is almost deprecated:

https://github.com/awslabs/aws-mysql-jdbc/blob/main/RELEASE_POLICY.md

New features will not be added to the aws-jdbc-driver going forward. Future development will be taking place in the aws-advanced-wrapper driver. The aws-mysql-jdbc project follows the semantic versioning specification for assigning version numbers to releases. We recommend to adopt the new wrapper driver before the maintenance window of current version ends on July 25, 2024.

I also found out that it doesn't support pools with long-lived connections anyway. The new AWS JDBC driver will support pools with long-lived connections, but using it with Tigase would require a way to specify the AWS JDBC driver protocol prefix / DriverManager, which is currently hardcoded in Tigase.

davidemarrone commented 3 months ago

@woj-tek has this problem been fixed in Tigase Server 8.4?

woj-tek commented 3 months ago

Hi,

Issue migrated to: https://tigase.dev/tigase/_server/server-core/~issues/1496

Let's continue there