sematext / sematext-agent-java

Sematext Monitoring Agent
https://sematext.com/spm
Apache License 2.0

Agent should continue trying to reconnect to DB after connection failure #66

Closed by bsmid 4 years ago

bsmid commented 4 years ago

If the DB connection fails and recreating it also fails, the agent gets into a state where it keeps trying to use the same old (broken) connection. Instead, it should keep trying to reconnect, with longer retry intervals over a longer period of time, and avoid using the old broken connection.

2019-11-20 06:14:30,946 ERROR [metrics-xxx:default:-1-thread] com.sematext.spm.client.db.DbDataSourceBase - Error while fetching data with com.sematext.spm.client.db.DbDataSourceBase for query SHOW /*!50002 GLOBAL */ STATUS;. Message: Communications link failure

The last packet successfully received from the server was 10,019 milliseconds ago. The last packet sent successfully to the server was 10,022 milliseconds ago.
2019-11-20 06:14:30,947 INFO [metrics-xxx:default:-1-thread] com.sematext.spm.client.db.DbDataSourceBase - Executing 'SHOW /*!50002 GLOBAL */ VARIABLES;' on db: 'jdbc:mysql://localhost:3306'
2019-11-20 06:14:34,950 ERROR [metrics-xxx:default:-1-thread] com.sematext.spm.client.db.DbDataSourceBase - Error while fetching data with com.sematext.spm.client.db.DbDataSourceBase for query SHOW /*!50002 GLOBAL */ VARIABLES;. Message: Could not create connection to database server. Attempted reconnect 3 times. Giving up.
2019-11-20 06:14:34,950 INFO [metrics-xxx:default:-1-thread] com.sematext.spm.client.StatsMetricsLogLineSender - Collectors collecting time: 4052, total time: 4052
2019-11-20 06:14:34,950 INFO [metrics-xxx:default:-1-thread] com.sematext.spm.client.StatsMetricsLogLineSender - Collectors count: 2, total collectors count: 23
2019-11-20 06:14:40,898 INFO [metrics-xxx:default:-1-thread] com.sematext.spm.client.MonitorConfig - GetCollectors() reload needed: false
2019-11-20 06:14:40,898 INFO [metrics-xxx:default:-1-thread] com.sematext.spm.client.db.DbDataSourceBase - Executing 'SHOW /*!50002 GLOBAL */ STATUS;' on db: 'jdbc:mysql://localhost:3306'
2019-11-20 06:14:40,898 ERROR [metrics-xxx:default:-1-thread] com.sematext.spm.client.db.DbDataSourceBase - Error while fetching data with com.sematext.spm.client.db.DbDataSourceBase for query SHOW /*!50002 GLOBAL */ STATUS;. Message: No operations allowed after connection closed.
2019-11-20 06:14:40,898 INFO [metrics-xxx:default:-1-thread] com.sematext.spm.client.db.DbDataSourceBase - Executing 'SHOW /*!50002 GLOBAL */ VARIABLES;' on db: 'jdbc:mysql://localhost:3306'
2019-11-20 06:14:40,898 ERROR [metrics-xxx:default:-1-thread] com.sematext.spm.client.db.DbDataSourceBase - Error while fetching data with com.sematext.spm.client.db.DbDataSourceBase for query SHOW /*!50002 GLOBAL */ VARIABLES;. Message: No operations allowed after connection closed.
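
To illustrate the "longer retry intervals" requested above, here is a minimal sketch of a capped exponential backoff. It is not taken from the agent's code: the class name `ReconnectBackoff`, its methods, and the 1 second / 60 second values in the usage note are all assumptions made for the example.

```java
import java.time.Duration;

/**
 * Hypothetical helper (not part of the agent): tracks how long to wait
 * before the next reconnect attempt, doubling the delay after every
 * consecutive failure up to a fixed cap.
 */
public final class ReconnectBackoff {
    private final long initialMillis;
    private final long maxMillis;
    private long nextDelayMillis;

    public ReconnectBackoff(Duration initial, Duration max) {
        long initialMs = initial.toMillis();
        long maxMs = max.toMillis();
        if (initialMs <= 0 || maxMs < initialMs) {
            throw new IllegalArgumentException("need 0 < initial <= max (in milliseconds)");
        }
        this.initialMillis = initialMs;
        this.maxMillis = maxMs;
        this.nextDelayMillis = initialMs;
    }

    /** Records a failed reconnect and returns the delay to wait before the next attempt. */
    public synchronized long onFailure() {
        long delay = nextDelayMillis;
        // Double the delay for the next failure; compare against max/2 first
        // so the multiplication can never overflow or exceed the cap.
        nextDelayMillis = (nextDelayMillis > maxMillis / 2) ? maxMillis : nextDelayMillis * 2;
        return delay;
    }

    /** Resets the delay once a connection has been established again. */
    public synchronized void onSuccess() {
        nextDelayMillis = initialMillis;
    }
}
```

With `new ReconnectBackoff(Duration.ofSeconds(1), Duration.ofSeconds(60))`, consecutive failures yield delays of 1 s, 2 s, 4 s and so on until they level off at 60 s. A collector could skip its DB queries until the returned delay has elapsed, rather than reusing the broken connection on every cycle.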

bsmid commented 4 years ago

The best we can do is call connection.isValid() after an error occurs. If it reports that the connection is no longer valid, we'll create a fresh connection.
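
The approach described above might look roughly like the sketch below. It is not the agent's actual fix: `ValidatingConnectionHolder`, `ConnectionFactory`, and `discardIfBroken` are hypothetical names, and the factory stands in for however the agent really opens its JDBC connection. The only JDBC calls used are the standard `Connection.isValid(int timeoutSeconds)` and `Connection.close()`.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical holder that replaces its JDBC connection once it is found to be invalid. */
public final class ValidatingConnectionHolder {

    /** Assumed abstraction; in practice it might wrap DriverManager.getConnection(...). */
    @FunctionalInterface
    public interface ConnectionFactory {
        Connection create() throws SQLException;
    }

    private final ConnectionFactory factory;
    private final int validationTimeoutSeconds;
    private Connection connection;

    public ValidatingConnectionHolder(ConnectionFactory factory, int validationTimeoutSeconds) {
        this.factory = factory;
        this.validationTimeoutSeconds = validationTimeoutSeconds;
    }

    /** Returns the current connection, creating a fresh one if none is held. */
    public synchronized Connection get() throws SQLException {
        if (connection == null) {
            connection = factory.create();
        }
        return connection;
    }

    /**
     * Intended to be called after a query fails. Drops the held connection
     * if isValid() reports it as broken, so the next get() creates a new one.
     * Returns true if the connection was discarded.
     */
    public synchronized boolean discardIfBroken() {
        if (connection == null) {
            return false;
        }
        boolean valid;
        try {
            valid = connection.isValid(validationTimeoutSeconds);
        } catch (SQLException e) {
            // isValid() throws only for a negative timeout; treat that as broken.
            valid = false;
        }
        if (valid) {
            return false;
        }
        try {
            connection.close();
        } catch (SQLException ignored) {
            // The connection is already unusable; nothing more to do.
        }
        connection = null;
        return true;
    }
}
```

In this sketch, a data source would call `discardIfBroken()` in its error handler and `get()` before each query, so a closed connection is never reused. If `factory.create()` itself fails, the holder stays empty and the next `get()` simply tries again.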