disaster37 / rancher-backup

MIT License
Rancher crashes at rancher database dump #28

Open snics opened 6 years ago

snics commented 6 years ago

Description of error: Rancher always crashes while a Rancher database dump is being created. The dump does not complete, and Rancher stops during the database migration and will not start again. Rancher only works again after a manual restart of MariaDB and Rancher.

Additional information: MariaDB: Docker v10.3; Rancher Server: latest stable version

amarburg commented 6 years ago

I can corroborate this with bitnami/mariadb:10.1 and rancher/server:v1.6.18

As I read it, this is a timeout connecting to the MySQL database while Rancher is doing some other, unrelated housekeeping (a cluster membership check?).

Here is the stack dump from the rancher image's docker log:

FATAL: Exiting due to failed cluster check-in
2018-07-20 15:45:01,014 ERROR [pool-4-thread-1] [ConsoleStatus] Check-in failed org.jooq.exception.DataAccessException: SQL [update cluster_membership set cluster_membership.heartbeat = ? where cluster_membership.uuid = ?]; (conn:13) Connection timed out
    at org.jooq.impl.Utils.translate(Utils.java:1287)
    at org.jooq.impl.DefaultExecuteContext.sqlException(DefaultExecuteContext.java:495)
    at org.jooq.impl.AbstractQuery.execute(AbstractQuery.java:326)
    at org.jooq.impl.AbstractDelegatingQuery.execute(AbstractDelegatingQuery.java:140)
    at io.cattle.platform.hazelcast.membership.dao.impl.ClusterMembershipDAOImpl.checkin(ClusterMembershipDAOImpl.java:32)
    at io.cattle.platform.hazelcast.membership.DBDiscovery.checkin(DBDiscovery.java:174)
    at io.cattle.platform.hazelcast.membership.DBDiscovery.doRun(DBDiscovery.java:163)
    at org.apache.cloudstack.managed.context.NoExceptionRunnable.runInContext(NoExceptionRunnable.java:15)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:108)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.sql.SQLNonTransientConnectionException: (conn:13) Connection timed out
    at org.mariadb.jdbc.internal.util.ExceptionMapper.get(ExceptionMapper.java:137)
    at org.mariadb.jdbc.internal.util.ExceptionMapper.getException(ExceptionMapper.java:101)
    at org.mariadb.jdbc.internal.util.ExceptionMapper.throwAndLogException(ExceptionMapper.java:77)
    at org.mariadb.jdbc.MariaDbStatement.executeQueryEpilog(MariaDbStatement.java:224)
    at org.mariadb.jdbc.MariaDbServerPreparedStatement.executeInternal(MariaDbServerPreparedStatement.java:411)
    at org.mariadb.jdbc.MariaDbServerPreparedStatement.execute(MariaDbServerPreparedStatement.java:361)
    at org.apache.commons.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:172)
    at org.apache.commons.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:172)
    at org.jooq.tools.jdbc.DefaultPreparedStatement.execute(DefaultPreparedStatement.java:194)
    at org.jooq.impl.AbstractQuery.execute(AbstractQuery.java:376)
    at org.jooq.impl.AbstractStoreQuery.execute(AbstractStoreQuery.java:289)
    at org.jooq.impl.AbstractQuery.execute(AbstractQuery.java:322)
    ... 17 common frames omitted
Caused by: org.mariadb.jdbc.internal.util.dao.QueryException: Connection timed out
    at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.throwErrorWithQuery(AbstractQueryProtocol.java:960)
    at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.executePreparedQuery(AbstractQueryProtocol.java:591)
    at org.mariadb.jdbc.MariaDbServerPreparedStatement.executeInternal(MariaDbServerPreparedStatement.java:399)
    ... 24 common frames omitted
Caused by: org.mariadb.jdbc.internal.util.dao.QueryException: Could not read packet: Read timed out
Query is: update cluster_membership set cluster_membership.heartbeat = ? where cluster_membership.uuid = ?, parameters [1532101380945,'005b8c3c-492d-407f-bdfe-727447983486']
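
For anyone comparing their own logs against the trace above, here is a rough sketch of how to pull the "Caused by:" chain out of a Java stack trace so the innermost cause is easy to spot. This is only an illustration: the helper name `caused_by_chain` is made up for this example, the sample text is an abridged copy of the trace above, and it assumes the trace has one frame per line.

```python
def caused_by_chain(trace):
    """Return (exception class, message) pairs for each 'Caused by:' line, outermost first.

    Hypothetical helper for illustration; assumes one stack frame per line.
    """
    prefix = "Caused by: "
    chain = []
    for line in trace.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue  # skip the log line, "at ..." frames and "... N common frames omitted"
        # Split on the first ": " so messages containing colons stay intact.
        exc_class, _, message = line[len(prefix):].partition(": ")
        chain.append((exc_class, message))
    return chain


# Abridged from the trace above (most frames dropped for brevity).
SAMPLE = """\
Caused by: java.sql.SQLNonTransientConnectionException: (conn:13) Connection timed out
    at org.mariadb.jdbc.internal.util.ExceptionMapper.get(ExceptionMapper.java:137)
    ... 17 common frames omitted
Caused by: org.mariadb.jdbc.internal.util.dao.QueryException: Connection timed out
    at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.throwErrorWithQuery(AbstractQueryProtocol.java:960)
    ... 24 common frames omitted
Caused by: org.mariadb.jdbc.internal.util.dao.QueryException: Could not read packet: Read timed out
"""

if __name__ == "__main__":
    for exc_class, message in caused_by_chain(SAMPLE):
        print(f"{exc_class}: {message}")
```

The last entry in the chain is the innermost cause, which here is the read timeout on the heartbeat update rather than anything in the dump itself.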