zxs / tungsten-replicator

Automatically exported from code.google.com/p/tungsten-replicator
0 stars 0 forks source link

Trepctl status fails when DBMS is stopped or login/password is invalid #1043

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

1. Set up master/slave replication using any version of MySQL. 
2. Stop MySQL on the master using e.g., /etc/init.d/mysql stop
3. Run 'trepctl status'

What is the expected output?

Trepctl status should provide output showing that the replicator is offline. 

What do you see instead?

$ trepctl status
Processing status command...
Operation failed: Status operation failed: Unable to connect to DBMS: 
url=jdbc:mysql:thin://logos1:3306/tungsten_repltest user=tungsten

What is the possible cause?

Due to data source changes this operation now throws a ReplicatorException 
instead of the previous ExtractorException.  The code that processes the status 
call does not catch it:

                try
                {
                    currentEventId = headExtractor.getCurrentResourceEventId();
                    if (currentEventId != null)
                        statusProps.setString(Replicator.CURRENT_EVENT_ID,
                                currentEventId);
                }
                catch (ReplicatorException e)
                {
                    statusProps.setString(Replicator.CURRENT_EVENT_ID, "ERROR");
                    if (logger.isDebugEnabled())
                        logger.debug("Unable to get current resource event ID",
                                e);
                }

What is the proposed solution?

Catch ReplicatorException instead of ExtractorException. 

Additional information

This can cause replicators to appear to be stopped for monitoring programs, so 
it is disconcerting. 

Use labels and text to provide additional information.

Original issue reported on code.google.com by robert.h...@continuent.com on 23 Oct 2014 at 3:36

GoogleCodeExporter commented 9 years ago
Note!  This problem does not occur if the replicator is offline.  However it 
will occur under the following circumstances if the replicator is currently 
ONLINE and: 

1.) The DBMS stops
2.) The tungsten login/password change so that new connections no longer work
3.) The DBMS stops accepting new connections for example because it exceeds the 
connection limit

#3 was a serious problem that we correct many years ago.  It led to a nasty 
problem in deployed systems. 

Original comment by robert.h...@continuent.com on 23 Oct 2014 at 3:48

GoogleCodeExporter commented 9 years ago
This issue was updated by revision r2647.

Catch ReplicatorException (instead of ExtractorException) so that 'trepctl 
status' does not fail when server is not available. 

Original comment by robert.h...@continuent.com on 25 Oct 2014 at 8:22

GoogleCodeExporter commented 9 years ago
Corrected as described in initial report and ready for QA.  You can test this 
using any of the cases listed above. 

Original comment by robert.h...@continuent.com on 25 Oct 2014 at 8:23

GoogleCodeExporter commented 9 years ago
When the DBMS is down, we get this status:

/trepctl status
Processing status command...
NAME                     VALUE
----                     -----
appliedLastEventId     : mysql-bin.000004:0000000000005055;88
appliedLastSeqno       : 23
appliedLatency         : 0.66
autoRecoveryEnabled    : false
autoRecoveryTotal      : 0
channels               : 1
clusterName            : cookbook
currentEventId         : ERROR
currentTimeMillis      : 1414384149650
dataServerHost         : ubuntu1
extensions             :
host                   : ubuntu1
latestEpochNumber      : 2
masterConnectUri       : thl://localhost:/
masterListenUri        : thl://ubuntu1:2112/
maximumStoredSeqNo     : 23
minimumStoredSeqNo     : 0
offlineRequests        : NONE
pendingError           : NONE
pendingErrorCode       : NONE
pendingErrorEventId    : NONE
pendingErrorSeqno      : -1
pendingExceptionMessage: NONE
pipelineSource         : /var/lib/mysql
relativeLatency        : 47.65
resourcePrecedence     : 99
rmiPort                : 10000
role                   : master
seqnoType              : java.lang.Long
serviceName            : cookbook
serviceType            : local
simpleServiceName      : cookbook
siteName               : default
sourceId               : ubuntu1
state                  : ONLINE
timeInStateSeconds     : 109.644
transitioningTo        :
uptimeSeconds          : 692.168
useSSLConnection       : false
version                : Tungsten Replicator 3.0.0 build 491
Finished status command...

(Notice the "ERROR" instead of the event ID)

However, as soon as I  commit anything to the database, the status changes to 
this:

trepctl status
Processing status command...
NAME                     VALUE
----                     -----
appliedLastEventId     : NONE
appliedLastSeqno       : -1
appliedLatency         : -1.0
autoRecoveryEnabled    : false
autoRecoveryTotal      : 0
channels               : -1
clusterName            : cookbook
currentEventId         : NONE
currentTimeMillis      : 1414384267205
dataServerHost         : ubuntu1
extensions             :
host                   : ubuntu1
latestEpochNumber      : -1
masterConnectUri       : thl://localhost:/
masterListenUri        : thl://ubuntu1:2112/
maximumStoredSeqNo     : -1
minimumStoredSeqNo     : -1
offlineRequests        : NONE
pendingError           : Stage task failed: stage=q-to-thl seqno=24 fragno=0
pendingErrorCode       : NONE
pendingErrorEventId    : mysql-bin.000005:0000000000000274;40
pendingErrorSeqno      : 24
pendingExceptionMessage: Unable to update last commit seqno: Could not read 
resultset: Incomplete read! Expected 4, got -1
pipelineSource         : UNKNOWN
relativeLatency        : -1.0
resourcePrecedence     : 99
rmiPort                : 10000
role                   : master
seqnoType              : java.lang.Long
serviceName            : cookbook
serviceType            : unknown
simpleServiceName      : cookbook
siteName               : default
sourceId               : ubuntu1
state                  : OFFLINE:ERROR
timeInStateSeconds     : 2.586
transitioningTo        :
uptimeSeconds          : 809.723
useSSLConnection       : false
version                : Tungsten Replicator 3.0.0 build 491
Finished status command...

mysql -e 'select * from tungsten_cookbook.trep_commit_seqno\G'
*************************** 1. row ***************************
          task_id: 0
            seqno: 23
           fragno: 0
        last_frag: 1
        source_id: ubuntu1
     epoch_number: 2
          eventid: mysql-bin.000004:0000000000005055;88
  applied_latency: 0
 update_timestamp: 2014-10-27 04:28:22
         shard_id: test
extract_timestamp: 2014-10-27 04:28:22

Here's an excerpt from the replicator log:

INFO   | jvm 1    | 2014/10/27 04:27:20 | 2014-10-27 04:27:20,457 [cookbook - 
ConnectorHandler: initializing] INFO  thl.ConnectorHandler Log epoch numbers 
checked and match: client source ID=ubuntu3 seqno=1 epoch number=0
INFO   | jvm 1    | 2014/10/27 04:27:20 | 2014-10-27 04:27:20,466 [cookbook - 
connector-handler-ubuntu3] INFO  thl.ConnectorHandler Checking first seqno 
returned by THL for consistency: client expected seqno=2 server returned seqno=2
INFO   | jvm 1    | 2014/10/27 04:29:00 | 2014-10-27 04:29:00,814 [cookbook - 
binlog-to-q-0] INFO  extractor.mysql.MySQLExtractor Skipping over server stop 
event in log: mysql-bin.000004:0000000000005074;-1
INFO   | jvm 1    | 2014/10/27 04:31:00 | 2014-10-27 04:31:00,822 [cookbook - 
binlog-to-q-0] WARN  extractor.mysql.MySQLExtractor Current log file appears to 
be missing log-rotate event: mysql-bin.000004
INFO   | jvm 1    | 2014/10/27 04:31:00 | 2014-10-27 04:31:00,823 [cookbook - 
binlog-to-q-0] INFO  extractor.mysql.MySQLExtractor Auto-generating log-rotate 
event for next binlog file: mysql-bin.000005
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,607 [cookbook - 
q-to-thl-0] ERROR pipeline.SingleThreadStageTask Stage task failed: 
stage=q-to-thl seqno=24 fragno=0
INFO   | jvm 1    | 2014/10/27 04:31:04 | 
com.continuent.tungsten.replicator.ReplicatorException: Unable to update last 
commit seqno: Could not read resultset: Incomplete read! Expected 4, got -1
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.datasource.SqlCommitSeqnoAccessor.updateLastC
ommitSeqno(SqlCommitSeqnoAccessor.java:204)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.thl.THL.updateCommitSeqno(THL.java:608)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.thl.THLStoreApplier.apply(THLStoreApplier.jav
a:163)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.apply(SingleTh
readStageTask.java:797)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.runTask(Single
ThreadStageTask.java:530)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.run(SingleThre
adStageTask.java:179)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
java.lang.Thread.run(Thread.java:724)
INFO   | jvm 1    | 2014/10/27 04:31:04 | Caused by: 
java.sql.SQLNonTransientConnectionException: Could not read resultset: 
Incomplete read! Expected 4, got -1
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.internal.SQLExceptionMapper.get(SQLExceptionMapper.java:97)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.DrizzlePreparedStatement.executeUpdate(DrizzlePreparedStatement
.java:128)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.datasource.SqlCommitSeqnoAccessor.updateLastC
ommitSeqno(SqlCommitSeqnoAccessor.java:182)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   ... 6 more
INFO   | jvm 1    | 2014/10/27 04:31:04 | Caused by: 
org.drizzle.jdbc.internal.common.QueryException: Could not read resultset: 
Incomplete read! Expected 4, got -1
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.internal.mysql.MySQLProtocol.executeQuery(MySQLProtocol.java:45
3)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.DrizzlePreparedStatement.executeUpdate(DrizzlePreparedStatement
.java:125)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   ... 7 more
INFO   | jvm 1    | 2014/10/27 04:31:04 | Caused by: java.io.IOException: 
Incomplete read! Expected 4, got -1
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.internal.common.packet.RawPacket.readLengthSeq(RawPacket.java:9
7)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.internal.common.packet.RawPacket.nextPacket(RawPacket.java:51)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.internal.common.packet.SyncPacketFetcher.getRawPacket(SyncPacke
tFetcher.java:42)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.internal.mysql.MySQLProtocol.executeQuery(MySQLProtocol.java:45
0)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   ... 8 more
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,608 [cookbook - 
q-to-thl-0] INFO  pipeline.SingleThreadStageTask Performing emergency rollback 
of applied changes
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,608 [cookbook - 
q-to-thl-0] INFO  pipeline.SingleThreadStageTask Dispatching error event: Stage 
task failed: stage=q-to-thl seqno=24 fragno=0
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,608 [cookbook - 
pool-2-thread-1] ERROR management.OpenReplicatorManager Received error 
notification, shutting down services :
INFO   | jvm 1    | 2014/10/27 04:31:04 | Stage task failed: stage=q-to-thl 
seqno=24 fragno=0
INFO   | jvm 1    | 2014/10/27 04:31:04 | 
com.continuent.tungsten.replicator.ReplicatorException: Unable to update last 
commit seqno: Could not read resultset: Incomplete read! Expected 4, got -1
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.datasource.SqlCommitSeqnoAccessor.updateLastC
ommitSeqno(SqlCommitSeqnoAccessor.java:204)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.thl.THL.updateCommitSeqno(THL.java:608)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.thl.THLStoreApplier.apply(THLStoreApplier.jav
a:163)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.apply(SingleTh
readStageTask.java:797)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.runTask(Single
ThreadStageTask.java:530)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.run(SingleThre
adStageTask.java:179)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
java.lang.Thread.run(Thread.java:724)
INFO   | jvm 1    | 2014/10/27 04:31:04 | Caused by: 
java.sql.SQLNonTransientConnectionException: Could not read resultset: 
Incomplete read! Expected 4, got -1
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.internal.SQLExceptionMapper.get(SQLExceptionMapper.java:97)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.DrizzlePreparedStatement.executeUpdate(DrizzlePreparedStatement
.java:128)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.datasource.SqlCommitSeqnoAccessor.updateLastC
ommitSeqno(SqlCommitSeqnoAccessor.java:182)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   ... 6 more
INFO   | jvm 1    | 2014/10/27 04:31:04 | Caused by: 
org.drizzle.jdbc.internal.common.QueryException: Could not read resultset: 
Incomplete read! Expected 4, got -1
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.internal.mysql.MySQLProtocol.executeQuery(MySQLProtocol.java:45
3)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.DrizzlePreparedStatement.executeUpdate(DrizzlePreparedStatement
.java:125)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   ... 7 more
INFO   | jvm 1    | 2014/10/27 04:31:04 | Caused by: java.io.IOException: 
Incomplete read! Expected 4, got -1
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.internal.common.packet.RawPacket.readLengthSeq(RawPacket.java:9
7)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.internal.common.packet.RawPacket.nextPacket(RawPacket.java:51)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.internal.common.packet.SyncPacketFetcher.getRawPacket(SyncPacke
tFetcher.java:42)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.internal.mysql.MySQLProtocol.executeQuery(MySQLProtocol.java:45
0)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   ... 8 more
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,609 [cookbook - 
pool-2-thread-1] WARN  management.OpenReplicatorManager Performing emergency 
service shutdown
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,609 [cookbook - 
pool-2-thread-1] INFO  pipeline.Pipeline Shutting down pipeline: master
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,609 [cookbook - 
binlog-to-q-0] WARN  pipeline.SingleThreadStageTask Received unexpected 
interrupt in stage task: binlog-to-q
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,609 [cookbook - 
binlog-to-q-0] INFO  pipeline.SingleThreadStageTask Terminating processing for 
stage task thread
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,609 [cookbook - 
binlog-to-q-0] INFO  pipeline.SingleThreadStageTask Last successfully processed 
event prior to termination: seqno=24 
eventid=mysql-bin.000005:0000000000000274;40
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,609 [cookbook - 
binlog-to-q-0] INFO  pipeline.SingleThreadStageTask Task event count: 23
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,609 [cookbook - 
q-to-thl-0] INFO  pipeline.SingleThreadStageTask Terminating processing for 
stage task thread
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,609 [cookbook - 
q-to-thl-0] INFO  pipeline.SingleThreadStageTask Last successfully processed 
event prior to termination: seqno=24 
eventid=mysql-bin.000005:0000000000000274;40
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,609 [cookbook - 
q-to-thl-0] INFO  pipeline.SingleThreadStageTask Task event count: 23
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,609 [cookbook - 
pool-2-thread-1] INFO  pipeline.Pipeline Releasing pipeline: master
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,609 [cookbook - 
pool-2-thread-1] INFO  pipeline.StageTaskGroup Releasing tasks for stage: 
binlog-to-q
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,609 [cookbook - 
pool-2-thread-1] INFO  extractor.ExtractorWrapper Releasing raw extractor and 
heartbeat filter
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,610 [cookbook - 
pool-2-thread-1] INFO  pipeline.StageTaskGroup Releasing tasks for stage: 
q-to-thl
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,610 [cookbook - 
pool-2-thread-1] INFO  thl.Server Stopping server thread
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,610 [cookbook - 
THL Server [thl:0.0.0.0:2112]] INFO  thl.Server Server thread cancelled
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,610 [cookbook - 
THL Server [thl:0.0.0.0:2112]] INFO  thl.Server Closing connector handlers for 
THL Server: store=thl
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,610 [cookbook - 
THL Server [thl:0.0.0.0:2112]] WARN  thl.ConnectorHandler Connector handler 
stop operation was interrupted
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,610 [cookbook - 
connector-handler-ubuntu2] INFO  thl.ConnectorHandler Connector handler 
cancelled
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,611 [cookbook - 
connector-handler-ubuntu2] INFO  thl.ConnectorHandler Terminating THL client 
connection from source ID: ubuntu2
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,611 [cookbook - 
connector-handler-ubuntu3] INFO  thl.ConnectorHandler Connector handler 
cancelled
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,611 [cookbook - 
connector-handler-ubuntu3] INFO  thl.ConnectorHandler Terminating THL client 
connection from source ID: ubuntu3
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,611 [cookbook - 
THL Server [thl:0.0.0.0:2112]] INFO  thl.Server Closing socket: store=thl 
host=/0.0.0.0:2112 port=2112
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,611 [cookbook - 
THL Server [thl:0.0.0.0:2112]] INFO  thl.Server THL thread done: store=thl
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,612 [cookbook - 
pool-2-thread-1] WARN  database.AbstractDatabase Unable to close connection
INFO   | jvm 1    | 2014/10/27 04:31:04 | 
java.sql.SQLNonTransientConnectionException: Could not close connection: Broken 
pipe
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.internal.SQLExceptionMapper.get(SQLExceptionMapper.java:97)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.DrizzleConnection.close(DrizzleConnection.java:235)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.database.AbstractDatabase.disconnect(Abstract
Database.java:209)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.database.AbstractDatabase.close(AbstractDatab
ase.java:930)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.thl.THL.release(THL.java:544)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.conf.ReplicatorRuntime.releasePlugin(Replicat
orRuntime.java:1331)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.pipeline.Pipeline.release(Pipeline.java:303)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.conf.ReplicatorRuntime.release(ReplicatorRunt
ime.java:658)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.management.tungsten.TungstenPlugin.doShutdown
(TungstenPlugin.java:1341)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.management.tungsten.TungstenPlugin.offline(Tu
ngstenPlugin.java:606)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.replicator.management.OpenReplicatorManager$ErrorShutdow
nAction.doAction(OpenReplicatorManager.java:1140)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.fsm.core.StateMachine.applyEvent(StateMachine.java:265)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
com.continuent.tungsten.fsm.event.EventProcessor.run(EventProcessor.java:78)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:508)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
java.util.concurrent.FutureTask.run(FutureTask.java:266)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
java.lang.Thread.run(Thread.java:724)
INFO   | jvm 1    | 2014/10/27 04:31:04 | Caused by: 
org.drizzle.jdbc.internal.common.QueryException: Could not close connection: 
Broken pipe
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.internal.mysql.MySQLProtocol.close(MySQLProtocol.java:268)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.DrizzleConnection.close(DrizzleConnection.java:233)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   ... 16 more
INFO   | jvm 1    | 2014/10/27 04:31:04 | Caused by: java.net.SocketException: 
Broken pipe
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
java.net.SocketOutputStream.socketWrite0(Native Method)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
java.net.SocketOutputStream.write(SocketOutputStream.java:153)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.internal.common.packet.commands.ClosePacket.send(ClosePacket.ja
va:47)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   at 
org.drizzle.jdbc.internal.mysql.MySQLProtocol.close(MySQLProtocol.java:262)
INFO   | jvm 1    | 2014/10/27 04:31:04 |   ... 17 more
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,613 [cookbook - 
pool-2-thread-1] INFO  datasource.SqlDataSource Attempting to reduce catalog 
data: data source=global
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,619 [cookbook - 
pool-2-thread-1] INFO  datasource.SqlCommitSeqno Reduced 0 task entries: 
tungsten_cookbook.trep_commit_seqno
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,619 [cookbook - 
pool-2-thread-1] INFO  management.OpenReplicatorManager All internal services 
are shut down; replicator ready for recovery
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,619 [cookbook - 
pool-2-thread-1] INFO  management.OpenReplicatorManager Sent State Change 
Notification ONLINE -> OFFLINE:ERROR
INFO   | jvm 1    | 2014/10/27 04:31:04 | 2014-10-27 04:31:04,620 [cookbook - 
pool-2-thread-1] WARN  management.OpenReplicatorManager Received irrelevant 
event for current state: state=OFFLINE:ERROR event=OfflineNotification

Original comment by g.maxia on 27 Oct 2014 at 4:35

GoogleCodeExporter commented 9 years ago
That behavior is expected.  The replicator cannot update the trep_commit_seqno 
position while the DBMS is down. 

You should be able to suppress it using the following option to false: 

# Allow this replication service to keep on storing THL events even if it failed
# to store its position into database. This is not activated by default and 
# does require that THL syncTHLWithExtractor property is set to true (master 
case).
replicator.store.thl.stopOnDBError=false

When stopOnDBError is false, the THL code ignores errors that occur when trying 
to update the trep_commit_seqno.  You should be able to do this with the 
following tpm option: 

--property=replicator.store.thl.stopOnDBError=false

Original comment by robert.h...@continuent.com on 27 Oct 2014 at 4:42

GoogleCodeExporter commented 9 years ago
Robert,
Thanks for the update. There are still issues, though.
The replicator is not trying to update trep_commit_seqno while the DBMS is down.

The sequence of events is:
trepctl properties -filter stopOnDBError
{"replicator.store.thl.stopOnDBError":"false"}

trepctl status -> shows ONLINE
# stop mysql
trepctl status -> shows ONLINE
mysql commit something
trepctl status -> shows OFFLINE:ERROR

I repeated the test after updating the replicator with stopOnDBError=false, but 
the behavior is the same.

Original comment by g.maxia on 27 Oct 2014 at 4:55

GoogleCodeExporter commented 9 years ago
This appears to be an unrelated error as trepctl status is no longer failing as 
before.  I would recommend filing a new Issue.  It appears that stopOnDBError 
is perhaps not working.  

Original comment by robert.h...@continuent.com on 27 Oct 2014 at 5:02

GoogleCodeExporter commented 9 years ago
The issue as defined initially is fixed. 
When the replicator comes back online, there are problems that are unrelated 
with this matter, that will be resolved with a separate issue

Original comment by g.maxia on 27 Oct 2014 at 11:15

GoogleCodeExporter commented 9 years ago

Original comment by mc.br...@continuent.com on 28 Oct 2014 at 11:54