epermana / tungsten-replicator

Automatically exported from code.google.com/p/tungsten-replicator
1 stars 0 forks source link

Stage task failed: stage=d-pq-to-dbms StringIndexOutOfBoundsException #1077

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

1. Upgrade 2.2.0 (direct mode) to 3.0.0 build 524 (direct mode)
2. Enable channels > 1 on service replicating multiple schemas from MySQL 
5.6.16 to 5.6.22
3. Wait between 1 and 20 seconds until the errors occurs

What is the expected output?
Multiple channels applying on the slave servers.

What do you see instead?

INFO   | jvm 1    | 2014/12/19 10:19:00 | 2014-12-19 10:19:00,894 
[xbn03toxbndw03 - d-pq-to-dbms-1] ERROR pipeline.SingleThreadStageTask Stage 
task failed: stage=d-pq-to-dbms seqno=7225926729 fragno=0
INFO   | jvm 1    | 2014/12/19 10:19:00 | 
java.lang.StringIndexOutOfBoundsException: String index out of range: -14
INFO   | jvm 1    | 2014/12/19 10:19:00 |       at 
java.lang.String.substring(String.java:1938)
INFO   | jvm 1    | 2014/12/19 10:19:00 |       at 
com.continuent.tungsten.replicator.database.MySQLOperationStringBuilder.peek(MyS
QLOperationStringBuilder.java:181)
INFO   | jvm 1    | 2014/12/19 10:19:00 |       at 
com.continuent.tungsten.replicator.database.MySQLOperationStringBuilder.build(My
SQLOperationStringBuilder.java:124)
INFO   | jvm 1    | 2014/12/19 10:19:00 |       at 
com.continuent.tungsten.replicator.database.MySQLOperationMatcher.match(MySQLOpe
rationMatcher.java:281)
INFO   | jvm 1    | 2014/12/19 10:19:00 |       at 
com.continuent.tungsten.replicator.database.SqlStatementParser.parse(SqlStatemen
tParser.java:79)
INFO   | jvm 1    | 2014/12/19 10:19:00 |       at 
com.continuent.tungsten.replicator.filter.ReplicateFilter.filter(ReplicateFilter
.java:180)
INFO   | jvm 1    | 2014/12/19 10:19:00 |       at 
com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.runTask(Single
ThreadStageTask.java:381)
INFO   | jvm 1    | 2014/12/19 10:19:00 |       at 
com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.run(SingleThre
adStageTask.java:179)
INFO   | jvm 1    | 2014/12/19 10:19:00 |       at 
java.lang.Thread.run(Thread.java:619)
INFO   | jvm 1    | 2014/12/19 10:19:00 | 2014-12-19 10:19:00,896 
[xbn03toxbndw03 - d-pq-to-dbms-1] INFO  pipeline.SingleThreadStageTask 
Performing emergency rollback of applied changes
INFO   | jvm 1    | 2014/12/19 10:19:01 | 2014-12-19 10:19:01,322 
[xbn03toxbndw03 - d-pq-to-dbms-1] INFO  pipeline.SingleThreadStageTask 
Dispatching error event: Stage task failed: stage=d-pq-to-dbms seqno=7225926729 
fragno=0
INFO   | jvm 1    | 2014/12/19 10:19:01 | 2014-12-19 10:19:01,323 
[xbn03toxbndw03 - d-pq-to-dbms-1] INFO  pipeline.SingleThreadStageTask 
Terminating processing for stage task thread
INFO   | jvm 1    | 2014/12/19 10:19:01 | 2014-12-19 10:19:01,323 
[xbn03toxbndw03 - d-pq-to-dbms-1] INFO  pipeline.SingleThreadStageTask Last 
successfully processed event prior to termination: seqno=7225926729 
eventid=mysql-bin.521642:00
00000002964715;3628392
INFO   | jvm 1    | 2014/12/19 10:19:01 | 2014-12-19 10:19:01,323 
[xbn03toxbndw03 - d-pq-to-dbms-1] INFO  pipeline.SingleThreadStageTask Task 
event count: 50857
INFO   | jvm 1    | 2014/12/19 10:19:01 | 2014-12-19 10:19:01,324 
[xbn03toxbndw03 - pool-2-thread-1] ERROR management.OpenReplicatorManager 
Received error notification, shutting down services :
INFO   | jvm 1    | 2014/12/19 10:19:01 | Stage task failed: stage=d-pq-to-dbms 
seqno=7225926729 fragno=0
INFO   | jvm 1    | 2014/12/19 10:19:01 | 
java.lang.StringIndexOutOfBoundsException: String index out of range: -14
INFO   | jvm 1    | 2014/12/19 10:19:01 |       at 
java.lang.String.substring(String.java:1938)
INFO   | jvm 1    | 2014/12/19 10:19:01 |       at 
com.continuent.tungsten.replicator.database.MySQLOperationStringBuilder.peek(MyS
QLOperationStringBuilder.java:181)
INFO   | jvm 1    | 2014/12/19 10:19:01 |       at 
com.continuent.tungsten.replicator.database.MySQLOperationStringBuilder.build(My
SQLOperationStringBuilder.java:124)
INFO   | jvm 1    | 2014/12/19 10:19:01 |       at 
com.continuent.tungsten.replicator.database.MySQLOperationMatcher.match(MySQLOpe
rationMatcher.java:281)
INFO   | jvm 1    | 2014/12/19 10:19:01 |       at 
com.continuent.tungsten.replicator.database.SqlStatementParser.parse(SqlStatemen
tParser.java:79)
INFO   | jvm 1    | 2014/12/19 10:19:01 |       at 
com.continuent.tungsten.replicator.filter.ReplicateFilter.filter(ReplicateFilter
.java:180)
INFO   | jvm 1    | 2014/12/19 10:19:01 |       at 
com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.runTask(Single
ThreadStageTask.java:381)
INFO   | jvm 1    | 2014/12/19 10:19:01 |       at 
com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.run(SingleThre
adStageTask.java:179)
INFO   | jvm 1    | 2014/12/19 10:19:01 |       at 
java.lang.Thread.run(Thread.java:619)

And the sequence looks like:
thl list -seqno 7225926729
SEQ# = 7225926729 / FRAG# = 0 (last frag)
- TIME = 2014-12-17 23:31:52.0
- EPOCH# = 7225537146
- EVENTID = mysql-bin.521642:0000000002964715;3628392
- SOURCEID = 172.30.*.*
- METADATA = 
[mysql_server_id=40;dbms_type=mysql;service=SERVICENAME;shard=shard09]
- TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent
- OPTIONS = [##charset = ISO8859_1, autocommit = 1, sql_auto_is_null = 0, 
foreign_key_checks = 1, unique_checks = 1, sql_mode = 'NO_ENGINE_SUBSTITUTION', 
character_set_client = 8, collation_connection = 8, collation_server = 8]
- SCHEMA = shard09
- SQL(0) = insert ignore into com_status_1203 (node_id, log_date, gw_time, 
eth_udp, eth_tcp, eth_http, pstn_dtmf, pstn_data, gsm_dtmf, gsm_data, gsm_sms, 
gprs_udp, gprs_tcp, replication_time)
        values ( NAME_CONST('p_node_id',17237109),  NAME_CONST('p_log_date',_latin1'2014-12-17 22:31:52' COLLATE 'latin1_swedish_ci'),  NAME_CONST('p_gw_time',_latin1'2014-12-17 22:31:43' COLLATE 'latin1_swedish_ci'),  NAME_CONST('p_eth_udp',3),  NAME_CONST('p_eth_tcp',0),  NAME_CONST('p_eth_http',0),  NAME_CONST('p_pstn_dtmf',0),  NAME_CONST('p_pstn_data',0),  NAME_CONST('p_gsm_dtmf',0),  NAME_CONST('p_gsm_data',5),  NAME_CONST('p_gsm_sms',5),  NAME_CONST('p_gprs_udp',5),  NAME_CONST('p_gprs_tcp',0), utc_timestamp())

What version of the product are you using?
3.0.0 524 

On what operating system?
OpenSuse 11.1

Please provide any additional information below.

Notes:

When I have been looking at this I have changed the number of channels that is 
used and when --channel=1 it works. If I change channels to > 1 it stops within 
a couple of minutes or seconds if I set it > 4. On the servers replicating only 
two schemas (where one is almost no writes) I can set --channels=2 and it 
works. But on a server with five schemas and about the same amount of writes on 
each schema (total 2K-3K inserts/updates per second on all five) the replicator 
crashes with these errors described.
Our problem is that we replicating five (or more) schemas from multiple servers 
(one direct replication set up for each server still) so with only 1 channel 
the replication will not catch up. When running 2.2.0 we had other issues but 
at least the parallel applier worked for us.

Original issue reported on code.google.com by olofmatt...@gmail.com on 22 Dec 2014 at 9:09