zendesk / maxwell

Maxwell's daemon, a mysql-to-json kafka producer
https://maxwells-daemon.io/
Other
4k stars 1.01k forks source link

maxwell producer failed #867

Closed saup007 closed 6 years ago

saup007 commented 6 years ago

MySQL 5.5 binlog_format=ROW kafka_version 0.9.0.1 maxwell 1.13.0

[root@10 maxwell-1.13.0]# grep '^[^$#]'  config.properties
log_level=info
producer=kafka
kafka.bootstrap.servers=localhost:9092
host=localhost
user=maxwell
password=maxwell
kafka.batch.size=16384
linger.ms=0
kafka.metadata.fetch.timeout.ms=5000
kafka.retries=0
kafka.acks=1
producer_partition_by=table
kinesis_stream=maxwell

bin/maxwell --user=super --password=password --host=mysql_host_medata --port=3988 --producer=kafka --kafka.bootstrap.servers=kfk9:19504 --kafka_topic=mysqltokafka3312_test --output_ddl=true --ddl_kafka_topic=mysqltokafka3312_test --schema_database=mysqltokafka3312 --replication_host=mysql_host_rep --replication_port=3312 --replication_user=mysqlha_user --replication_password='xxxxxxxx' --output_binlog_position=true --include_dbs=db_lica --client_id=101 --replica_server_id=1001 --binlog_connector=true --kafka_version=0.9.0.1

20:59 maxwell process Running, Maxwell producer Running 22:14 maxwell process Running, Maxwell producer stop;

log

Using kafka version: 0.9.0.1
20:59:02,073 WARN  MaxwellConfig - disabling bootstrapping; not available when using a separate replication host.
20:59:02,089 WARN  MaxwellMetrics - Metrics will not be exposed: metricsReportingType not configured.
20:59:02,608 INFO  ProducerConfig - ProducerConfig values:
        compression.type = none
        metric.reporters = []
        metadata.max.age.ms = 300000
        metadata.fetch.timeout.ms = 5000
        reconnect.backoff.ms = 50
        sasl.kerberos.ticket.renew.window.factor = 0.8
        bootstrap.servers = [kfk9:19504]
        retry.backoff.ms = 100
        sasl.kerberos.kinit.cmd = /usr/bin/kinit
        buffer.memory = 33554432
        timeout.ms = 30000
        key.serializer = class org.apache.kafka.common.serialization.StringSerializer
        sasl.kerberos.service.name = null
        sasl.kerberos.ticket.renew.jitter = 0.05
        ssl.keystore.type = JKS
        ssl.trustmanager.algorithm = PKIX
        block.on.buffer.full = false
        ssl.key.password = null
        max.block.ms = 60000
        sasl.kerberos.min.time.before.relogin = 60000
        connections.max.idle.ms = 540000
        ssl.truststore.password = null
        max.in.flight.requests.per.connection = 5
        metrics.num.samples = 2
        client.id =
        ssl.endpoint.identification.algorithm = null
        ssl.protocol = TLS
        request.timeout.ms = 30000
        ssl.provider = null
        ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
        acks = 1
        batch.size = 16384
        ssl.keystore.location = null
        receive.buffer.bytes = 32768
        ssl.cipher.suites = null
        ssl.truststore.type = JKS
        security.protocol = PLAINTEXT
        retries = 0
        max.request.size = 1048576
        value.serializer = class org.apache.kafka.common.serialization.StringSerializer
        ssl.truststore.location = null
        ssl.keystore.password = null
        ssl.keymanager.algorithm = SunX509
        metrics.sample.window.ms = 30000
        partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
        send.buffer.bytes = 131072
        linger.ms = 0

20:59:02,617 WARN  KafkaProducer - metadata.fetch.timeout.ms config is deprecated and will be removed soon. Please use max.block.ms
20:59:02,666 INFO  AppInfoParser - Kafka version : 0.9.0.1
20:59:02,667 INFO  AppInfoParser - Kafka commitId : 23c69d62a0cabf06
20:59:02,697 INFO  Maxwell - Maxwell v?? is booting (MaxwellKafkaProducer), starting at Position[BinlogPosition[mysql-bin.001464:111801019], lastHeartbeat=0]
20:59:02,908 INFO  MysqlSavedSchema - Restoring schema id 2 (last modified at Position[BinlogPosition[mysql-bin.001464:93163683], lastHeartbeat=0])
20:59:03,075 INFO  MysqlSavedSchema - Restoring schema id 1 (last modified at Position[BinlogPosition[mysql-bin.001464:1303454], lastHeartbeat=0])
20:59:03,231 INFO  MysqlSavedSchema - beginning to play deltas...
20:59:03,232 INFO  MysqlSavedSchema - played 1 deltas in 1ms
20:59:03,262 INFO  BinlogConnectorReplicator - Setting initial binlog pos to: mysql-bin.001464:111801019
20:59:03,326 INFO  BinaryLogClient - Connected to mysql_host_rep:3312 at mysql-bin.001464/111801019 (sid:1001, cid:28964052)
20:59:03,326 INFO  BinlogConnectorLifecycleListener - Binlog connected.
20:59:03,490 WARN  NetworkClient - Error while fetching metadata with correlation id 0 : {mysqltokafka3312_test=LEADER_NOT_AVAILABLE}
20:59:03,686 WARN  NetworkClient - Error while fetching metadata with correlation id 1 : {mysqltokafka3312_test=LEADER_NOT_AVAILABLE}
20:59:03,796 WARN  NetworkClient - Error while fetching metadata with correlation id 2 : {mysqltokafka3312_test=LEADER_NOT_AVAILABLE}
20:59:03,900 WARN  NetworkClient - Error while fetching metadata with correlation id 3 : {mysqltokafka3312_test=LEADER_NOT_AVAILABLE}
22:00:55,640 ERROR MaxwellKafkaProducer - NetworkException @ Position[BinlogPosition[mysql-bin.001464:211987697], lastHeartbeat=0] -- {"database":"db_licai","table":"lcs_stock_plate_symbol","pk.id":147983
71}
22:00:55,640 ERROR MaxwellKafkaProducer - The server disconnected before a response was received.
22:00:55,640 ERROR MaxwellKafkaProducer - NetworkException @ Position[BinlogPosition[mysql-bin.001464:211987448], lastHeartbeat=0] -- {"database":"db_licai","table":"lcs_stock_plate_symbol","pk.id":147983
70}
22:00:55,640 ERROR MaxwellKafkaProducer - The server disconnected before a response was received.
22:00:55,640 ERROR MaxwellKafkaProducer - NetworkException @ Position[BinlogPosition[mysql-bin.001464:211987196], lastHeartbeat=0] -- {"database":"db_licai","table":"lcs_stock_plate_symbol","pk.id":147983
69}
22:00:55,640 ERROR MaxwellKafkaProducer - The server disconnected before a response was received.
22:00:55,641 ERROR MaxwellKafkaProducer - NetworkException @ Position[BinlogPosition[mysql-bin.001464:211986944], lastHeartbeat=0] -- {"database":"db_licai","table":"lcs_stock_plate_symbol","pk.id":147983
68}
22:00:55,641 ERROR MaxwellKafkaProducer - The server disconnected before a response was received.
22:00:55,641 ERROR MaxwellKafkaProducer - NetworkException @ Position[BinlogPosition[mysql-bin.001464:211986692], lastHeartbeat=0] -- {"database":"db_licai","table":"lcs_stock_plate_symbol","pk.id":147983
67}
22:00:55,641 ERROR MaxwellKafkaProducer - The server disconnected before a response was received.
22:00:55,641 ERROR MaxwellKafkaProducer - TimeoutException @ Position[BinlogPosition[mysql-bin.001464:211987946], lastHeartbeat=0] -- {"database":"db_licai","table":"lcs_stock_plate_symbol","pk.id":147983
72}
...
...
...
22:00:55,748 ERROR MaxwellKafkaProducer - Batch Expired
22:00:55,748 ERROR MaxwellKafkaProducer - TimeoutException @ Position[BinlogPosition[mysql-bin.001464:212248012], lastHeartbeat=0] -- {"database":"db_licai","table":"lcs_stock_plate_symbol","pk.id":14799356}
22:00:55,748 ERROR MaxwellKafkaProducer - Batch Expired
22:00:55,748 ERROR MaxwellKafkaProducer - TimeoutException @ Position[BinlogPosition[mysql-bin.001464:212248265], lastHeartbeat=0] -- {"database":"db_licai","table":"lcs_stock_plate_symbol","pk.id":14799357}
22:00:55,748 ERROR MaxwellKafkaProducer - Batch Expired
22:14:27,244 ERROR Sender - Uncaught error in kafka producer I/O thread:
java.lang.IllegalStateException: Correlation id for response (83086) does not match request (83078)
        at org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:477) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:440) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:265) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:216) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128) [kafka-clients-0.9.0.1.jar:?]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
22:14:31,289 ERROR Sender - Uncaught error in kafka producer I/O thread:
java.lang.IllegalStateException: Correlation id for response (83097) does not match request (83086)
        at org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:477) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:440) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:265) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:216) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128) [kafka-clients-0.9.0.1.jar:?]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
22:14:31,446 ERROR Sender - Uncaught error in kafka producer I/O thread:
java.lang.IllegalStateException: Correlation id for response (83098) does not match request (83097)
        at org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:477) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:440) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:265) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:216) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128) [kafka-clients-0.9.0.1.jar:?]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
22:14:31,447 ERROR Sender - Uncaught error in kafka producer I/O thread:
java.lang.IllegalStateException: Correlation id for response (83100) does not match request (83098)
        at org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:477) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:440) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:265) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:216) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128) [kafka-clients-0.9.0.1.jar:?]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
...
...
...
22:16:31,880 ERROR MaxwellKafkaProducer - NetworkException @ Position[BinlogPosition[mysql-bin.001464:227959015], lastHeartbeat=0] -- {"database":"db_licai","table":"lcs_user_2","pk.id":1320783}
22:16:31,881 ERROR MaxwellKafkaProducer - The server disconnected before a response was received.
22:16:31,881 ERROR MaxwellKafkaProducer - NetworkException @ Position[BinlogPosition[mysql-bin.001464:227960753], lastHeartbeat=0] -- {"database":"db_licai","table":"lcs_symbol_relation","pk.id":5950029}
22:16:31,881 ERROR MaxwellKafkaProducer - The server disconnected before a response was received.
22:16:31,881 ERROR MaxwellKafkaProducer - NetworkException @ Position[BinlogPosition[mysql-bin.001464:227959246], lastHeartbeat=0] -- {"database":"db_licai","table":"lcs_symbol_relation","pk.id":5950028}
22:16:31,881 ERROR MaxwellKafkaProducer - The server disconnected before a response was received.
22:16:31,881 ERROR MaxwellKafkaProducer - NetworkException @ Position[BinlogPosition[mysql-bin.001464:227929749], lastHeartbeat=0] -- {"database":"db_licai","table":"lcs_user_8","pk.id":1445142}
22:16:31,881 ERROR MaxwellKafkaProducer - The server disconnected before a response was received.
22:23:03,349 INFO  BinaryLogClient - Trying to restore lost connection to mysql_host_rep:3312

Three , no change in position

+-----------+------------------+-----------------+----------+-----------+--------------+---------------------+
| server_id | binlog_file      | binlog_position | gtid_set | client_id | heartbeat_at | last_heartbeat_read |
+-----------+------------------+-----------------+----------+-----------+--------------+---------------------+
| 301403312 | mysql-bin.001464 |       222500083 | NULL     | 101       |         NULL |                   0 |
+-----------+------------------+-----------------+----------+-----------+--------------+---------------------+
1 row in set (0.00 sec)

mysql@mysqltokafka3312>select * from `positions`;
+-----------+------------------+-----------------+----------+-----------+--------------+---------------------+
| server_id | binlog_file      | binlog_position | gtid_set | client_id | heartbeat_at | last_heartbeat_read |
+-----------+------------------+-----------------+----------+-----------+--------------+---------------------+
| 301403312 | mysql-bin.001464 |       222500083 | NULL     | 101       |         NULL |                   0 |
+-----------+------------------+-----------------+----------+-----------+--------------+---------------------+
1 row in set (0.00 sec)

mysql@.mysqltokafka3312>select * from `positions`;
+-----------+------------------+-----------------+----------+-----------+--------------+---------------------+
| server_id | binlog_file      | binlog_position | gtid_set | client_id | heartbeat_at | last_heartbeat_read |
+-----------+------------------+-----------------+----------+-----------+--------------+---------------------+
| 301403312 | mysql-bin.001464 |       222500083 | NULL     | 101       |         NULL |                   0 |
+-----------+------------------+-----------------+----------+-----------+--------------+---------------------+
1 row in set (0.00 sec)
osheroff commented 6 years ago

newer kafka clients are incompatible with older kafka servers. If your kafka cluster is running 0.8.1 please pass --kafka_version=0.8.2.2 to maxwell.

saup007 commented 6 years ago

@osheroff my kafka cluster 0.9.0.1 This is very strangešŸ˜‚

--kafka_version=0.9.0.1

osheroff commented 6 years ago

ah, ok. then, you should investigate:

20:59:03,490 WARN  NetworkClient - Error while fetching metadata with correlation id 0 : {mysqltokafka3312_test=LEADER_NOT_AVAILABLE}

I believe there's a problem with the setup of your kafka cluster.

saup007 commented 6 years ago

@osheroff first start maxwell, kafka cluster not exists topic,Once again there is no such warning.

Now a temporary solution, check the position of the table, 10 minutes no change, kill maxwell process and start again

ericwush commented 6 years ago

@saup007 do you configure kafka to create topic if it doesn't exist?

# disable auto-creation of topics
auto.create.topics.enable=false|true

If not, the topic has to be created before maxwell being started.

saup007 commented 6 years ago

@osheroff @ericwush thanks.

My focus is not whether to automatically create topics. that Maxwell does not continue to parse mysql binlog and write to kafka cluster.

May be maxwell access mysql network problems, maxwell process still exists, but there is no connection mysql master, and can not see in the mysql master sent to maxwell binlog Datadump thread

Is there always a retry strategy for connecting to MySQL or writing to kafka cluster? There may be a few seconds of network problems, not serious, if there is a sound retry strategy, is still available in a weak network

03:45:03,456 ERROR Sender - Uncaught error in kafka producer I/O thread:
java.lang.IllegalStateException: Correlation id for response (713518) does not match request (713517)
        at org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:477) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:440) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:265) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:216) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128) [kafka-clients-0.9.0.1.jar:?]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
03:45:03,456 ERROR Sender - Uncaught error in kafka producer I/O thread:
java.lang.IllegalStateException: Correlation id for response (713519) does not match request (713518)
        at org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:477) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:440) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:265) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:216) ~[kafka-clients-0.9.0.1.jar:?]
        at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128) [kafka-clients-0.9.0.1.jar:?]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
03:45:33,478 ERROR MaxwellKafkaProducer - NetworkException @ Position[BinlogPosition[mysql-bin.001466:621287663], lastHeartbeat=0] -- {"database":"db_licai","table":"lcs_view","pk.id":703273}
03:45:33,478 ERROR MaxwellKafkaProducer - The server disconnected before a response was received.
03:45:33,478 ERROR MaxwellKafkaProducer - NetworkException @ Position[BinlogPosition[mysql-bin.001466:616715881], lastHeartbeat=0] -- {"database":"db_licai","table":"lcs_symbol_relation","pk.id":5993756}
03:45:33,478 ERROR MaxwellKafkaProducer - The server disconnected before a response was received.
03:45:33,478 ERROR MaxwellKafkaProducer - NetworkException @ Position[BinlogPosition[mysql-bin.001466:616201578], lastHeartbeat=0] -- {"database":"db_licai","table":"lcs_cron_log","pk.id":4018836}
03:45:33,478 ERROR MaxwellKafkaProducer - The server disconnected before a response was received.
04:01:06,158 INFO  BinaryLogClient - Trying to restore lost connection to mysql_host_rep:3312
ericwush commented 6 years ago

I don't quite get what you meant by

first start maxwell, kafka cluster not exists topic

Anyways your log shows it's kafka related. If it's binlog connection issue it should error out earlier and you wouldn't see kafka errors. LEADER_NOT_AVAILABLE is a retriable error and you really should consult kafka docs to configure for your needs. In this case, consider kafka.retries=50 in maxwell's config.properties to see if it helps. If yes, tune it to the value that suits your needs.

osheroff commented 6 years ago

@ericwush's suggestion to raise kafka retries in the face of a junky network is probably fine. feel free to re-open if this doesn't help.

saup007 commented 6 years ago

May be Kafka Bug https://issues.apache.org/jira/browse/KAFKA-4669