zendesk / maxwell

Maxwell's daemon, a mysql-to-json kafka producer
https://maxwells-daemon.io/

I use Maxwell to synchronize MySQL data to Redis, but there is a delay of nearly 30 minutes. How can I improve Maxwell's performance? #2087

Open LPeng111 opened 3 months ago

LPeng111 commented 3 months ago

I use Maxwell to synchronize MySQL data to Redis, but there is a delay of nearly 30 minutes. How can I improve Maxwell's performance?

Here is my configuration.

# tl;dr config
#     *** general ***
# choose where to produce data to. stdout|file|kafka|kinesis|pubsub|sqs|rabbitmq|redis
producer=redis

# set the log level.  note that you can configure things further in log4j2.xml
#log_level=INFO # [DEBUG, INFO, WARN, ERROR]
log_level=ERROR

#     *** mysql ***

# mysql host to connect to
host=xxx

# mysql port to connect to
port=3307

# mysql user to connect as.  This user must have REPLICATION SLAVE permissions,
# as well as full access to the `maxwell` (or schema_database) database
user=maxwell_user

# mysql password
password=xxx

# options to pass into the jdbc connection, given as opt=val&opt2=val2
#jdbc_options=opt1=100&opt2=hello

# name of the mysql database where maxwell keeps its own state
schema_database=maxwell

replication_host=xxx
replication_user=maxwell_user
replication_password=xxx
replication_port=3307

# records include row query, binlog option "binlog_rows_query_log_events" must be enabled (default false)
output_row_query=false

# This controls whether maxwell will output JSON information containing
# DDL (ALTER/CREATE TABLE/ETC) information. (default: false)
# See also: ddl_kafka_topic
output_ddl=false

#           *** redis ***

redis_host=xxx
redis_port=xxx
redis_auth=xxx
redis_database=0

# name of pubsub/list/whatever key to publish to
redis_key=xxx

# Valid values for redis_type = pubsub|lpush. Defaults to pubsub

redis_type=lpush

#bootstrapper=async [sync, async, none]
bootstrapper=sync

# output filename when using the "file" producer
#output_file=/path/to/file

osheroff commented 3 months ago

How much data? Redis is very likely the bottleneck here.
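
One way to see where the delay accumulates is to compare the `ts` field of each Maxwell event with the time the consumer takes it off the Redis list. The following is a minimal sketch, assuming the event is Maxwell's JSON output with `ts` in epoch seconds; the function name and the sample payload are hypothetical and not part of Maxwell.

```python
import json
import time


def event_lag_seconds(raw_event, now=None):
    """Return how many seconds old a Maxwell event is.

    raw_event: one JSON string as popped from the Redis list.
    now: reference time in epoch seconds (defaults to the current time).
    Assumes the event carries a `ts` field in epoch seconds.
    """
    event = json.loads(raw_event)
    if now is None:
        now = time.time()
    return now - event["ts"]


# Hypothetical sample event; the database, table, and data values are made up.
sample = (
    '{"database": "shop", "table": "orders", "type": "insert", '
    '"ts": 1700000000, "data": {"id": 1}}'
)

# An event read 1800 seconds after its timestamp is 30 minutes behind.
lag = event_lag_seconds(sample, now=1700001800)
```

If the lag is already large for an event that was just pushed, the delay is upstream of Redis; if it grows while the event waits in the list, the consumer is the slow side.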

LPeng111 commented 3 months ago

About 500 rows per second. My goal is to periodically transfer data that meets certain criteria to Redis.
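
To put the numbers in this thread together: at about 500 rows per second, a 30-minute delay corresponds to a backlog of roughly 900,000 events. The sketch below works through that arithmetic, assuming a constant arrival rate and one event per row; the function names and the drain rate of 1000 events per second are hypothetical.

```python
def backlog_events(rows_per_second, delay_minutes):
    """Events waiting if the pipeline is delay_minutes behind a constant input rate."""
    return int(rows_per_second * delay_minutes * 60)


def catch_up_seconds(backlog, drain_per_second, arrival_per_second):
    """Seconds needed to clear the backlog, or None if it can never be cleared.

    The backlog only shrinks when the pipeline drains faster than rows arrive.
    """
    if drain_per_second <= arrival_per_second:
        return None
    return backlog / (drain_per_second - arrival_per_second)


backlog = backlog_events(500, 30)                 # 500 * 30 * 60 = 900000
catch_up = catch_up_seconds(backlog, 1000, 500)   # 900000 / 500 = 1800.0 seconds
```

In other words, a pipeline that only just keeps pace with 500 rows per second never recovers from an existing delay; it has to run faster than the input rate to catch up.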

ZeeshanIbrahim1 commented 3 months ago

The problem may be the total number of databases and tables on the MySQL host. Maxwell reads the schemas on the server and stores a copy in its own tables (the maxwell database in your configuration).

On a host with many databases or very large schemas, as may be the case for your MySQL host, this initialization can delay Maxwell at startup. Once it completes, however, Maxwell goes on to read the binary log in real time.