Open lingniao2000 opened 6 years ago
I know that pypy offer better speed performance
Le lun. 12 févr. 2018 à 07:42, lingniao2000 notifications@github.com a écrit :
The last two months, we have been debugging pg_chameleon. It uses python-mysql-replication, the current problems encountered is python-mysql-replication occupy 100% CPU, network recv-Q queue backlog. We through the line-profiler Observe that the two most important statements take the most time
for binlogevent in my_stream: for row in binlogevent.rows:
Please help us, we have no other way. Details are as follows(with cut down):
2018-02-11 11:50:22 read_replica DEBUG mysql_lib.py (1087): Master coordinates: {'File': 'mysql-bin.0 Timer unit: 1e-06 s
Total time: 628.137 s File: /root/chameleon/lib/python3.5/site-packages/pg_chameleon/lib/mysql_lib.py Function: __read_replica_stream at line 881
Line # Hits Time Per Hit % Time Line Contents
881 def __read_replica_stream(self, batch_data): 911 1 62448.0 62448.0 0.0 sql_tokeniser = sql_token() 912 1 1866884.0 1866884.0 0.3 table_type_map = self.get_table_type_map()
933 1 420.0 420.0 0.0 self.logger.debug("log_file %s, log_position %s. 935 225048 210244314.0 934.2 33.5 for binlogevent in my_stream: 936 225048 870337.0 3.9 0.1 if isinstance(binlogevent, RotateEvent): 1008 111908 327791.0 2.9 0.1 if close_batch: 1009 1 6171.0 6171.0 0.0 my_stream.close() 1010 1 10.0 10.0 0.0 return [master_data, close_batch] 1011 else: 1013 226441 301239216.0 1330.3 48.0 for row in binlogevent.rows: 1014 113302 405720.0 3.6 0.1 event_after={} 1015 113302 312578.0 2.8 0.0 event_before={} 1084
1085 100000 348828.0 3.5 0.1 if len(group_insert)>=self.replicabatch 1086 1 372.0 372.0 0.0 self.logger.info("Max rows per batch re 1087 1 227.0 227.0 0.0 self.logger.debug("Master coordinates: 1088 1 39953572.0 39953572.0 6.4 self.pg_engine.write_batch(group_inse 1089 1 8.0 8.0 0.0 size_insert=0(chameleon)[root@QH-PRD-STGE-MYSQL-MHA08 pg_chameleon]# strace -c -p 52306 strace: Process 52306 attached ^Cstrace: Process 52306 detached % time seconds usecs/call calls errors syscall
92.48 0.055257 0 1087063 ioctl 7.33 0.004382 0 10156 recvfrom 0.19 0.000112 0 442 stat 0.00 0.000000 0 1 sendto
100.00 0.059751 1097662 total
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/noplay/python-mysql-replication/issues/251, or mute the thread https://github.com/notifications/unsubscribe-auth/AAVFXRJJmH0pFLJoksIiYYKrodtQyQMHks5tT91BgaJpZM4SBwGB .
hi, just to add more informations. as I'm using psycopg2 the pypy could be a problem. looks like there is a compatibility layer but I've never tried it. https://pypi.python.org/pypi/psycopg2cffi
Thanks
deploy pypy3 , test and found there are indeed have compatibility issues
(venv_chameleon) [root@MYSQL-SQLCHECK ~]# chameleon.py init_replica --source prod --config default --debug
2018-02-13 10:50:22 MainProcess DEBUG mysql_lib.py (1147): starting init replica for source prod
2018-02-13 10:50:22 MainProcess DEBUG pg_lib.py (615): Changing the autocommit flag to True
2018-02-13 10:50:22 MainProcess DEBUG pg_lib.py (2677): Collecting schema mappings for source prod
2018-02-13 10:50:22 MainProcess INFO pg_lib.py (1534): deleting all the table references from the replica catalog for source prod
2018-02-13 10:50:23 MainProcess DEBUG mysql_lib.py (233): Creating the schema _db_nono_tmp.
Traceback (most recent call last):
File "/root/venv_chameleon/bin/chameleon.py", line 54, in <module>
getattr(replica, args.command)()
File "/root/venv_chameleon/site-packages/pg_chameleon/lib/global_lib.py", line 316, in init_replica
self.__init_mysql_replica()
File "/root/venv_chameleon/site-packages/pg_chameleon/lib/global_lib.py", line 326, in __init_mysql_replica
self.mysql_source.init_replica()
File "/root/venv_chameleon/site-packages/pg_chameleon/lib/mysql_lib.py", line 1156, in init_replica
self.create_destination_schemas()
File "/root/venv_chameleon/site-packages/pg_chameleon/lib/mysql_lib.py", line 234, in create_destination_schemas
self.pg_engine.create_database_schema(loading_schema)
File "/root/venv_chameleon/site-packages/pg_chameleon/lib/pg_lib.py", line 3294, in create_database_schema
self.pgsql_cur.execute(sql_create)
File "/root/venv_chameleon/site-packages/psycopg2cffi/_impl/cursor.py", line 30, in check_closed_
return func(self, *args, **kwargs)
File "/root/venv_chameleon/site-packages/psycopg2cffi/_impl/cursor.py", line 263, in execute
self._pq_execute(self._query, conn._async)
File "/root/venv_chameleon/site-packages/psycopg2cffi/_impl/cursor.py", line 690, in _pq_execute
pgconn, util.ascii_to_bytes(query))
File "/root/venv_chameleon/site-packages/psycopg2cffi/_impl/adapters.py", line 306, in ascii_to_bytes
assert isinstance(s, six.binary_type)
AssertionError
The last two months, we have been debugging pg_chameleon. It uses python-mysql-replication, the current problems encountered is python-mysql-replication occupy 100% CPU, network recv-Q queue backlog. We through the line-profiler Observe that the two most important statements take the most time
for binlogevent in my_stream: for row in binlogevent.rows:
Please help us, we have no other idea. Details are as follows(with cut down):
pg_chameleon issue https://github.com/the4thdoctor/pg_chameleon/issues/62