julien-duponchelle / python-mysql-replication

Pure Python Implementation of MySQL replication protocol build on top of PyMYSQL
2.33k stars 680 forks source link

AssertionError: Result length not requested length #474

Closed noman55 closed 1 year ago

noman55 commented 1 year ago

I am using 'mysql-replication==0.25' pymysql version 'pymysql==0.9.3', MySQL RDS version = 5.7 Still getting the error AssertionError: Result length not requested length: Expected=2958. Actual=30. Position: 64250. Data Length: 64280

Traceback (most recent call last):

File "/usr/local/lib/python3.7/site-packages/tap_mysql/lib/streamer.py", line 173, in _extract_records

for row in binlog_event.rows:

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/row_event.py", line 443, in rows

self._fetch_rows()

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/row_event.py", line 438, in _fetch_rows

self.__rows.append(self._fetch_one_row())

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/row_event.py", line 491, in _fetch_one_row

row["values"] = self._read_column_data(self.columns_present_bitmap)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/row_event.py", line 191, in _read_column_data

values[name] = self.packet.read_binary_json(column.length_size)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 350, in read_binary_json

return self.read_binary_json_type(t, length)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 355, in read_binary_json_type

return self.read_binary_json_object(length - 1, large)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 438, in read_binary_json_object

data = self.read_binary_json_type(t, length)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 355, in read_binary_json_type

return self.read_binary_json_object(length - 1, large)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 438, in read_binary_json_object

data = self.read_binary_json_type(t, length)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 355, in read_binary_json_type

return self.read_binary_json_object(length - 1, large)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 438, in read_binary_json_object

data = self.read_binary_json_type(t, length)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 355, in read_binary_json_type

return self.read_binary_json_object(length - 1, large)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 438, in read_binary_json_object

data = self.read_binary_json_type(t, length)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 355, in read_binary_json_type

return self.read_binary_json_object(length - 1, large)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 438, in read_binary_json_object

data = self.read_binary_json_type(t, length)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 357, in read_binary_json_type

return self.read_binary_json_array(length - 1, large)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 463, in read_binary_json_array

return [_read(x) for x in values_type_offset_inline]

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 463, in <listcomp>

return [_read(x) for x in values_type_offset_inline]

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 461, in _read

return self.read_binary_json_type(x[0], length)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 355, in read_binary_json_type

return self.read_binary_json_object(length - 1, large)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 438, in read_binary_json_object

data = self.read_binary_json_type(t, length)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 357, in read_binary_json_type

return self.read_binary_json_array(length - 1, large)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 463, in read_binary_json_array

return [_read(x) for x in values_type_offset_inline]

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 463, in <listcomp>

return [_read(x) for x in values_type_offset_inline]

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 461, in _read

return self.read_binary_json_type(x[0], length)

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 355, in read_binary_json_type

return self.read_binary_json_object(length - 1, large)=

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 430, in read_binary_json_object

Copy
keys = [self.read(x[1]) for x in key_offset_lengths]

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 430, in <listcomp>

keys = [self.read(x[1]) for x in key_offset_lengths]

File "/usr/local/lib/python3.7/site-packages/pymysqlreplication/packet.py", line 150, in read

return data + self.packet.read(size - len(data))

File "/usr/local/lib/python3.7/site-packages/pymysql/protocol.py", line 73, in read

raise AssertionError(error)

AssertionError: Result length not requested length:

Expected=2958. Actual=30. Position: 64250. Data Length: 64280
sean-k1 commented 1 year ago

@noman55 hello let me know Binary Log Formats value and Please let me know how I can reproduce this issue.

abdullahmafuz2020 commented 1 year ago

@sean-k1 you can reproduce a similar issue if you run the below steps. DB setting

  1. MySQL V 8.0.31
  2. binlog_format = ROW
  3. mysql-replication==0.43
CREATE TABLE setting_table
(
    id SERIAL AUTO_INCREMENT,
    setting JSON
)

INSERT INTO setting_table (setting) VALUES ('{"btn": true, "model": false}');

UPDATE setting_table
        SET setting = JSON_REMOVE(setting, '$.model')
       WHERE id=1;
Traceback (most recent call last):
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/app.py", line 129, in <module>
    main()
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/app.py", line 98, in main
    for row in binlogevent.rows:
  File "/opt/miniconda3/lib/python3.10/site-packages/pymysqlreplication/row_event.py", line 480, in rows
    self._fetch_rows()
  File "/opt/miniconda3/lib/python3.10/site-packages/pymysqlreplication/row_event.py", line 475, in _fetch_rows
    self.__rows.append(self._fetch_one_row())
  File "/opt/miniconda3/lib/python3.10/site-packages/pymysqlreplication/row_event.py", line 564, in _fetch_one_row
    row["before_values"] = self._read_column_data(self.columns_present_bitmap)
  File "/opt/miniconda3/lib/python3.10/site-packages/pymysqlreplication/row_event.py", line 115, in _read_column_data
    values[name] = self.__read_values_name(column, null_bitmap, null_bitmap_index,
  File "/opt/miniconda3/lib/python3.10/site-packages/pymysqlreplication/row_event.py", line 231, in __read_values_name
    return self.packet.read_binary_json(column.length_size)
  File "/opt/miniconda3/lib/python3.10/site-packages/pymysqlreplication/packet.py", line 368, in read_binary_json
    payload = self.read(length)
  File "/opt/miniconda3/lib/python3.10/site-packages/pymysqlreplication/packet.py", line 170, in read
    return self.packet.read(size)
  File "/opt/miniconda3/lib/python3.10/site-packages/pymysql/protocol.py", line 74, in read
    raise AssertionError(error)
AssertionError: Result length not requested length:
Expected=1567384932.  Actual=3.  Position: 114.  Data Length: 117
sean-k1 commented 1 year ago

@abdullahmafuz2020 @noman55 I'm pretty not sure This Pr resolve this Issue Can you check this Issue in this PR?

I need your Many test cases. IF error goes on, Please let me know how to reproduce this issue.

=== QueryEvent ===
Date: 2023-09-20T07:47:28
Log position: 978
Event size: 142
Read bytes: 142
Schema: b'test'
Execution time: 0
Query: CREATE TABLE setting_table
(
    id SERIAL AUTO_INCREMENT,
    setting JSON
)

=== QueryEvent ===
Date: 2023-09-20T07:47:28
Log position: 1132
Event size: 52
Read bytes: 52
Schema: b'test'
Execution time: 0
Query: BEGIN

=== TableMapEvent ===
Date: 2023-09-20T07:47:28
Log position: 1212
Event size: 57
Read bytes: 57
Table id: 296
Schema: test
Table: setting_table
Columns: 2
=== OptionalMetaData ===
unsigned_column_list: [True, False]
default_charset_collation: None
charset_collation: {}
column_charset: []
column_name_list: ['id', 'setting']
set_str_value_list : []
set_enum_str_value_list : []
geometry_type_list : []
simple_primary_key_list: [0]
primary_keys_with_prefix: {}
visibility_list: [True, True]
charset_collation_list: []
enum_and_set_collation_list: []

=== WriteRowsEvent ===
Date: 2023-09-20T07:47:28
Log position: 1287
Event size: 52
Read bytes: 12
Table: test.setting_table
Affected columns: 2
Changed rows: 1
Column Name Information Flag: True
Values:
--
* id : 1
* setting : {b'btn': True, b'model': False}

=== XidEvent ===
Date: 2023-09-20T07:47:28
Log position: 1318
Event size: 8
Read bytes: 8
Transaction ID: 18004

=== QueryEvent ===
Date: 2023-09-20T07:54:27
Log position: 1481
Event size: 61
Read bytes: 61
Schema: b'test'
Execution time: 0
Query: BEGIN

=== TableMapEvent ===
Date: 2023-09-20T07:54:27
Log position: 1561
Event size: 57
Read bytes: 57
Table id: 296
Schema: test
Table: setting_table
Columns: 2
=== OptionalMetaData ===
unsigned_column_list: [True, False]
default_charset_collation: None
charset_collation: {}
column_charset: []
column_name_list: ['id', 'setting']
set_str_value_list : []
set_enum_str_value_list : []
geometry_type_list : []
simple_primary_key_list: [0]
primary_keys_with_prefix: {}
visibility_list: [True, True]
charset_collation_list: []
enum_and_set_collation_list: []

=== UpdateRowsEvent ===
Date: 2023-09-20T07:54:27
Log position: 1677
Event size: 93
Read bytes: 13
Table: test.setting_table
Affected columns: 2
Changed rows: 1
Column Name Information Flag: True
Values:
--
*id:1=>1
*setting:{b'btn': True, b'model': False}=>{b'btn': True}

=== XidEvent ===
Date: 2023-09-20T07:54:27
Log position: 1708
Event size: 8
Read bytes: 8
Transaction ID: 18192
abdullahmafuz2020 commented 1 year ago

@sean-k1 after enabling "BINLOG_ROW_METADATA = FULL" I have tested bug fix json Parser PR locally in MYSQL DB, and it's working fine. tried a few different test cases, and all seem working !!!🎉

couldn't test on the production database where the actual issue was raised because it required "BINLOG_ROW_METADATA = FULL" to be enabled. To enable metadata setting in prod, it will take some days and to test on a production database.

will test in production and get back to you in a week.

Anyway Thanks so much for your effort to fix this bug

sean-k1 commented 1 year ago

@abdullahmafuz2020 Thanks!

abdullahmafuz2020 commented 1 year ago

hey @sean-k1 I deployed a bug fix JSON parser code from your PR in production job. it worked for 10 hours then the below error popped up it happened in nested JSON {"information": {"purpose": null, "Number": null, "Status": null}}

Traceback (most recent call last):
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/app.py", line 137, in <module>
    main()
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/app.py", line 103, in main
    for row in binlogevent.rows:
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/row_event.py", line 492, in rows
    self._fetch_rows()
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/row_event.py", line 487, in _fetch_rows
    self.__rows.append(self._fetch_one_row())
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/row_event.py", line 578, in _fetch_one_row
    row["before_values"] = self._read_column_data(self.columns_present_bitmap)
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/row_event.py", line 121, in _read_column_data
    values[name] = self.__read_values_name(
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/row_event.py", line 230, in __read_values_name
    return self.packet.read_binary_json(column.length_size)
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/packet.py", line 382, in read_binary_json
    value = self.read_binary_json_type(t, length)
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/packet.py", line 390, in read_binary_json_type
    return self.read_binary_json_object(length - 1, large)
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/packet.py", line 487, in read_binary_json_object
    data = self.read_binary_json_type(t, length)
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/packet.py", line 390, in read_binary_json_type
    return self.read_binary_json_object(length - 1, large)
  File "/Users/abdullahmohammed/Abdullah/local/sub-folder/util/lazarus_debug_script_mysql_short/pythonmysqlreplication/pymysqlreplication/packet.py", line 450, in read_binary_json_object
    raise ValueError("Json length is larger than packet length")
ValueError: Json length is larger than packet length
sean-k1 commented 1 year ago

@abdullahmafuz2020 Did you run pull to this update branch ? Does it error When update? IF you stuck in partial_update, we need to develop partial_update_row_event first..

=== WriteRowsEvent ===
Date: 2023-09-23T07:34:19
Log position: 1593
Event size: 88
Read bytes: 12
Table: pymysqlreplication_test.json_test
Affected columns: 2
Changed rows: 1
Column Name Information Flag: True
Values:
--
* id : 2
* json_data : {b'information': {b'Number': None, b'Status': None, b'purpose': None}}
abdullahmafuz2020 commented 1 year ago

hey @sean-k1 no, I haven't taken the recent update code from your branch, I am running 3 day ago old code meanwhile, take your recent code and run it there are some issues, I am facing ("{'schema': 'main_table', 'table': 'job_batches', 'binlog_timestamp': " "1695416511, 'fields': {None: None}, 'type': 'update'}")'

not working as expected, missing fields but if I run using a 3-day ago code, fields get proper data. this is how I am initializing BinLogStreamReader locally

    stream = BinLogStreamReader(server_id=5555, 
                                    blocking=True,
                                    connection_settings=MYSQL_SETTINGS,
                                    only_schemas=only_schemas,
                                    # only_tables=sync_tables,
                                    resume_stream=True,
                                    log_file='mysql-bin-changelog.026185',
                                    log_pos=18004628,
                                    only_events=[WriteRowsEvent,
                                                UpdateRowsEvent,
                                                DeleteRowsEvent]
                                    )

please let me know if I am missing some setting/params

sean-k1 commented 1 year ago

@abdullahmafuz2020 I think Setting is not Problem. because 3days code goes properly. Can you let me know Your simple Testcase?

abdullahmafuz2020 commented 1 year ago

@sean-k1 I didn't run any specific test case, I connected the prod replica database directly to the test branch code keeping the log_file and log_pos hard coding locally

sean-k1 commented 1 year ago

@abdullahmafuz2020

Let me summarize my understanding Based on the PR code I wrote 3 days ago, when you run BinlogFile ("{'schema': 'main_table', 'table': 'job_batches', 'binlog_timestamp': " "1695416511, 'fields': {None: None}, 'type': 'update'}")' Here, the value of fields is not None, but it is filled with a value, and when you ran the latest PR code, it came out as None, right?

I don't think it's a problem with the BinlogStreamReader settings, it works fine with the same settings as yours, so I think the code I wrote is in error.

So ("{'schema': 'main_table', 'table': 'job_batches', 'binlog_timestamp': " "1695416511, 'fields': {None: None}, 'type': 'update'}")' I was asking about how to reproduce what might be causing the error in your Prod environment BinlogEvent. Correct me if I understand something wrong!

abdullahmafuz2020 commented 1 year ago

@sean-k1 yes , you understood right I was working on reproducing locally. with recent PR changes, it's happening with any event, update, insert

using setting_table refer above on previous DDL

INSERT INTO db.setting_table (setting) VALUES ('{"btn": true, "model": false}');

new code result

{'schema': 'db', 'table': 'setting_table', 'binlog_timestamp': 1695472221, "
 "'fields': {None: {b'btn': True, b'model': False}}, 'type': 'insert'}
{'schema': 'db', 'table': 'setting_table', 'binlog_timestamp': 1695472345, "
 "'fields': {None: {b'btn': False, b'model': False}}, 'type': 'update'}
{'schema': 'db', 'table': 'setting_table', 'binlog_timestamp': 1695472398, "
 "'fields': {None: {b'btn': True, b'model': False}}, 'type': 'delete'}

3 day ago code result

{'schema': 'db', 'table': 'setting_table', 'binlog_timestamp': 1695472221, "
 "'fields': {'id': 24, 'setting': {b'btn': True, b'model': False}}, 'type': "
 "'insert'}
{'schema': 'db', 'table': 'setting_table', 'binlog_timestamp': 1695472345, "
 "'fields': {'id': 18, 'setting': {b'btn': False, b'model': False}}, 'type': "
 "'update'}
{'schema': 'db', 'table': 'setting_table', 'binlog_timestamp': 1695472398, "
 "'fields': {'id': 19, 'setting': {b'btn': True, b'model': False}}, 'type': "
 "'delete'}
abdullahmafuz2020 commented 1 year ago

@sean-k1 I updated the code output result, I misplaced the result as was running 2 codes in 2 different terminals.😕 please have a look at result again

sean-k1 commented 1 year ago

@abdullahmafuz2020 I think another pr merged main branch .. this one is problem. Can you wait Just moment?

sean-k1 commented 1 year ago

@abdullahmafuz2020 so sorry can you run latest pr branch I'm missing something while merging a conflicted branch

abdullahmafuz2020 commented 1 year ago

seems working fine now 🚀 and Json length is larger than packet length issue is not coming so far will deploy this branch in prod, and will update here if any issue pops up thanks @sean-k1

noman55 commented 1 year ago

Hi @abdullahmafuz2020 @sean-k1 Let me also deploy and test it. Sorry was occupied with something else couldn't get back earlier. Just to be sure I need to deploy from this PR branch right? https://github.com/julien-duponchelle/python-mysql-replication/pull/481/files

abdullahmafuz2020 commented 1 year ago

yes @noman55

sean-k1 commented 1 year ago

@noman55 yes but some partial_update is not supported and time,timestamp is still develop now

Remind this one

sean-k1 commented 1 year ago

@noman55 oh your mysqlserver is 5.7 I think Column name is None..

sean-k1 commented 1 year ago

@noman55 @abdullahmafuz2020 I develop time, timestamp but i said If you update json property UpdateEvent Cannot find.

We would love to develop a PARTIAL_UPDATE_ROW_EVENT right away, but it will take some time to implement it. Hope you understand.

If you don't get it from the changed data in the json update, but just know that a partial update occurred, I'll add it to the PR.

abdullahmafuz2020 commented 1 year ago

@sean-k1 thanks for letting us know.

sean-k1 commented 1 year ago

@abdullahmafuz2020 we merged this pr on main branch! check this out I fixed when long string cases too. I hope this PR helps

sean-k1 commented 1 year ago

@noman55 we release 0.45.0 version for Mysql 5.7 Users I think this version resolved this Issue!

noman55 commented 3 months ago

@sean-k1 I am getting the same error for MySQL8.7 version

  'mysql-replication==0.25',
  'pymysql==0.9.3'

Can you guide what version I should use to resolve this issue? Thanks a lot

noman55 commented 3 months ago

@sean-k1 can you look into this pls?

sean-k1 commented 3 months ago

@noman55 You have two choices First, if you're using mysql 8 and you're still using 0.25, I don't think you need DDL tracing. In that case, you can use version 0.45.1.

Second, if you are using mysql 8 and you need DDL tracing, you can use the set binlog_row_metadata='FULL' , binlog_row_image='FULL' and then set 1.0.9 should work.

If you don't want to set these global variables, you can proceed with the first case