ddcw / ibd2sql

parse mysql ibd file to sql for learn or recovery data
GNU General Public License v3.0
208 stars 62 forks source link

大佬,为什么一直提示这里报错 #43

Open baobeiyatou opened 6 days ago

baobeiyatou commented 6 days ago

[2024-11-07 14:25:06] [DEBUG] DEBUG MODE ON [2024-11-07 14:25:06] [DEBUG] INIT ibd2sql [2024-11-07 14:25:06] [DEBUG] FORCE False [2024-11-07 14:25:06] [DEBUG] SET False [2024-11-07 14:25:06] [DEBUG] MULTIVALUE False [2024-11-07 14:25:06] [DEBUG] AUTO_DEBUG True [2024-11-07 14:25:06] [DEBUG] FILTER:

    0 < TRX < 281474976710656
    0 < ROLLPTR < 72057594037927936

[2024-11-07 14:25:06] [DEBUG] OPEN IBD FILE: message_session_relation_2024.ibd [2024-11-07 14:25:06] [DEBUG] ANALYZE SDI PAGE Traceback (most recent call last): File "F:\test\ibd2sql-main\main.py", line 144, in ddcw.init() File "F:\test\ibd2sql-main\ibd2sql\ibd2sql.py", line 125, in init self.PAGE_ID = sdino ^^^^^ UnboundLocalError: cannot access local variable 'sdino' where it is not associated with a value

ddcw commented 6 days ago

该报错是使用了 --mysql5 但是没使用 --sdi-table 导致的.

  1. 解析8.0的ibd文件不需要 --mysql5
  2. 解析5.7的ibd文件需要 先在8.0里面创建一个表结构一样的空表, 然后加上--sdi-table=/mysql_8/xxx.ibd 来指定元数据信息.即
    python3 main.py mysql_5.7.ibd --sdi-table /mysql_8/xxx.ibd --mysql5 --sql
baobeiyatou commented 6 days ago

该报错已使用--mysql5但未使用--sdi-table导致的。

  1. 解析8.0的ibd文件不需要--mysql5
  2. 解析5.7的ibd文件需要先在8.0里面创建一个表结构一样的空表,然后加上--sdi-table=/mysql_8/xxx.ibd来指定元数据信息。即
python3 main.py mysql_5.7.ibd --sdi-table /mysql_8/xxx.ibd --mysql5 --sql

还得再麻烦您一下,按照您给的我操作之后,提示以下错误,您方便在帮我看一下吗? [2024-11-07 15:20:30] [DEBUG] NEW TABLENAME:aabbcc.aa [2024-11-07 15:20:30] [DEBUG] DEBUG MODE ON [2024-11-07 15:20:30] [DEBUG] INIT ibd2sql [2024-11-07 15:20:30] [DEBUG] FORCE False [2024-11-07 15:20:30] [DEBUG] SET False [2024-11-07 15:20:30] [DEBUG] MULTIVALUE False [2024-11-07 15:20:30] [DEBUG] AUTO_DEBUG True [2024-11-07 15:20:30] [DEBUG] FILTER:

    0 < TRX < 281474976710656
    0 < ROLLPTR < 72057594037927936

[2024-11-07 15:20:30] [DEBUG] OPEN IBD FILE: message_session_relation_2024.ibd [2024-11-07 15:20:30] [DEBUG] THIS TABLE IS PARTITION TABLE [2024-11-07 15:20:30] [DEBUG] ANALYZE PAGE INODE (PAGE_ID=2) (for get index) [2024-11-07 15:20:30] [DEBUG] ibd2sql.read PAGE: 2 [2024-11-07 15:20:30] [DEBUG] FIRST INDEX (Non-leaf and leaf page) : (3, 6) (-1 is None) [2024-11-07 15:20:30] [DEBUG] INIT FIRST PAGE TO FIRST_NO_LEAFPAGE (3) [2024-11-07 15:20:30] [DEBUG] COUNT: 1 FIND LEAF PAGE, CURRENT PAGE ID: 3 [2024-11-07 15:20:30] [DEBUG] ibd2sql.read PAGE: 3 [2024-11-07 15:20:30] [DEBUG] INIT ROW BASE INFO [2024-11-07 15:20:30] [DEBUG] NULL BITMASK LENGTH: 3 bytes. (nullable col count:18) [2024-11-07 15:20:30] [DEBUG] ######################################## FIELD INFO START #################################### [2024-11-07 15:20:30] [DEBUG] name: ID type: varchar(64) size: 0 isvar: True is_nullable: False [2024-11-07 15:20:30] [DEBUG] name: PROC_DEFID type: varchar(64) size: 0 isvar: True is_nullable: True [2024-11-07 15:20:30] [DEBUG] name: TASK_DEFKEY type: varchar(255) size: 0 isvar: True is_nullable: True [2024-11-07 15:20:30] [DEBUG] name: PROC_INSTID type: varchar(64) size: 0 isvar: True is_nullable: True [2024-11-07 15:20:30] [DEBUG] name: EXECUTIONID type: varchar(64) size: 0 isvar: True isnullable: True [2024-11-07 15:20:30] [DEBUG] name: NAME type: varchar(255) size: 0 isvar: True is_nullable: True [2024-11-07 15:20:30] [DEBUG] name: PARENT_TASKID type: varchar(64) size: 0 isvar: True isnullable: True [2024-11-07 15:20:30] [DEBUG] name: DESCRIPTION type: varchar(4000) size: 0 isvar: True isnullable: True [2024-11-07 15:20:30] [DEBUG] name: OWNER type: varchar(255) size: 0 isvar: True isnullable: True [2024-11-07 15:20:30] [DEBUG] name: ASSIGNEE type: varchar(255) size: 0 isvar: True is_nullable: True [2024-11-07 15:20:30] [DEBUG] name: STARTTIME type: datetime(3) size: 7 isvar: False is_nullable: False [2024-11-07 15:20:30] [DEBUG] name: CLAIMTIME type: datetime(3) size: 7 isvar: False is_nullable: True [2024-11-07 15:20:30] [DEBUG] name: ENDTIME type: datetime(3) size: 7 isvar: False isnullable: True [2024-11-07 15:20:30] [DEBUG] name: DURATION type: bigint size: 8 isvar: False is_nullable: True [2024-11-07 15:20:30] [DEBUG] name: DELETEREASON type: varchar(4000) size: 0 isvar: True isnullable: True [2024-11-07 15:20:30] [DEBUG] name: PRIORITY type: int size: 4 isvar: False is_nullable: True [2024-11-07 15:20:30] [DEBUG] name: DUEDATE type: datetime(3) size: 7 isvar: False is_nullable: True [2024-11-07 15:20:30] [DEBUG] name: FORMKEY type: varchar(255) size: 0 isvar: True isnullable: True [2024-11-07 15:20:30] [DEBUG] name: CATEGORY type: varchar(255) size: 0 isvar: True is_nullable: True [2024-11-07 15:20:30] [DEBUG] name: TENANTID type: varchar(255) size: 0 isvar: True is_nullable: True [2024-11-07 15:20:30] [DEBUG] ######################################## FIELD INFO END ###################################### [2024-11-07 15:20:30] [DEBUG] CLUSTER INDEX: IDXNO: 1 IDX COLUMMN COUNT:1 INDEX ELEMENT:[1] [2024-11-07 15:20:30] [DEBUG] ROW INIT FINISH FOR < aabbcc.aa >

[2024-11-07 15:20:30] [DEBUG] CURRENT PAGE ID(find leaf page): 3 [2024-11-07 15:20:30] [DEBUG] FIND LEAF PAGE ----> OFFSET:99 RECORD TYPE:2 [2024-11-07 15:20:30] [DEBUG] FIND LEAF PAGE ----> OFFSET:127 RECORD TYPE:1 [2024-11-07 15:20:30] [DEBUG] BLOB ERROR 'utf-8' codec can't decode byte 0x81 in position 42: invalid start byte [2024-11-07 15:20:30] [DEBUG] 127 ----> 244 data:0x30366131643763383733343234616131623634306361343263656639616635320000003820000500190081376432646664326335373566343130326263303063396630386139373034653300000064200000002100ac63313235353638613831383234306263383031653035396366323635323430 bdata:b'06a1d7c873424aa1b640ca42cef9af52\x00\x00\x008 \x00\x05\x00\x19\x00\x817d2dfd2c575f4102bc00c9f08a9704e3\x00\x00\x00d \x00\x00\x00!\x00\xacc125568a818240bc801e059cf265240' [2024-11-07 15:20:30] [DEBUG] COUNT: 2 FIND LEAF PAGE, CURRENT PAGE ID: 922746880 [2024-11-07 15:20:30] [DEBUG] ibd2sql.read PAGE: 922746880 Traceback (most recent call last): File "F:\test\ibd2sql-main\main.py", line 144, in ddcw.init() File "F:\test\ibd2sql-main\ibd2sql\ibd2sql.py", line 152, in init self.init_first_leaf_page() File "F:\test\ibd2sql-main\ibd2sql\ibd2sql.py", line 166, in init_first_leaf_page aa = find_leafpage(self.read(),table=self.table, idx=self.table.cluster_index_id, debug=self.debug) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\test\ibd2sql-main\ibd2sql\innodb_page_index.py", line 694, in init super().init(*args,*kwargs) File "F:\test\ibd2sql-main\ibd2sql\innodb_page_index.py", line 97, in init super().init(args,**kwargs) File "F:\test\ibd2sql-main\ibd2sql\innodb_page.py", line 176, in init self.FIL_PAGE_SPACE_OR_CHKSUM, self.FIL_PAGE_OFFSET, self.FIL_PAGE_PREV, self.FIL_PAGE_NEXT, self.FIL_PAGE_LSN, self.FIL_PAGE_TYPE, self.FIL_PAGE_FILE_FLUSH_LSN = struct.unpack('>4LQHQ',bdata[:34])

                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

struct.error: unpack requires a buffer of 34 bytes

ddcw commented 6 days ago

看起来是BUG (解析varchar类型的时候的问题, 估计是nullable或者啥之类的数量错位了, 之前有过类似的BUG),

  1. 如果当前不是最新版的话, 可以先下载最新版测试下.
  2. 或者提供下相关脱敏之后的DDL, 我看看能不能复现
baobeiyatou commented 6 days ago

看起来是BUG (解析varchar类型的时候的问题, 估计是nullable或者啥之类的数量错位了, 之前有过类似的BUG),

  1. 如果当前不是最新版的话, 可以先下载最新版测试下.
  2. 或者提供下相关脱敏之后的DDL, 我看看能不能复现

大佬,重新执行了一下,读取出了一些数据,也丢失了很多数据,我想问一下这些.ibd中无法读取的数据还能恢复吗?这里面,aa.ibd是8.0的,另一个是5.7.44, init.zip

ddcw commented 6 days ago
  1. 我这边能解析出来22W条数据.

    python3 main.py /tmp/ibd2sql_test_db/xxxx.ibd --sdi-table /tmp/ibd2sql_test_db/aa.ibd --mysql5 --sql > /tmp/t20241107_test.sql
  2. 导入到数据库里面查询总数据量24W左右, 但mysqldump只能到22W数据出来, 然后数据库就gg了

  3. 强制解析出来有24W数据.命令参考如下,

    filename="/tmp/ibd2sql_test_db/xxxx.ibd" # 要解析的ibd文件名
    python3 main.py ${filename} --ddl # 获取表结构信息
    filesize=`stat -c %s ${filename}`
    maxpagecount=$[ ${filesize} / 16384 ]
    current_page=1
    while [ ${current_page} -le ${maxpagecount} ];do
        echo "-- ${filename} PAGE NO: ${current_page}";
        current_page=$[ ${current_page} + 1 ]
        python3 main.py ${filename} --sdi-table /tmp/ibd2sql_test_db/aa.ibd --mysql5 --sql --page-start ${current_page} --page-count 1 2>/dev/null | grep "'流程:"  >> /tmp/t20241107_recovery_1.sql ;
    done
  4. 校验也没发现坏块. (不排除hash碰撞...)

baobeiyatou commented 6 days ago
  1. 我这边能解析出来22W条数据.
python3 main.py /tmp/ibd2sql_test_db/xxxx.ibd --sdi-table /tmp/ibd2sql_test_db/aa.ibd --mysql5 --sql > /tmp/t20241107_test.sql
  1. 导入到数据库里面查询总数据量24W左右, 但mysqldump只能到22W数据出来, 然后数据库就gg了
  2. 强制解析出来有24W数据.命令参考如下,
filename="/tmp/ibd2sql_test_db/xxxx.ibd" # 要解析的ibd文件名
python3 main.py ${filename} --ddl # 获取表结构信息
filesize=`stat -c %s ${filename}`
maxpagecount=$[ ${filesize} / 16384 ]
current_page=1
while [ ${current_page} -le ${maxpagecount} ];do
        echo "-- ${filename} PAGE NO: ${current_page}";
        current_page=$[ ${current_page} + 1 ]
        python3 main.py ${filename} --sdi-table /tmp/ibd2sql_test_db/aa.ibd --mysql5 --sql --page-start ${current_page} --page-count 1 2>/dev/null | grep "'流程:"  >> /tmp/t20241107_recovery_1.sql ;
done
  1. 校验也没发现坏块. (不排除hash碰撞...)

大佬,按照您提供的第三步我创建了一个aaa.py文件,放在了ibd2sql-main文件夹中,启动就提示以下问题, [root@localhost ibd2sql-main]# python3 aaa.py File "/tmp/mysql/ibd2sql-main/aaa.py", line 2 python3 main.py ${filename} --ddl # 获取表结构信息 ^ SyntaxError: invalid syntax

aaa.py 代码 filename="/tmp/mysql/sql/act_hi_taskinst.ibd" # 要解析的ibd文件名 python3 main.py ${filename} --ddl # 获取表结构信息 filesize='stat -c %s ${filename}' maxpagecount=$[ ${filesize} / 16384 ] current_page=1 while [ ${current_page} -le ${maxpagecount} ];do echo "-- ${filename} PAGE NO: ${current_page}"; current_page=$[ ${current_page} + 1 ] python3 main.py ${filename} --sdi-table /tmp/mysql/sql/8/aa.ibd --mysql5 --sql --page-start ${current_page} --page-count 1 2>/dev/null | grep "'流程:" >> /tmp/mysql/sql/t20241107_recovery_1.sql ; done

ddcw commented 6 days ago

第三步那是shell命令. 直接修改第一行的文件路径 和 倒数第二行中的文件路径就行. 然后在shell环境执行即可. 1730975018951

baobeiyatou commented 6 days ago

第三步那是shell命令. 直接修改第一行的文件路径 和 倒数第二行中的文件路径就行. 然后在shell环境执行即可. 1730975018951

感谢大佬,太感谢了,数据正在处理中