matrixorigin / matrixone

Hyperconverged cloud-edge native database
https://docs.matrixorigin.cn/en
Apache License 2.0
1.78k stars 275 forks source link

[Bug]: `mo_table_rows` returns different result with `select count(*)` #17980

Open DanielZhangQD opened 2 months ago

DanielZhangQD commented 2 months ago

Is there an existing issue for the same bug?

Branch Name

1.2-dev

Commit ID

v1.2.2-b44e7327a-2024-07-15

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

mysql> select count(*) from real_time_position2;
+-----------+
| count(*)  |
+-----------+
| 104063781 |
+-----------+
1 row in set (0.05 sec)

mysql> select count(*) from real_time_position3;
+-----------+
| count(*)  |
+-----------+
| 817609625 |
+-----------+
mysql> SELECT relname AS `name`, relkind AS `kind`, mo_table_rows(reldatabase, relname) AS `rows`, mo_table_size(reldatabase, relname) AS `size`, if (role_name IS NULL, '-', role_name) AS `owner`, mo_tables.created_time AS created FROM mo_catalog.mo_tables LEFT JOIN mo_catalog.mo_role ON mo_catalog.mo_tables.owner=role_id WHERE relkind IN ('r','e','cluster') AND reldatabase = 'cloud_device' Order by name;
+-----------------------------+------+------------+-------------+--------------+---------------------+
| name                        | kind | rows       | size        | owner        | created             |
+-----------------------------+------+------------+-------------+--------------+---------------------+
| real_time_position          | r    |     892357 |    43891657 | accountadmin | 2024-08-08 09:56:29 |
| real_time_position2         | r    |          0 |   851828828 | accountadmin | 2024-03-07 11:35:18 |
| real_time_position3         | r    |          0 |  6573783299 | accountadmin | 2024-01-24 00:11:11 |
| real_time_position_20240808 | r    | 1870587359 | 81454477562 | accountadmin | 2024-03-13 06:12:30 |

Expected Behavior

mo_table_rows returns the same result with select count(*)

Steps to Reproduce

Login to the IOT demo instance in prod and execute the above sql

Additional information

No response

gouhongshen commented 2 months ago

select reldatabase_id, rel_id, relname from mo_catalog.mo_tables where reldatabase="cloud_device" and relname like "%real_time%";
+----------------+---------+-----------------------------+
| reldatabase_id | rel_id  | relname                     |
+----------------+---------+-----------------------------+
|         412255 | 1105947 | real_time_position          |
|         412255 |  449608 | real_time_position2         |
|         412255 |  449610 | real_time_position_20240808 |
|         412255 |  412296 | real_time_position3         |
+----------------+---------+-----------------------------+
4 rows in set (0.08 sec)

2024-08-14 16:58:50.270 | {"level":"ERROR","time":"2024/08/14 08:58:50.270400 +0000","caller":"disttae/stats.go:479","msg":"cannot get table by ID {412255 412296}"}   

2024-08-14 16:58:41.248 | {"level":"ERROR","time":"2024/08/14 08:58:41.248247 +0000","caller":"disttae/stats.go:479","msg":"cannot get table by ID {412255 412296}"} 
2024-08-14 16:58:26.233 | {"level":"ERROR","time":"2024/08/14 08:58:26.233910 +0000","caller":"disttae/stats.go:479","msg":"cannot get table by ID {412255 449608}"}   
2024-08-14 16:58:26.233 | {"level":"ERROR","time":"2024/08/14 08:58:26.233910 +0000","caller":"disttae/stats.go:479","msg":"cannot get table by ID {412255 412296}"}

存在这样的错误,使得无法获取 stats,rows 就为 0 了。

其实 real_time_position2,real_time_position3 这两张表的 mo_table_sizemo_table_rows 都遇到了这个问题,为啥 size 不为零呢?因为 table size = origin table size + index table size,显示出的 size 都是 index 的 size,其本表的 size 返回的依然是 0. mo_table_rows 没有这个修改,只统计本表的 rows,所以就是 0.

gouhongshen commented 2 months ago

问题找到了,alter table rename 升级处理有问题。导致在 cc.tables.data 中找不到rename 过的 table.

也许不是升级的问题,也可能是早期存在的一个 bug 被 fix 了或者 ignore 了,任何早于这个时间段的 rename 都会出问题。

step to repro:

0. switch to v1.1.1, create table, insert rows, alter table rename. checkpoint.
1. switch to v1.2.0, mo_table_rows ==> 0

也许在不同的中间版本中也需要 insert 一些数据和落盘。 
gouhongshen commented 2 months ago

在做 tombstone 相关