kagxin / blog

个人博客:技术、随笔、生活
https://github.com/kagxin/blog/issues
7 stars 0 forks source link

Mysql 大数据量分页讨论 #56

Open kagxin opened 4 years ago

kagxin commented 4 years ago

讨论的前提

表结构

分页sql

select * from history where is_delete=0 order by id limit 5000000, 20;

explain

+----+-------------+---------+------------+------+-----------------+-----------------+---------+-------+---------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys   | key             | key_len | ref   | rows    | filtered | Extra       |
+----+-------------+---------+------------+------+-----------------+-----------------+---------+-------+---------+----------+-------------+
|  1 | SIMPLE      | history | NULL       | ref  | is_delete_index | is_delete_index | 1       | const | 4878971 |   100.00 | Using index |
+----+-------------+---------+------------+------+-----------------+-----------------+---------+-------+---------+----------+-------------+

为什么会慢

因为随着offset(偏移量)的增加,mysql无法跳过offset的行,需要花大量时间来扫描那些需要丢弃的数据。

延迟关联

《高性能MySQL》中提了一个方法这样改写这条查询sql

select * from history h 
    inner join (select id from history where is_delete=0 order by id limit 5000000, 20) h2 
    on h.id=h2.id;

explain

+----+-------------+------------+------------+--------+-----------------+-----------------+---------+-------+---------+----------+-------------+
| id | select_type | table      | partitions | type   | possible_keys   | key             | key_len | ref   | rows    | filtered | Extra       |
+----+-------------+------------+------------+--------+-----------------+-----------------+---------+-------+---------+----------+-------------+
|  1 | PRIMARY     | <derived2> | NULL       | ALL    | NULL            | NULL            | NULL    | NULL  | 4878971 |   100.00 | NULL        |
|  1 | PRIMARY     | h          | NULL       | eq_ref | PRIMARY         | PRIMARY         | 4       | h2.id |       1 |   100.00 | NULL        |
|  2 | DERIVED     | history    | NULL       | ref    | is_delete_index | is_delete_index | 1       | const | 4878971 |   100.00 | Using index |
+----+-------------+------------+------------+--------+-----------------+-----------------+---------+-------+---------+----------+-------------+

通过覆盖索引查询查出需要的主键,再根据主键关联原表获取需要的行,减少了扫描的行和数据量。

业务只允许查询offset较小的数据

后台查询历史记录,可以等不在意时长

愿意等

其他

ref: 《高性能Mysql 第三版》5.4.3节 table:derived2 解释