inspirehep / invenio

Invenio digital library software, INSPIRE OPS version
http://invenio-software.org/
GNU General Public License v2.0

BibFormat: tune catchup options #562

tsgit closed this 3 years ago

tsgit commented 3 years ago

Reduce the frequency of the expensive catchup and only cover the past 120 days (which is plenty). This avoids a lengthy full table scan in MySQL.

A 4-month window uses a range scan on the modification_date index:

MySQL [inspirelegacy]> explain select br.id from bibrec as br inner join bibfmt as bf on bf.id_bibrec = br.id where br.modification_date > "2020-08-05 0
+----+-------------+-------+------------+--------+-----------------------------+-------------------+---------+---------------------------+--------+-----
| id | select_type | table | partitions | type   | possible_keys               | key               | key_len | ref                       | rows   | filt
+----+-------------+-------+------------+--------+-----------------------------+-------------------+---------+---------------------------+--------+-----
|  1 | SIMPLE      | br    | NULL       | range  | PRIMARY,modification_date   | modification_date | 5       | NULL                      | 124895 |   10
|  1 | SIMPLE      | bf    | NULL       | eq_ref | PRIMARY,format,last_updated | PRIMARY           | 36      | inspirelegacy.br.id,const |      1 |    3
+----+-------------+-------+------------+--------+-----------------------------+-------------------+---------+---------------------------+--------+-----
2 rows in set, 1 warning (0.00 sec)

A 1-year window uses type ALL, a full table scan over about 1.8 million rows:

MySQL [inspirelegacy]> explain select br.id from bibrec as br inner join bibfmt as bf on bf.id_bibrec = br.id where br.modification_date > "2020-02-05 0
+----+-------------+-------+------------+--------+-----------------------------+---------+---------+---------------------------+---------+----------+---
| id | select_type | table | partitions | type   | possible_keys               | key     | key_len | ref                       | rows    | filtered | Ex
+----+-------------+-------+------------+--------+-----------------------------+---------+---------+---------------------------+---------+----------+---
|  1 | SIMPLE      | br    | NULL       | ALL    | PRIMARY,modification_date   | NULL    | NULL    | NULL                      | 1839427 |    18.38 | Us
|  1 | SIMPLE      | bf    | NULL       | eq_ref | PRIMARY,format,last_updated | PRIMARY | 36      | inspirelegacy.br.id,const |       1 |    33.33 | Us
+----+-------------+-------+------------+--------+-----------------------------+---------+---------+---------------------------+---------+----------+---
2 rows in set, 1 warning (0.01 sec)
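
For illustration, the 120-day cutoff could be computed roughly like this. This is a minimal sketch and not the actual BibFormat code: `catchup_cutoff` and `CATCHUP_QUERY` are hypothetical names, the reference date is an assumption chosen only because it yields the 2020-08-05 date seen in the first query, and the midnight time part is likewise assumed. The query text covers only the part visible in the EXPLAIN output above; the truncated remainder is not reproduced.

```python
from datetime import datetime, timedelta


def catchup_cutoff(now, days=120):
    """Return the lower bound for bibrec.modification_date as a MySQL datetime string.

    Hypothetical helper, not part of Invenio.
    """
    return (now - timedelta(days=days)).strftime("%Y-%m-%d %H:%M:%S")


# Only the visible part of the query from the EXPLAIN output; the original
# WHERE clause is cut off there, so nothing beyond the date filter is shown.
CATCHUP_QUERY = (
    "SELECT br.id FROM bibrec AS br "
    "INNER JOIN bibfmt AS bf ON bf.id_bibrec = br.id "
    "WHERE br.modification_date > %s"
)

# Assumed reference date, picked to match the cutoff in the first query.
cutoff = catchup_cutoff(datetime(2020, 12, 3), days=120)
print(cutoff)  # 2020-08-05 00:00:00
```

Passing the cutoff as a query parameter rather than formatting it into the SQL string keeps the window length a single tunable value.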

Signed-off-by: Thorsten Schwander <thorsten.schwander@gmail.com>