Koed00 / django-q

A multiprocessing distributed task queue for Django
https://django-q.readthedocs.org
MIT License
1.83k stars 289 forks source link

Indexes in the table django_q_schedule #623

Open prd-hai-huynh opened 3 years ago

prd-hai-huynh commented 3 years ago

Hi,

We're getting bad performance when the below query is executed every 30s. Could you please tell me if we can have any indexes in the table django_q_schedule? (Because the query is executed inside django-q, so we have no choice but to make indexes.)

Query: https://github.com/Koed00/django-q/blob/master/django_q/cluster.py#L589-L594

Schedule.objects.select_for_update()
                .exclude(repeats=0)
                .filter(next_run__lt=timezone.now())
                .filter(
                    db.models.Q(cluster__isnull=True) | db.models.Q(cluster=Conf.PREFIX)
                )

Raw query:

SELECT 
    `django_q_schedule`.`id`,
    `django_q_schedule`.`name`,
    `django_q_schedule`.`func`,
    `django_q_schedule`.`hook`,
    `django_q_schedule`.`args`,
    `django_q_schedule`.`kwargs`,
    `django_q_schedule`.`schedule_type`,
    `django_q_schedule`.`minutes`,
    `django_q_schedule`.`repeats`,
    `django_q_schedule`.`next_run`,
    `django_q_schedule`.`task`
FROM
    `django_q_schedule`
WHERE
    (NOT (`django_q_schedule`.`repeats` = 0)
        AND `django_q_schedule`.`next_run` < '2021-10-28 16:35:26.460229')
ORDER BY `django_q_schedule`.`next_run` ASC
FOR UPDATE

The number of records of our database is increasing rapidly and the query causes bad performance to RDS:

mysql> select count(1) from django_q_schedule;
+----------+
| count(1) |
+----------+
|   341107 |
+----------+
1 row in set (0.36 sec)

Our django-q configuration:

QConfiguration = {
    "name": "name",
    "workers": NUMBER_OF_WORKERS,
    "recycle": 500,
    "timeout": 600,
    "compress": True,
    "save_limit": 250,
    "queue_limit": 500,
    "cpu_affinity": 1,
    "label": "Django Q",
    "retry": 10000,
    "redis": {
        "host": REDIS_HOST,
        "port": REDIS_PORT,
        "db": 0,
        "password": REDIS_PWD,
        "ssl": True,
    },
}

django-q version:

django-q==1.2.1

Thank you very much,

nickodell commented 3 years ago

Could you please tell me if we can have any indexes in the table django_q_schedule?

I don't believe so, other than an index on the primary key. Here's what SHOW INDEX shows on my system:

mysql> SHOW INDEX FROM django_q_schedule;
+-------------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table             | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| django_q_schedule |          0 | PRIMARY  |            1 | id          | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
+-------------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
1 row in set (0.00 sec)

The number of records of our database is increasing rapidly and the query causes bad performance to RDS:

I'm surprised that you have so many entries in the schedule table. How are you using the schedule feature?

prd-hai-huynh commented 3 years ago

I'm surprised that you have so many entries in the schedule table. How are you using the schedule feature?

@nickodell,

  1. We're working on a large number of records(let's say [item]),
  2. each item requires a schedule with a different next_run value.
  3. Some require to run ONCE, some require to run DAILY, some need to run today but some need to run in the future(maybe for the next year)
  4. The number of [item] is increasing rapidly over time, that is why we have a large number of records in django_q_schedule.