django-commons / django-tasks-scheduler

Schedule async tasks using redis protocol. Redis/ValKey/Dragonfly or any broker using the redis protocol can be used.
https://django-tasks-scheduler.readthedocs.io/
MIT License

Duplicate scheduled jobs #166

Closed by 1vank1n 2 months ago

1vank1n commented 2 months ago

Describe the bug

I am experiencing an issue where tasks scheduled with django-tasks-scheduler are duplicated indefinitely every few seconds, even though they are scheduled to run only once a day at a specific time (e.g., 23:59). The scheduled execution time is the same for all duplicates.

It appears that the method is_scheduled in BaseTask is returning False when it should return True, causing the scheduler to believe the task is not scheduled and to reschedule it repeatedly.

To Reproduce

Steps to reproduce the behavior:

  1. Create a RepeatableTask scheduled to run once a day at a specific time (e.g., 23:59).
  2. Start the scheduler and worker processes.
  3. Observe the logs and Redis registries.
  4. Notice that the task is being duplicated every few seconds with the same scheduled execution time.

Expected behavior

I expect the task to be scheduled once and to wait until its scheduled execution time before running. The task should not be duplicated before its execution time.

Screenshots: 2024-09-20_09-09-59, 2024-09-20_09-06-17

Desktop:

OS: Ubuntu 22.04
Python version: 3.12.6
Django version: 5.1
Redis: 6.0.16

packages

[tool.poetry.dependencies]
beautifulsoup4 = "^4.12.2"
channels = "4.1.0"
channels-redis = "^4.1.0"
daphne = "^4.1.2"
dataclasses-json = "~0.6.7"
django = "5.1"
django-admin-sortable2 = "2.2.2"
django-auditlog = "^3.0.0"
django-bootstrap5 = "^23.3"
django-braces = "~1.14.0"
django-ckeditor = "~6.7.0"
django-cleanup = "8.1.0"
django-colorfield = "^0.11.0"
django-colorful = "~1.3.0"
django-compressor = "4.5.1"
django-cors-headers = "^4.4.0"
django-debug-toolbar = "4.4.6"
django-extensions = "~3.2.3"
django-extra-views = "0.14.0"
django-filter = "~24.3"
django-import-export = "^4.1.1"
django-ipware = "~7.0.1"
django-json-widget = "~2.0.1"
django-loginas = "~0.3.9"
django-model-utils = "~4.5.1"
django-object-actions = "~4.2.0"
django-querycount = "^0.8.3"
django-redis = "~5.4.0"
django-select2 = "~8.2.0"
django-split-settings = "^1.3.2"
django-spurl = "~0.6.8"
django-structlog = "^8.1.0"
django-tasks-scheduler = "^2.1.0"
django-timezone-field = "~7.0"
django-treebeard = "4.7.1"
djangorestframework = "^3.15.2"
docutils = "^0.21.2"
easy-thumbnails = "^2.9"
faker = "^19.10.0"
ipython = "^8.14.0"
json2html = "~1.3.0"
mimesis = "~4.1.0"
munch = "~2.5.0"
netaddr = "~0.8.0"
openpyxl = "^3.1.2"
pandas = "~2.1.0"
pdfkit = "~0.6.1"
phonenumbers = "~8.12.41"
pillow = "10.0.0"
psycopg2-binary = "~2.9.4"
pydantic = "^2.1.1"
pytest-cov = "^4.1.0"
pytest-mock = "^3.12.0"
pytest-randomly = "^3.15.0"
python = "^3.12"
pyyaml = "~6.0.0"
requests = "~2.28.2"
requests-mock = "1.11.0"
sentry-sdk = "2.13.0"
singlemodeladmin = "~0.9.0"
xlsxwriter = "~1.2.9"

Additional context

I have tried several troubleshooting steps, including:

  1. Ensuring that the system time and Django TIME_ZONE settings are consistent (Europe/Moscow), and that USE_TZ = True.
  2. Adding detailed logging to the is_scheduled method in BaseTask. The logs indicate that job_id is not found in any registry, even though the task appears to be scheduled in Redis.

Here are some logs illustrating the issue:

job_id is None, task is not scheduled.
Checking if task 'My Daily Task' is scheduled.
job_id: default:My Daily Task:5319a21038
Scheduled jobs: ['default:My Daily Task:08b498f572', 'default:My Daily Task:c196ea97a4']
Enqueued jobs: []
Active jobs: []
Is job_id in any registry: False
Job_id not found in any registry, setting job_id to None.
Rescheduling RepeatableTask[My Daily Task=my_app.tasks.my_task()]

It seems that each time the task is rescheduled, a new job_id is generated, and the scheduler cannot find the previous job_id in the registries, causing it to believe the task is not scheduled.
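
The check these logs describe can be sketched in plain Python. This is only an illustration of the logged behavior, not the library's actual code: TaskRecord and the is_scheduled function below are hypothetical stand-ins, and the three registries are modeled as simple lists.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class TaskRecord:
    """Hypothetical stand-in for the database row that stores a task's job_id."""
    name: str
    job_id: Optional[str] = None


def is_scheduled(task: TaskRecord,
                 scheduled: Sequence[str],
                 enqueued: Sequence[str],
                 active: Sequence[str]) -> bool:
    """Return True only if task.job_id is present in one of the registries."""
    if task.job_id is None:
        # Corresponds to "job_id is None, task is not scheduled."
        return False
    found = any(task.job_id in registry for registry in (scheduled, enqueued, active))
    if not found:
        # Corresponds to "Job_id not found in any registry, setting job_id to None."
        task.job_id = None
    return found


# Values copied from the log excerpt above.
task = TaskRecord("My Daily Task", job_id="default:My Daily Task:5319a21038")
scheduled_jobs = [
    "default:My Daily Task:08b498f572",
    "default:My Daily Task:c196ea97a4",
]
result = is_scheduled(task, scheduled_jobs, [], [])
print(result, task.job_id)  # False None
```

Because the stored job_id matches none of the jobs that actually exist, the check fails and the job_id is cleared, so a caller that reschedules whenever the check returns False would add another job on every pass.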

However, when I open a shell and manually call repeatable_task.schedule(), it works correctly.

It would be helpful if you could advise on whether this is a known issue or if there is a recommended fix.

Thank you for your assistance.

1vank1n commented 2 months ago

I would greatly appreciate a solution, or at least a pointer toward how to solve the problem. Before opening this issue, I spent 8 hours trying to find a solution on my own, to no avail.

1vank1n commented 2 months ago

Resolved. Sorry to bother.

cunla commented 2 months ago

Do you mind sharing how you resolved it? I would like to prevent others from experiencing it.

1vank1n commented 2 months ago

> Do you mind sharing how you resolved it? I would like to prevent others from experiencing it.

Of course! Another worker on a separate remote instance (a forgotten dev server) was connected to the database. As a result, the job_id saved on the Task did not match the job_id stored in Redis, so is_scheduled always returned False and a new job was created on every tick.
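
A toy simulation of that failure mode, for anyone who lands here with the same symptom. It is a sketch under stated assumptions, not the library's code: SharedTaskRow and Scheduler are hypothetical names, and it assumes each worker checks a job registry that the other worker cannot see (the thread does not say exactly how the two setups differed).

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class SharedTaskRow:
    """Hypothetical task row in the database that both workers read and write."""
    job_id: Optional[str] = None


@dataclass
class Scheduler:
    """Hypothetical worker with its own job registry (assumed not shared)."""
    name: str
    row: SharedTaskRow
    registry: Set[str] = field(default_factory=set)

    def tick(self) -> bool:
        """Reschedule if the row's job_id is unknown to this worker's registry."""
        if self.row.job_id in self.registry:
            return False  # already scheduled, nothing to do
        job_id = f"{self.name}:job-{len(self.registry) + 1}"
        self.registry.add(job_id)
        self.row.job_id = job_id  # overwrites whatever the other worker saved
        return True


# One worker alone: the job is created once and then found on every later tick.
solo = Scheduler("solo", SharedTaskRow())
for _ in range(3):
    solo.tick()
print(len(solo.registry))  # 1

# Two workers sharing the row: each overwrites the other's job_id,
# so both create a new job on every tick.
row = SharedTaskRow()
prod, dev = Scheduler("prod", row), Scheduler("dev", row)
for _ in range(3):
    prod.tick()
    dev.tick()
print(len(prod.registry), len(dev.registry))  # 3 3
```

If duplicates appear every tick, it is worth checking whether any other scheduler or worker process is pointed at the same database.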

cunla commented 2 months ago

I see. I will add some logs to make sure it is visible.