celery / django-celery-results

Celery result back end with django

Unable to Catch Signals for TaskResult Updates with RESULT_BACKEND=django-db #433

Open urzbs opened 3 weeks ago

urzbs commented 3 weeks ago

Problem Description:

When using RESULT_BACKEND = django-db on remote Celery workers, Django fails to catch signals on the TaskResult object when workers update their results. This prevents the expected behavior of triggering actions upon task result updates.

Details:

The issue appears to arise because the Celery workers bypass the Django ORM and update the database directly with PostgreSQL queries. Consequently, signals like post_save on TaskResult instances are not triggered as expected.

Code Example:

Using a signal to create the PENDING Task: Related Issue

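import logging

from django.db.models.signals import post_save
from django.dispatch import receiver
from django_celery_results.models import TaskResult

logger = logging.getLogger(__name__)

@receiver(post_save, sender=TaskResult)
def task_result_saved(sender, instance, created, **kwargs):
    if created:
        logger.info(f'TaskResult created: {instance.task_id} with status {instance.status}')
    else:
        # This branch is never reached on the producer when remote workers update the row
        logger.info(f'TaskResult updated: {instance.task_id} with status {instance.status}')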

Current Workaround:

A cron job runs a manage.py command periodically to check for state changes, which is inefficient and adds latency between a state change and its detection.
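For illustration, the polling workaround described above boils down to comparing the task states seen on the previous run with the current ones. The following is only a minimal sketch: `diff_task_states` and the sample task IDs are hypothetical names, and the snapshots are plain dictionaries rather than real TaskResult rows.

```python
from typing import Dict, List, Optional, Tuple

# (task_id, old_status, new_status); old_status is None for tasks not seen before
Change = Tuple[str, Optional[str], str]


def diff_task_states(previous: Dict[str, str], current: Dict[str, str]) -> List[Change]:
    """Return one entry per task whose status is new or differs from the last poll."""
    changes = []
    for task_id, status in current.items():
        old_status = previous.get(task_id)
        if old_status != status:
            changes.append((task_id, old_status, status))
    return changes


if __name__ == "__main__":
    last_seen = {"a1": "PENDING", "b2": "STARTED"}
    # Assumption: in a real management command this snapshot could be built with
    # dict(TaskResult.objects.values_list("task_id", "status")).
    now = {"a1": "SUCCESS", "b2": "STARTED", "c3": "PENDING"}
    for task_id, old, new in diff_task_states(last_seen, now):
        print(f"{task_id}: {old} -> {new}")
```

The cost of this approach is what the post complains about: every run re-reads the table, and a change is only noticed on the next run.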

Other Attempts:

Utilizing Celery Signals: I tried connecting to Celery's task_success signal in signals.py, but the handler was never called.

import logging

from celery.signals import task_success

logger = logging.getLogger(__name__)

@task_success.connect
def task_success_handler(sender=None, result=None, **kwargs):
    logger.info(f'Task {sender.name} succeeded with result {result}')

Another idea I had is to create a REST endpoint based on TaskResult that lets workers update the model without direct PostgreSQL queries, but that would make RESULT_BACKEND=django-db pretty much obsolete.


I was hoping someone knows of a built-in solution for this.


hetvi01 commented 3 weeks ago

I have a similar configuration, and it's working fine for me.

Make sure your signals.py file is imported in the ready function of the apps.py file for your Django app, and ensure the app is registered in INSTALLED_APPS.

Additionally, verify that your signal is being called by manually creating an object in the TaskResult table.
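As a rough sketch of the suggestion above, the signal handlers are usually wired up by importing the module from the app config's ready() method. The app name `myapp` and the class name `MyAppConfig` are placeholders, not names from this thread.

```python
# myapp/apps.py  ("myapp" is a hypothetical app name)
from django.apps import AppConfig


class MyAppConfig(AppConfig):
    name = "myapp"

    def ready(self):
        # Importing the module runs its @receiver decorators,
        # which is what actually connects the handlers.
        from . import signals  # noqa: F401
```

This only connects handlers in the process that loads the Django app, so it does not by itself make a handler on one machine fire for saves performed on another.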

urzbs commented 3 weeks ago

Thank you for your response.

My signals are indeed imported in my apps.py, and other signals like celery.signals.before_task_publish are working as expected on my task producer (MACHINE A).

Upon further investigation into celery.signals.task_success, I've noticed that Celery signals are only caught by the app/worker executing the Celery functions. This also explains why celery.signals.before_task_publish is working on MACHINE A, but celery.signals.task_success is not.


However, my requirement is to catch a signal on MACHINE A when a task is updated, as MACHINE A needs to make decisions on how to proceed, rather than the worker.

So I guess the only way of doing this is using an API ENDPOINT on MACHINE A, that gets triggered from MACHINE B (Worker)?
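To make the endpoint idea concrete, here is a minimal, standard-library-only sketch of a worker (MACHINE B) notifying the producer (MACHINE A) over HTTP. Everything in it is an assumption for illustration: the handler class, the `notify_producer` helper, the `/task-updates/` path, and the JSON payload shape are all hypothetical, and a real setup would more likely use a Django view with authentication.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # updates seen by MACHINE A (in memory, for illustration only)


class TaskUpdateHandler(BaseHTTPRequestHandler):
    """Runs on MACHINE A: accepts POSTed JSON task updates."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        received.append(payload)  # MACHINE A would decide how to proceed here
        self.send_response(204)
        self.end_headers()

    def log_message(self, *args):
        pass  # silence the default per-request logging


def notify_producer(url, task_id, status):
    """Runs on MACHINE B, e.g. called from a task_success handler on the worker."""
    body = json.dumps({"task_id": task_id, "status": status}).encode()
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    # Empty ProxyHandler: skip any system proxy for this local demo.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return response.status


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), TaskUpdateHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/task-updates/"
        return notify_producer(url, "abc123", "SUCCESS")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo(), received)
```

Since Celery signals such as task_success are handled in the worker process, the worker is the natural place to call something like `notify_producer`; the result backend can stay as django-db and the callback only carries the notification.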