mher / flower

Real-time monitor and web admin for Celery distributed task queue
https://flower.readthedocs.io

/api/tasks fails on KeyError #648

Open akoren opened 7 years ago

akoren commented 7 years ago

When calling /api/tasks, at some point I start getting HTTP 500 responses. The error stack is:

Uncaught exception GET /api/tasks (10.40.30.136)
HTTPServerRequest(protocol='http', host='dev-sh01-event-mngr-01-aws.cloudyn.com:5555', method='GET', uri='/api/tasks', version='H
'Host': 'dev-sh01-event-mngr-01-aws.cloudyn.com:5555', 'Connection': 'keep-alive', 'Accept': '*/*', 'User-Agent': 'python-requests/2
Traceback (most recent call last):
  File "/usr/local/lib64/python2.7/site-packages/tornado/web.py", line 1467, in _execute
    result = method(*self.path_args, **self.path_kwargs)
  File "/usr/local/lib64/python2.7/site-packages/tornado/web.py", line 2829, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/flower/api/tasks.py", line 335, in get
    task = task.as_dict()
  File "/usr/local/lib/python2.7/site-packages/celery/events/state.py", line 358, in as_dict
    k: handler(k, pass1)(get(self, k)) for k in self._fields
  File "/usr/local/lib/python2.7/site-packages/celery/events/state.py", line 358, in <dictcomp>
    k: handler(k, pass1)(get(self, k)) for k in self._fields
  File "/usr/local/lib/python2.7/site-packages/kombu/utils/objects.py", line 44, in __get__
    value = obj.__dict__[self.__name__] = self.__get(obj)
  File "/usr/local/lib/python2.7/site-packages/celery/events/state.py", line 391, in root
    return self.root_id and self.cluster_state.tasks[self.root_id]
  File "/usr/local/lib/python2.7/site-packages/kombu/utils/functional.py", line 65, in __getitem__
    value = self[key] = self.data.pop(key)
  File "/usr/lib64/python2.7/collections.py", line 143, in pop
    raise KeyError(key)
KeyError: '9a033dc6-97c1-432e-917c-0e7f7c499612'
[E 161206 15:06:45 web:1971] 500 GET /api/tasks (10.40.30.136) 1.32ms

The problem goes away after restarting Flower, which flushes all tasks.

My environment is:

$ celery -A backend report

software -> celery:4.0.0 (latentcall) kombu:4.0.0 py:2.7.9
            billiard:3.5.0.2 redis:2.10.5
platform -> system:Linux arch:64bit, ELF imp:CPython
loader   -> celery.loaders.app.AppLoader
settings -> transport:redis results:redis://:**@***

broker_url: u'redis://:****@*:6379/0'
result_backend: u'redis://:****@*/0'

Photonios commented 7 years ago

I've debugged this issue and this is mostly due to the following scenario:

The actual bug is caused by Celery, which broke task.as_dict() in the latest release. The method works fine as long as all information about a task is available. If it is not, as_dict() raises the KeyError shown above.
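The failure mode can be illustrated with a minimal sketch. It does not use Celery itself: the `Task` class, the `tasks` mapping, and the task IDs are hypothetical stand-ins, and it assumes the monitor holds a child task whose root task was never stored (or is no longer stored).

```python
class Task:
    """Hypothetical, simplified task record (not Celery's actual class)."""

    def __init__(self, uuid, tasks, root_id=None):
        self.uuid = uuid
        self.root_id = root_id
        self._tasks = tasks  # shared mapping of task id -> Task

    @property
    def root(self):
        # Unguarded lookup: raises KeyError when the root task is unknown.
        return self.root_id and self._tasks[self.root_id]

    def as_dict(self):
        # Serializing touches every field, including the derived `root`.
        return {"uuid": self.uuid, "root_id": self.root_id, "root": self.root}


def serialize_child_without_root():
    tasks = {}
    # Assumption: only the child's events were seen; "root-1" was never stored.
    tasks["child-1"] = Task("child-1", tasks, root_id="root-1")
    try:
        tasks["child-1"].as_dict()
    except KeyError as exc:
        return "KeyError: {}".format(exc)
    return "ok"
```

A task with no `root_id` serializes without trouble in this sketch; only a dangling reference triggers the exception.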

I found a work-around and am working on a PR to fix this. However, this should actually be fixed by Celery. Stay tuned.

This is a duplicate of #641.

uoxiu commented 7 years ago

I have the same problem. It appeared after I started Flower when many tasks had already been issued.

celery==4.0.0
flower==0.9.1

I'm waiting for an update, thank you very much ^^

cah-ricksuggs commented 7 years ago

I'm also experiencing the same issue

tramora commented 7 years ago

Duplicate of #626.

@Photonios: is your proposed fix for celery 4.0.x to catch the KeyError exception and return None for the parent and root properties?

(source celery/events/state.py, class Task)
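A minimal sketch of the approach asked about here, assuming the fix amounts to catching the KeyError in the lookups behind `parent` and `root` and returning None instead. The `Task` class and the `_lookup` helper are hypothetical and simplified; this is not the code from Celery or from any pull request.

```python
class Task:
    """Hypothetical, simplified task record (not Celery's actual class)."""

    def __init__(self, uuid, tasks, root_id=None, parent_id=None):
        self.uuid = uuid
        self.root_id = root_id
        self.parent_id = parent_id
        self._tasks = tasks  # shared mapping of task id -> Task

    def _lookup(self, task_id):
        # Return the referenced task, or None if there is no reference
        # or the referenced task is not (or no longer) in the mapping.
        if not task_id:
            return None
        try:
            return self._tasks[task_id]
        except KeyError:
            return None

    @property
    def root(self):
        return self._lookup(self.root_id)

    @property
    def parent(self):
        return self._lookup(self.parent_id)

    def as_dict(self):
        # The raw ids are always reported; resolved tasks are reduced to a flag.
        return {
            "uuid": self.uuid,
            "root_id": self.root_id,
            "parent_id": self.parent_id,
            "root_known": self.root is not None,
            "parent_known": self.parent is not None,
        }
```

With this guard, serializing a task whose parent or root is missing yields None for those properties rather than an exception, so one dangling reference no longer fails the whole response.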

jashandeep-sohi commented 7 years ago

Any updates on this?

tramora commented 7 years ago

@jashandeep-sohi: No news yet from @Photonios, who has studied the issue thoroughly. I've submitted a PR to the Celery project; you can apply it as a quick-and-dirty fix: https://github.com/celery/celery/pull/3950

jashandeep-sohi commented 7 years ago

Thanks @tramora, that's what I was looking for 👍