Open im4bacon opened 3 years ago
Thanks for this issue! I'll try and set out some time to look into it in the next few weeks.
I'm not sure it's a race condition. The initial state should be "PENDING" or "STARTED" even if set_progress() isn't called.
The error message indicates this is caused by an unknown state (in which case this is related to #54). It would be helpful to know what state caused the error. Do you have a way to check the task state when the error happens? If not, I made PR #82 to log unknown states and show them in the "Uh-Oh" message.
What I saw was that if I created a ProgressRecorder and didn't set_progress until later, I could see that state was occasionally coming back as null and then within a few seconds I'd see it going to PENDING state.. But by that time, I already had the "uh-oh" message.
As soon as I added a set_progress just below my creation of the ProgressRecorder, I haven't had the problem since. progress_recorder = ProgressRecorder(self) progress_recorder.set_progress(0, 1, description='Querying solutions...')
That's odd. I'm unable to reproduce the issue with this setup: python 3.8, redis 5.0, packages:
The task:
@shared_task(bind=True)
def http_task(self, number):
progress_recorder = ProgressRecorder(self)
for i in range(number):
time.sleep(.1)
return random()*1000
The view:
def http_view(request):
result = http_task.apply_async(kwargs={'number':15})
return render(request, 'http.html', context={'task_id': result.task_id})
Is your application similar? Any other notable thing that could be related (celery config, progress bar template)? Also, where did you see that the state was null?
This doesn't happen often, and I just can't figure out how to reproduce it. But I added some console logs to hopefully give a little more info for you.
In celery_progress.js, in the function onData(), I'm falling into the case where data.complete === true and data.success === false. data.state does not equal RETRY, so I'm hitting the else onTaskError(). Even though the task did complete successfully.
{state: "PROGRESS", complete: true, success: false, progress: {…}, result: "Unknown state { <
data.progress looks like {pending: false, current: 0, total: 100, percent: 0}
{state: "PROGRESS", complete: true, success: false, progress: {…}, result: "Unknown state { <> }"
Interesting. I'm not sure how this response was made, but I have a theory.
This response is produced when a state is unknown. See here.
But 'PROGRESS' is a known state, and there's an "else if" clause that produces a proper response for it: https://github.com/czue/celery-progress/blob/4b1e023cbd72083e1eb542560c49d5653ceb5ee2/celery_progress/backend.py#L94
The only thing I can think of is that the state (self.result.state) changed midway. It was 'PROGRESS' at first, then it changed to an unknown state. So self.result.state had different values in these two lines: https://github.com/czue/celery-progress/blob/4b1e023cbd72083e1eb542560c49d5653ceb5ee2/celery_progress/backend.py#L63 https://github.com/czue/celery-progress/blob/4b1e023cbd72083e1eb542560c49d5653ceb5ee2/celery_progress/backend.py#L94
Is that possible? If it's, a fix would be to store the state once and use the stored value throughout the method instead of asking for it in each clause.
I'm also not sure what the unknown state is. It'll be logged and shown in the page if you use the unpublished version of celery-progress from github (by installing "git+https://github.com/czue/celery-progress@master#egg=celery-progress" or putting it in your requirements file).
If a ProgressRecorder is created, its state is not initialized until set_progress() is called. During this time, I think there is a race condition where a page calling initProgressBar() can get a bad state === null and fall into onTaskError() during this time. (progressBarMessageElement will say "Uh-Oh, something went wrong! Unknown state")