Closed 0xDEC0DE closed 5 years ago
Hi, thank you for the detailed bug report, it's super helpful.
The worker quietly swallows the exception and reschedules the task and only logs the last error.
Not providing a helpful message until the last failure is definitely wrong. This will be easy to fix.
Ideally, the job is aborted, since it's been scheduled incorrectly and can never start.
There is an edge case to keep in mind: two different versions of an application may run simultaneously on the same queue, for example with the task defined as compute(a) in one version and as compute(a, b) in the other.
Retrying a job instead of canceling it gives a chance that it lands on a worker that can process it successfully (the developer of the app should instead use different tasks or different queues if tasks are not compatible).
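To make the edge case concrete, here is a minimal sketch. The names compute_v1, compute_v2 and accepts are hypothetical and only stand in for the two deployed versions of the task; none of this is Spinach code. It shows that the same arguments can be invalid for one version of a task and valid for the other, which is why a retry may still succeed on a different worker.

```python
from inspect import signature


# Hypothetical old and new versions of the same task.
def compute_v1(a):
    return a


def compute_v2(a, b):
    return a + b


def accepts(func, args, kwargs):
    """Return True if func could be called with args/kwargs, without calling it."""
    try:
        signature(func).bind(*args, **kwargs)
    except TypeError:
        return False
    return True


# A job scheduled by the newer application version.
job_args, job_kwargs = (1, 2), {}

old_ok = accepts(compute_v1, job_args, job_kwargs)  # old worker cannot run it
new_ok = accepts(compute_v2, job_args, job_kwargs)  # new worker can
```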
A similar issue exists when a worker receives a task that it does not know about. I decided to retry it for the same reason.
I will check whether it's possible in Python to verify that a set of (*args, **kwargs) is compatible with a function without actually executing it. This could be added on the client before scheduling the task and would solve some of the problems.
I'm left with the impression this is what one is supposed to use:
https://docs.python.org/3/library/inspect.html#introspecting-callables-with-the-signature-object
So something like this in spinach.worker.Workers._worker_func:

    from inspect import signature

    try:
        signature(job.task_func).bind(*job.task_args, **job.task_kwargs)
    except TypeError:
        do_something_clever()
    except ValueError:
        # can't inspect the callable, good luck to you!
        pass

...might do the trick.
@0xDEC0DE I pushed a branch that should solve your problem. It does so by checking that arguments are compatible with a task before scheduling, not in the worker like you suggested.
If you still want to be able to cancel a non-compatible job in workers, I am thinking of adding a signal that could be used to override the default retrying behavior. Let me know if that's necessary or if this commit is enough.
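For readers following along, a check at scheduling time could look roughly like the sketch below. This is not the code from the branch: schedule_checked, check_call, BadTaskArgumentsError and the plain list used as a queue are all hypothetical names chosen for illustration. Only inspect.signature and Signature.bind are real standard-library calls.

```python
from inspect import signature


class BadTaskArgumentsError(Exception):
    """Hypothetical error raised when arguments do not fit the task function."""


def check_call(func, args, kwargs):
    """Raise BadTaskArgumentsError if func cannot accept args/kwargs."""
    try:
        sig = signature(func)
    except ValueError:
        # Some callables cannot be introspected; skip the check for those.
        return
    try:
        sig.bind(*args, **kwargs)
    except TypeError as exc:
        name = getattr(func, "__name__", repr(func))
        raise BadTaskArgumentsError(f"{name}: {exc}") from exc


def schedule_checked(queue, func, *args, **kwargs):
    """Validate the call first, then enqueue it (the queue is a plain list here)."""
    check_call(func, args, kwargs)
    queue.append((func, args, kwargs))


def compute(a, b):
    return a + b


queue = []
schedule_checked(queue, compute, 1, b=2)  # accepted

try:
    schedule_checked(queue, compute, 1)  # missing required argument b
    rejected = False
except BadTaskArgumentsError:
    rejected = True
```

Failing on the client means the bad job never reaches the queue, so there is nothing for a worker to retry.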
This makes a terrific amount of sense, and seems like it should meet our needs almost exactly -- but we'll put it through its paces and see for sure.
Update: indeed, this branch behaves exactly like we would expect it to when given bad inputs. Excellent work!
The sooner this makes it into a release, the happier we'll be.
Great, version 0.0.11 has been released.
Version
Spinach 0.0.10
Steps to Reproduce
A toy test case; note the missing required params in the schedule call:
Expected Result
Ideally, the job is aborted, since it's been scheduled incorrectly and can never start, much less finish successfully. Failing that, log messages stating what went wrong in between retries would be useful.
Actual Result
The worker quietly swallows the exception and reschedules the task and only logs the last error:
This takes a LONG time, provides no useful context until the retry limit has been hit, clogs up the task queue with garbage, and only gets worse the higher max_retries is.
Workaround
Redefining all task functions with kwargs instead of args and having the task check/raise a spinach.task.AbortException in the event of a bad invocation largely dodges this, at the cost of a lot of boilerplate checking code in the tasks.
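A rough sketch of what that workaround looks like in a task, assuming "kwargs" here means a **kwargs catch-all. The AbortException class below is a local stand-in so the snippet runs on its own; in a real application it would be the Spinach exception mentioned above. REQUIRED and compute are hypothetical names.

```python
class AbortException(Exception):
    """Local stand-in for the Spinach abort exception (assumption for this sketch)."""


# Hypothetical list of arguments the task cannot run without.
REQUIRED = ("a", "b")


def compute(**kwargs):
    # Boilerplate check repeated in every task: abort instead of failing with
    # a TypeError that would otherwise be retried until max_retries is reached.
    missing = [name for name in REQUIRED if name not in kwargs]
    if missing:
        raise AbortException(f"compute missing required arguments: {missing}")
    return kwargs["a"] + kwargs["b"]


result = compute(a=1, b=2)

try:
    compute(a=1)  # bad invocation: b is missing
    aborted = False
except AbortException:
    aborted = True
```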