turicas opened this issue 11 years ago
Very important. It would be very nice to create a simple decorator people could use to submit jobs, as many distributed processing libraries have: people could decorate the functions they want to offload to a pipelining queue for distributed processing. I have something similar in my Liveplots package: https://github.com/fccoelho/liveplots/blob/master/src/liveplots/xmlrpcserver.py
def enqueue(f):
    """Decorator that places the call on a queue"""
    def queued(self, *args, **kw):
        Q.put((f, (self,) + args))
    queued.__doc__ = f.__doc__
    queued.__name__ = f.__name__
    return queued
where Q is a Queue object.
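To make the idea concrete, here is a self-contained sketch of the decorator in use. The `Worker` class, its `work` method, and the consumer loop at the end are illustrative only, not part of any of the libraries discussed:

```python
import queue

# Shared queue holding (function, args) tuples, as in the snippet above.
Q = queue.Queue()

def enqueue(f):
    """Decorator that places the call on a queue instead of running it."""
    def queued(self, *args, **kw):
        Q.put((f, (self,) + args))
    queued.__doc__ = f.__doc__
    queued.__name__ = f.__name__
    return queued

class Worker:
    @enqueue
    def work(self, x):
        """Pretend to do something expensive."""
        return x * 2

w = Worker()
w.work(21)          # does not run work(); it enqueues (work, (w, 21))
f, args = Q.get()   # a consumer (ideally in another process) does this
result = f(*args)   # actually executes the original method
```

Note that calling the decorated method returns `None`; the result only exists on the consumer side, which is exactly the fire-and-forget behavior you want for offloaded jobs.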
Also a customized map/reduce (this could be lifted from other python map/reduce libraries) framework would see great adoption.
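For context, the kind of map/reduce API being asked for could look something like the following local, single-process sketch (the `mapreduce` function and its signature are hypothetical; a real framework would distribute both phases across workers):

```python
from functools import reduce
from itertools import groupby

def mapreduce(mapper, reducer, records):
    """Toy map/reduce: mapper yields (key, value) pairs for each record;
    reducer folds the values collected for each key. Runs locally only."""
    pairs = [pair for record in records for pair in mapper(record)]
    # groupby requires its input sorted by the grouping key.
    pairs.sort(key=lambda kv: kv[0])
    return {
        key: reduce(reducer, (value for _, value in group))
        for key, group in groupby(pairs, key=lambda kv: kv[0])
    }

# Classic word count example.
def word_mapper(line):
    for word in line.split():
        yield word, 1

counts = mapreduce(word_mapper, lambda a, b: a + b, ["a b a", "b a"])
```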
The initial idea for this issue is just to implement something that lets me submit a Job object instead of an entire Pipeline object (which is a group of Job objects). So, for the first implementation, the workers should live on the brokers for this approach to work properly.
In a later release, maybe, we can think about remote code execution (we need to be more careful if we implement this). I think it's needed, but not for now -- @fccoelho, can you please create an issue describing this feature?
Done!
For some kinds of processing we don't need to submit complete pipelines to the cluster. Currently that's not possible unless we use a hack (create a pipeline with two jobs: the one you want to execute and a dummy one). We just need something like a JobManager class, similar to PipelineManager, but simpler.
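A minimal sketch of what such an interface might look like. None of these names exist in the library yet; they just mirror how a PipelineManager-style API might be slimmed down to handle a single Job:

```python
# Hypothetical JobManager sketch: submit one Job without wrapping it
# in a Pipeline alongside a dummy job.

class Job:
    def __init__(self, worker_name, data=None):
        self.worker_name = worker_name
        self.data = data or {}

class JobManager:
    """Submits individual Job objects, avoiding the two-job Pipeline hack."""
    def __init__(self):
        self._pending = []

    def start(self, job):
        # A real implementation would hand the job to a broker here.
        self._pending.append(job)
        return len(self._pending) - 1  # illustrative job id

    def finished(self, job_id):
        # Placeholder: a real implementation would query the broker.
        return 0 <= job_id < len(self._pending)

manager = JobManager()
job_id = manager.start(Job("WordCount", {"document": "some text"}))
```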