NAMD / pypelinin

Python library to distribute jobs and pipelines among a cluster

Clients should be able to submit jobs (not only pipelines) #36

Open · turicas opened this issue 11 years ago

turicas commented 11 years ago

For some kinds of processing we don't need to submit complete pipelines to the cluster. Currently it's not possible unless we use a hack (creating a pipeline with two jobs: the one we want executed and a dummy one). We just need something like a JobManager class, similar to PipelineManager but simpler.
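A minimal sketch of what such a JobManager could look like, assuming an interface loosely analogous to PipelineManager (every name below is hypothetical, not existing pypelinin API):

```python
class Job:
    """A single unit of work, identified by the worker that should run it."""
    def __init__(self, worker_name, data=None):
        self.worker_name = worker_name
        self.data = data if data is not None else {}

class JobManager:
    """Hypothetical single-job counterpart to PipelineManager."""
    def __init__(self):
        self._pending = {}   # job_id -> Job; stands in for the broker
        self._results = {}   # job_id -> result, filled in by workers
        self._next_id = 0

    def submit(self, job):
        """Queue one Job and return an id to poll later."""
        job_id = self._next_id
        self._next_id += 1
        self._pending[job_id] = job
        return job_id

    def finished(self, job_id):
        return job_id in self._results

# usage: submit a single job, no dummy second job required
manager = JobManager()
job_id = manager.submit(Job('wordcount', {'text': 'spam eggs spam'}))
```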

fccoelho commented 11 years ago

Very important. It would also be very nice to provide a simple decorator that people could use to submit jobs, as many distributed processing libraries do: people could decorate the functions they want to offload to a pypelinin queue for distributed processing. I have something similar in my Liveplots package: https://github.com/fccoelho/liveplots/blob/master/src/liveplots/xmlrpcserver.py

def enqueue(f):
    """Decorator that places the call on a queue."""
    def queued(self, *args, **kw):
        # note: keyword arguments are accepted but not forwarded here
        Q.put((f, (self,) + args))
    queued.__doc__ = f.__doc__
    queued.__name__ = f.__name__
    return queued

where Q is a Queue object.
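To make the snippet self-contained, here is one way the decorator could be exercised end to end, with a worker thread draining the queue (the `Plotter` class and worker loop are illustrative assumptions, not Liveplots code):

```python
import queue
import threading

Q = queue.Queue()

def enqueue(f):
    """Same decorator as above: defer the call onto the shared queue."""
    def queued(self, *args, **kw):
        Q.put((f, (self,) + args))
    queued.__doc__ = f.__doc__
    queued.__name__ = f.__name__
    return queued

results = []

class Plotter:
    @enqueue
    def draw(self, data):
        """Pretend this is an expensive call we want off the caller's thread."""
        results.append(sum(data))

def worker():
    while True:
        f, args = Q.get()
        f(*args)           # run the deferred call
        Q.task_done()

threading.Thread(target=worker, daemon=True).start()

Plotter().draw([1, 2, 3])  # returns immediately
Q.join()                   # block until the worker has drained the queue
# results is now [6]
```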

Also, a customized map/reduce framework (this could be lifted from other Python map/reduce libraries) would see great adoption.
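The core of such a helper is small; a sketch using a local thread pool in place of the cluster (the `map_reduce` function and its signature are assumptions for illustration):

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def map_reduce(map_fn, reduce_fn, items, workers=4):
    """Hypothetical helper: map over items in parallel, then fold the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_fn, items))
    return reduce(reduce_fn, mapped)

# word count across documents
docs = ['spam eggs', 'spam spam']
total = map_reduce(lambda d: len(d.split()), lambda a, b: a + b, docs)
# total == 4
```

In a cluster version, `pool.map` would be replaced by submitting one job per item and collecting results from the brokers.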

turicas commented 11 years ago

The initial idea for this issue is just to implement something so that I can submit a Job object instead of an entire Pipeline object (a pipeline is a group of Job objects). So, for the first implementation, the worker code must already exist on the brokers for this approach to work properly. In a future release, maybe, we can think about remote code execution (we need to be more careful if we implement this). I think it's needed, but not for now -- @fccoelho, can you please create an issue describing this feature?

fccoelho commented 11 years ago

Done!