pgiri / dispy

Distributed and Parallel Computing Framework with / for Python
https://dispy.org
Other
266 stars 55 forks source link

ProvisionalResults overwriting mutable job instances in worker_Q #175

Open JamesMaroney opened 5 years ago

JamesMaroney commented 5 years ago

I've stumbled across a little bit of an issue with how the _Cluster class adds items to its worker_Q.

Around line: https://github.com/pgiri/dispy/blob/master/py3/dispy/__init__.py#L1851 the job object reference is mutated with the result/status/etc and is put into the Queue. The issue arises when additional results for the same job (i.e. ProvisionalResults in my case) arrive before the consumer can process the one or ones already in the Queue. In this case, the job instance is shared, and the entries in the queue are (I assume) accidentally mutated.

A similar issue may arise here, but I am less certain about this code path: https://github.com/pgiri/dispy/blob/master/py3/dispy/__init__.py#L1779

What I've done locally to "fix" this, is added a .clone() method to the DispyJob class, as below, and instead mutate/enqueue a clone of the job instance. I wouldn't be surprised if there is a better approach, but here is my implementation of the clone method in case it is helpful at all:

    def clone(self):
        _clone = DispyJob(self._args, self._kwargs)
        _clone.id = self.id
        _clone.result = self.result
        _clone.stdout = self.stdout
        _clone.stderr = self.stderr
        _clone.exception = self.exception
        _clone.submit_time = self.submit_time
        _clone.start_time = self.start_time
        _clone.end_time = self.end_time
        _clone.status = self.status
        _clone.ip_addr = self.ip_addr
        _clone.finish = self.finish
        _clone._dispy_job_ = self._dispy_job_
        return _clone

Thanks you guys for creating this awesome tool! I've really been amazed at what dispy can do πŸ‘

pgiri commented 5 years ago

I believe fix committed should handle this case. Please test and confirm so new version can be released.

Thank you for identifying the cause and potential fix, which made it easy to understand the problem.

JamesMaroney commented 5 years ago

πŸ‘ Your fix seems to work perfectly for me. Thanks for the super fast turn-around!