georgymh / decentralized-ml

Interoperable and decentralized machine learning.
Apache License 2.0

Add DMLResult and incorporate it to the Runner->Scheduler error flow #23

Closed georgymh closed 6 years ago

georgymh commented 6 years ago

This PR introduces a DMLResult class. The Runners create a DMLResult object once a DMLJob has been executed. A DMLResult contains metadata about the job (status, error message, job type) as well as the results of the job.
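To make the shape of the object concrete, here is a minimal sketch of what such a result class might look like. The field names, the status strings, and the `run_job` helper are assumptions for illustration only; they are not taken from the actual diff.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class DMLResult:
    """Hypothetical sketch: job metadata plus the job's results."""
    status: str                                   # "successful" or "failed" (assumed values)
    job_type: str                                 # e.g. "train" (assumed value)
    results: Dict[str, Any] = field(default_factory=dict)
    error_message: Optional[str] = None

    def succeeded(self) -> bool:
        return self.status == "successful"


def run_job(job_type: str, work: Callable[[], Dict[str, Any]]) -> DMLResult:
    """Stand-in for a Runner: execute `work` and wrap the outcome in a DMLResult."""
    try:
        return DMLResult(status="successful", job_type=job_type, results=work())
    except Exception as exc:
        # On failure, the error is recorded in the result instead of being raised.
        return DMLResult(status="failed", job_type=job_type, error_message=str(exc))
```

With a wrapper like this, the caller can inspect `status` and `error_message` rather than catching exceptions from the Runner.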

This PR also makes the Runners return their DMLResults to the Scheduler, which ignores the results of successful jobs and applies retry logic to failed ones. As a side note, the Scheduler's retry logic appeared to have a bug. In an attempt to fix it, I split the current_jobs property into two lists: one for DMLJob objects and another for asynchronous DMLResult objects (asynchronous because they are promise objects from Python's pool library).
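The two-list bookkeeping described above could be sketched roughly as follows. This is an illustrative sketch only, not the code in this PR: it assumes `multiprocessing.pool.ThreadPool` as the pool (its `apply_async` returns an `AsyncResult` promise), and the names `num_tries`, `max_tries`, `step`, and `run_until_idle` are hypothetical.

```python
import time
from multiprocessing.pool import ThreadPool


class DMLJob:
    def __init__(self, job_type, fn):
        self.job_type = job_type
        self.fn = fn
        self.num_tries = 0            # hypothetical retry counter


class DMLResult:
    def __init__(self, status, results=None, error_message=None):
        self.status = status
        self.results = results
        self.error_message = error_message


def run(job):
    """Stand-in Runner: always returns a DMLResult, never raises."""
    try:
        return DMLResult("successful", results=job.fn())
    except Exception as exc:
        return DMLResult("failed", error_message=str(exc))


class Scheduler:
    def __init__(self, max_tries=3):
        self.pool = ThreadPool(2)
        self.max_tries = max_tries
        self.queue = []
        self.current_jobs = []        # DMLJob objects currently running
        self.current_results = []     # matching AsyncResult promises, same order
        self.failed = []              # results of jobs that exhausted their retries

    def add_job(self, job):
        self.queue.append(job)

    def step(self):
        # 1. Collect finished promises; keep the unfinished pairs aligned.
        pending_jobs, pending_results = [], []
        for job, promise in zip(self.current_jobs, self.current_results):
            if not promise.ready():
                pending_jobs.append(job)
                pending_results.append(promise)
                continue
            result = promise.get()
            if result.status == "successful":
                continue              # successful results are ignored here
            if job.num_tries < self.max_tries:
                self.queue.append(job)        # retry the failed job
            else:
                self.failed.append(result)    # give up
        self.current_jobs, self.current_results = pending_jobs, pending_results
        # 2. Launch everything that is queued.
        while self.queue:
            job = self.queue.pop(0)
            job.num_tries += 1
            self.current_jobs.append(job)
            self.current_results.append(self.pool.apply_async(run, (job,)))

    def run_until_idle(self, poll_interval=0.01):
        while self.queue or self.current_jobs:
            self.step()
            time.sleep(poll_interval)
```

Keeping the jobs and their promises in parallel lists means a failed promise can always be traced back to the DMLJob that produced it, which is what the retry step needs.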

The next PR will incorporate the DMLResult class into the Communication Manager for the success flow of a job.