Open sportsracer opened 7 years ago
I propose a Docker-image-like approach: versioned state plus step identification.
After resolving dependencies, I arrive at a specific execution order for the tasks. Let's say it is the following, with every step identified using the following notation:
While running this sequence, I could store the IDs (or a hashed version of them) of all the steps that have run. Beyond that, I could also store the internal state of the graph at every step by serializing it with pickle and keying it by the step ID.
So let's assume that task C is badly implemented.
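A minimal sketch of what this could look like, not an actual implementation from the project. The names `step_id` and `run_with_history`, the `(name, source, func)` task tuples, and the in-memory `store` dict are all hypothetical. Each ID is chained to the previous one, which is an assumption borrowed from how layered image builds behave: changing one task changes the IDs of every later step.

```python
import hashlib
import pickle


def step_id(parent_id, task_name, task_source):
    """Hash the previous step's id together with this task's name and source."""
    h = hashlib.sha256()
    for part in (parent_id, task_name, task_source):
        h.update(part.encode("utf-8"))
        h.update(b"\0")  # separator so adjacent parts cannot run together
    return h.hexdigest()


def run_with_history(tasks, state, store):
    """Run tasks in order, snapshotting the state after every step.

    tasks: ordered list of (name, source, func) tuples (hypothetical shape).
    store: dict mapping step id -> pickled state after that step.
    """
    history = []
    parent = ""
    for name, source, func in tasks:
        sid = step_id(parent, name, source)
        state = func(state)
        store[sid] = pickle.dumps(state)  # state must be picklable
        history.append(sid)
        parent = sid
    return history, state
```

Because the ID depends on the task source, editing task C would leave the IDs of A and B untouched, so their stored snapshots remain valid starting points.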
OBS: To always test the same chain, the "calculated" order of tasks has to be unique, or forced in some way using the history.

OBS2: It's a Docker-like approach. Docker generates a hash for every step it runs, so storing successive builds only includes the binary delta relative to the previous, unmodified script.

OBS3: Obviously, when I find an "undone" task, I create a new history by removing all the steps that come after the first undone one in the older history (loaded in memory).

Pro: It allows restarting from any point and storing data incrementally.

Cons: Versioning the graph consumes a lot of space.
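The history truncation in OBS3 could be sketched roughly as follows. The function names are hypothetical, and the sketch assumes both histories are plain lists of step IDs in execution order. It also assumes the first "undone" step is itself dropped, since it has to be rerun anyway.

```python
def first_changed_step(old_history, new_ids):
    """Return the index of the first step whose id differs between the two runs."""
    for i, (old, new) in enumerate(zip(old_history, new_ids)):
        if old != new:
            return i
    # No mismatch in the common prefix: everything up to the shorter length is reusable.
    return min(len(old_history), len(new_ids))


def truncate_history(old_history, new_ids):
    """Keep only the steps of the old history that are still valid."""
    return old_history[:first_changed_step(old_history, new_ids)]
```

The last ID left in the truncated history identifies the snapshot from which execution could restart.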
Sometimes, running many tasks takes a long time. If the graph fails when it's almost done, you currently need to rerun everything.
Solution: When execution of a graph fails, serialize the state and data of the task graph. Then, resume execution from that point. Note: You should be able to change the code of tasks between failure and retry, since bugs are often the cause of task failure.
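As a rough illustration of the idea, assuming a single process and a strictly linear task order, checkpoint-on-failure could look like this. `run_graph`, the `(name, func)` task tuples, and the checkpoint file layout are hypothetical. Only task names and data are pickled, not the task functions, so the code of a task can be changed between failure and retry.

```python
import pickle
from pathlib import Path


def run_graph(tasks, state, checkpoint):
    """Run tasks in order; on failure, save progress so a later call can resume.

    tasks: ordered list of (name, func) tuples (hypothetical shape).
    checkpoint: Path of the file holding completed task names and the state.
    """
    done = []
    if checkpoint.exists():
        with checkpoint.open("rb") as f:
            saved = pickle.load(f)
        done, state = saved["done"], saved["state"]

    for name, func in tasks:
        if name in done:
            continue  # finished in an earlier run
        try:
            state = func(state)
        except Exception:
            # Persist the state as it was before the failing task started.
            with checkpoint.open("wb") as f:
                pickle.dump({"done": done, "state": state}, f)
            raise
        done.append(name)

    if checkpoint.exists():
        checkpoint.unlink()  # clean up after a fully successful run
    return state
```

This only works as intended if tasks return a new state instead of mutating their input in place. Otherwise a task that fails halfway could leave a partially modified state in the checkpoint.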
This needs to be well thought through wrt multiprocessing and sharing of data.