Closed oximi123 closed 7 months ago
Thanks for stopping by! I have adopted the term "commitment" from the original Decima implementation. We say that an executor is committed to a stage if there is a plan to attach it to the stage in the future. In the current moment, the executor might be busy or the stage's dependencies may not all be complete, so the executor cannot be attached immediately - but it can be committed. Once the attachment becomes possible, the commitment gets realized, without having to consult the scheduler again. The scheduler's decisions are commitments - it first decides on a stage, then decides how many executors to commit to that stage.
Not all commitments get fulfilled. For example, imagine that an executor which was previously committed to some stage becomes available, and suppose that the stage has already completed. That executor is no longer needed for that stage, so the environment tries to find a backup stage.
Thanks for your reply. I guess I've understood this process.
It's really a wonderful work! May I ask what's the meaning of the term "commitment" ? I see lots of expressions related to it in executor_tracker.py. I am new to Spark and dag scheduling.