wtangiit / Qsim

event-driven job scheduling simulator for Cobalt

NC: Prevent deadlock for hold-hold coscheduling scheme #25

Closed wtangiit closed 14 years ago

wtangiit commented 14 years ago

Using the "hold" scheme on both machines may cause deadlock. Two ways to prevent it:

Spatial way (system utilization restriction): allow only x% of the total nodes to be held. A job whose hold would push the held nodes over that limit becomes a yielding job instead.
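
A minimal sketch of the spatial check, only to make the rule concrete. The function name `decide_hold_or_yield` and its parameters are hypothetical and not taken from the Qsim code; the cap is expressed as an integer percentage so the comparison stays exact.

```python
def decide_hold_or_yield(job_nodes, held_nodes, total_nodes, max_hold_percent):
    """Decide whether a job may hold its nodes or must yield.

    Hypothetical helper, not Qsim's actual API.
    job_nodes:        nodes the job wants to hold
    held_nodes:       nodes already held by other jobs on this machine
    total_nodes:      total nodes on this machine
    max_hold_percent: the x% cap from the spatial scheme (integer 0..100)
    """
    # Integer arithmetic avoids floating-point rounding at the boundary.
    if (held_nodes + job_nodes) * 100 <= max_hold_percent * total_nodes:
        return "hold"
    # Holding would exceed the cap, so the job becomes a yielding job.
    return "yield"
```

With a 10% cap on a 40960-node machine, for example, a 1024-node job may hold while 2048 nodes are already held, but a 2048-node job must yield once 3072 nodes are held.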

Temporal way (temporary yielding): add a hold threshold. If a job has held its resources for more than THRESH hours, it gives them up for one scheduling iteration so that other jobs can use them (if no other job takes the resources, it holds them again). This also helps in the scenario where holding a sub-partition blocks a large job for a long time, even without any backfilling.
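
A rough sketch of one scheduling iteration under the temporal scheme. Everything here (`Hold`, `must_release`, `schedule_iteration`, the first-fit loop over waiting jobs) is a hypothetical illustration rather than Qsim's implementation. It also assumes that a re-acquired hold keeps its original start time, which the description above does not specify.

```python
from dataclasses import dataclass


@dataclass
class Hold:
    job_id: str
    nodes: int
    start_hour: float  # simulation time at which the hold began


def must_release(hold, now_hour, thresh_hours):
    """True if the job has held for more than THRESH hours."""
    return now_hour - hold.start_hour > thresh_hours


def schedule_iteration(hold, now_hour, thresh_hours, waiting_jobs, free_nodes):
    """Run one iteration; waiting_jobs is a list of (job_id, nodes).

    Returns (hold or None, list of started job ids).
    """
    if not must_release(hold, now_hour, thresh_hours):
        return hold, []  # below the threshold: keep holding, start nothing

    # Temporarily give up the held nodes for this iteration.
    available = free_nodes + hold.nodes
    started = []
    for job_id, nodes in waiting_jobs:  # simple first-fit, an assumption
        if nodes <= available:
            started.append(job_id)
            available -= nodes

    # If enough nodes are still free, the job holds again
    # (assumption: the original start time is kept).
    if available >= hold.nodes:
        return hold, started
    return None, started  # another job took the resources
```

For instance, a 512-node hold that started at hour 0 with THRESH = 2 is kept at hour 1, is lost at hour 3 to a waiting 1024-node job when 512 other nodes are free, and is re-acquired at hour 3 if the only waiting job needs 2048 nodes.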

wtangiit commented 14 years ago

9203d4fbf267b02bda09 1479aba3a48fbdc01afa 7f8ce01e3bce2e0245e2