Closed leigaoms closed 4 years ago
| Totals | |
|---|---|
| Change from base Build 3578: | 0.0% |
| Covered Lines: | 827 |
| Relevant Lines: | 874 |
Have you tested these, or will the behaviour be as follows:
An inference job is preempted by training jobs: the training jobs squeeze the resources used by inference down to min_gpu, and no more regular jobs can be scheduled. (I'm not sure what will happen if, at this point, the inference job owner increases min_gpu.) Lei: Good point! I have added a constraint: a user cannot change min_gpu while the job is scheduling/running. If they need to change min_gpu, they must pause the job first to release all GPUs, then resume it. For an already scheduling/running inference job, min_gpu cannot be preempted.
A training job can preempt other training jobs that have preemption enabled, but not an inference job with min_gpu allocated. Lei: Yes.
An inference job can be allocated more GPUs if resources are available. Lei: Yes, up to min(max_gpu, remaining cluster GPUs).
An inference job cannot be allocated more GPUs if other training jobs are also waiting for resources. Lei: Yes, training jobs are scheduled ahead of inference jobs.
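To make the agreed rules concrete, here is a minimal sketch of one scheduling pass. This is not the actual scheduler code; `Job`, `change_min_gpu`, and `schedule` are hypothetical names, and the real implementation may differ. It encodes the four points above: min_gpu is a non-preemptible floor, min_gpu cannot change while the job holds GPUs, training jobs are scheduled first, and inference only grows toward max_gpu when no training job is waiting.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    kind: str            # "training" or "inference"
    want: int            # GPUs requested (training) / max_gpu (inference)
    min_gpu: int = 0     # guaranteed floor for inference; never preempted
    allocated: int = 0
    state: str = "waiting"  # waiting / scheduling / running / paused


def change_min_gpu(job: Job, new_min: int) -> None:
    """min_gpu can only change while the job holds no GPUs (pause first)."""
    if job.state in ("scheduling", "running"):
        raise RuntimeError("pause the job to release all GPUs before changing min_gpu")
    job.min_gpu = new_min


def schedule(jobs: list, total_gpu: int) -> None:
    """One scheduling pass implementing the rules discussed above."""
    free = total_gpu
    training = [j for j in jobs if j.kind == "training"]
    inference = [j for j in jobs if j.kind == "inference"]

    # Running inference jobs can be squeezed down to min_gpu,
    # but min_gpu itself is not preemptible.
    for j in inference:
        if j.state == "running":
            j.allocated = j.min_gpu
            free -= j.allocated

    # Training jobs are scheduled ahead of inference jobs.
    for j in training:
        take = min(j.want, free)
        j.allocated, free = take, free - take
        if take:
            j.state = "running"

    # Inference only grows past min_gpu when no training job is still waiting,
    # up to min(max_gpu, remaining cluster GPUs).
    training_waiting = any(j.want > j.allocated for j in training)
    for j in inference:
        if j.state == "running" and not training_waiting:
            extra = min(j.want - j.allocated, free)
            j.allocated += extra
            free -= extra
```

For example, with 8 GPUs, a running inference job (min_gpu=2, max_gpu=6) and a training job wanting 4: inference is first squeezed to 2, training takes 4, and since no training job is left waiting, inference grows back to 4.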
Refinement: