Closed jhgoh closed 6 months ago
The compute nodes are in the drained state and cannot accept jobs.

    $ sinfo -N -l
    Fri May 24 11:54:03 2024
    NODELIST  NODES  PARTITION  STATE      CPUS  S:C:T    MEMORY  TMP_DISK  WEIGHT  AVAIL_FE  REASON
    entei     1      normal*    drained    128   128:1:1  512000  0         1       (null)    Kill task failed
    ho-oh     1      normal*    allocated  64    64:1:1   256000  0         1       (null)    none
    lapras    1      gpu1       mixed      128   128:1:1  256000  0         1       (null)    none
    mewtwo    1      gpu2       mixed      12    12:1:1   128000  0         1       (null)    none
    raikou    1      normal*    drained    128   128:1:1  512000  0         1       (null)    Kill task failed
    suicune   1      normal*    drained    128   128:1:1  512000  0         1       (null)    Kill task failed

Released the nodes from the drained state.

    $ scontrol update nodename='entei,raikou,suicune' state=resume
    $ sinfo
    PARTITION  AVAIL  TIMELIMIT  NODES  STATE  NODELIST
    normal*    up     infinite   4      alloc  entei,ho-oh,raikou,suicune
    gpu1       up     infinite   1      mix    lapras
    gpu2       up     infinite   1      mix    mewtwo

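For next time, the manual steps above could be scripted roughly as follows. This is only a sketch: `drained_nodes` is a hypothetical helper (not part of Slurm), and it assumes `sinfo -h -N -o '%N|%T|%E'` prints one `node|state|reason` line per node and partition. It prints the `scontrol` command instead of running it, so the node list can be checked first.

```shell
#!/usr/bin/env bash
# Sketch: collect nodes drained with a given reason and print the resume command.

# Hypothetical helper. Reads "node|state|reason" lines on stdin and prints a
# comma-separated list of nodes whose state starts with "drain" and whose
# reason equals $1. Duplicate node names (one line per partition) are skipped.
drained_nodes() {
    awk -F'|' -v reason="$1" '
        $2 ~ /^drain/ && $3 == reason && !seen[$1]++ {
            list = list (list ? "," : "") $1
        }
        END { print list }
    '
}

# Only query Slurm when sinfo is actually available on this host.
if command -v sinfo >/dev/null 2>&1; then
    nodes=$(sinfo -h -N -o '%N|%T|%E' | drained_nodes "Kill task failed")
    if [ -n "$nodes" ]; then
        # Dry run: print the command for review rather than executing it.
        echo "scontrol update nodename=$nodes state=resume"
    fi
fi
```

It is probably worth checking each listed node for leftover processes before actually running the printed command.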
@slowmoyang