cBio / cbio-cluster

MSKCC cBio cluster documentation

Jobs in the Q status for an unusually long time #387

Closed vkuryavyi closed 8 years ago

vkuryavyi commented 8 years ago

I submitted 40 jobs to the hal cluster this morning (03.14). Usually they all pass through the 'Q' status quickly. This time, I see only 8 of my jobs running at a time while all the others are in the Q. I see a total of 173 jobs running on the cluster.

tatarsky commented 8 years ago

Have you run checkjob <jobid> to see why? I show the cluster is very short on RAM on the nodes due to other jobs; checkjob tells you that.

For example if you look at:

checkjob -v -v -v 7034597

You will see many nodes rejecting your job due to memory allocation. Another user has a considerable number of jobs with 40GB per job.
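If you want to spot-check all of your queued jobs at once, a rough loop like the one below should work (a sketch only; it assumes plain Torque qstat output with the user in column 3 and the job state in column 5, which can vary between versions):

```sh
# List your own jobs still in the 'Q' state and show the node rejection
# reasons from a verbose checkjob for each one.
for jobid in $(qstat | awk -v u="$USER" '$3 == u && $5 == "Q" {print $1}' | cut -d. -f1); do
    echo "=== job $jobid ==="
    checkjob -v -v -v "$jobid" | grep -i rejected
done
```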

tatarsky commented 8 years ago

mdiag -n is also helpful in that it shows "what is left", which isn't much in memory land today.

The first number in the "memory" column is the remaining memory of a node.
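As a quick sketch (assuming the memory column is the fourth field and is formatted as free:configured, which can vary by Moab version), you can sort nodes by remaining memory like this:

```sh
# Print nodes ordered by free memory, largest first.
# NR > 2 skips the summary/header lines of `mdiag -n`.
mdiag -n | awk 'NR > 2 && $4 ~ /:/ {split($4, m, ":"); print m[1], $1}' | sort -rn | head
```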

tatarsky commented 8 years ago

I can confirm all your queued jobs are waiting for enough RAM on a node to run. It is VERY tight on memory out there.

vkuryavyi commented 8 years ago

I see. Thank you @tatarsky

tatarsky commented 8 years ago

Some useful items from checkjob -v that I will turn into an expanded FAQ on the topic:

- "rejected: Memory": not enough memory on the node to fit your job.
- "rejected: State (Drained)": the node is out of the pool (we have one node down right now).
- "rejected: State (Busy)": all processor slots are consumed.
- "rejected: Features": you've asked for something the node doesn't have. In most cases now it's a node not in the "batch" queue, as we have two new groups with dedicated queues.

vkuryavyi commented 8 years ago

When submitting, I saw a 'low priority job' allocation message. What is under my control for job priority, and how do I get a quote for the dedicated queues?

tatarsky commented 8 years ago

Where did you see this message? I'm not familiar with any qsub output of that nature. Your items are in the batch queue at the priority that is normal for that queue.

As for "a quote for dedicated queues"? As in some kind of paying extra for such a thing?
You'd have to speak to @juanperin for that as no such concept exists to my knowledge.

vkuryavyi commented 8 years ago

I saw this message in the job submission interface of the Maestro GUI.

'A quote for dedicated queues' as in your message above: "as we have two new groups with dedicated queues".

What concept, the dedicated queues?

vkuryavyi commented 8 years ago

The exact message in the Maestro GUI was "running at reduced cpu priority".

tatarsky commented 8 years ago

I don't know anything about Maestro, but I show your jobs submitted normally.

I assume it wraps things in qsub, and I show the arguments as normal:

Submit Args:    -q batch -l walltime=72:00:00 -l mem=4G -o /dev/null -S /bin/sh -

So I don't know what that means. Your jobs are submitted at what I consider the normal levels.

While there is a "-p" flag for altering some level of priority, it's not really going to solve the fact that there isn't RAM to run them. It may move you up slightly in the queued list, but you'd have to experiment with that.
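For what it's worth, a hedged example of passing -p at submit time (the script name is a placeholder and the value is just illustrative; Torque accepts roughly -1024 to +1023):

```sh
# Illustrative only: same arguments Maestro passed, plus -p to nudge the
# job's user priority. This may move you up among queued jobs but it does
# not create free memory on the nodes. "myjob.sh" is a placeholder.
qsub -q batch -l walltime=72:00:00 -l mem=4G -p 100 myjob.sh
```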

The users with dedicated queues bought nodes and hardware for said queues. If you wish to purchase nodes to add to the cluster, speak with @juanperin.

tatarsky commented 8 years ago

showq -i, BTW, is a fairly good view of the waiting jobs and their ranking, which for batch is, I believe, based mostly on fairshare.

tatarsky commented 8 years ago

I will also mention, if you are wanting to squeak a few through: if you request 3GB instead of 4GB, I believe you would get several nodes that have just over 3GB free. But I don't know your memory requirements, and I'll clearly state that is getting a bit "Tetris" in terms of effort.
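A minimal example of the 3GB variant (again, the script name is a placeholder):

```sh
# Requesting 3GB instead of 4GB lets the scheduler consider nodes that
# currently have just over 3GB free.
qsub -q batch -l walltime=72:00:00 -l mem=3G myjob.sh
```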

tatarsky commented 8 years ago

I show your jobs all have run. Some remain running.

vkuryavyi commented 8 years ago

Thank you @tatarsky for the useful recommendations. In many cases 3GB will do as well as 4GB for my jobs, so I'll change that. I guess wall time is of lesser importance for a job to squeeze in?

tatarsky commented 8 years ago

I do not believe so for batch, but I will check. The main factors in getting scheduled are processor slots and memory, IIRC. If Torque/Moab can find a place to put you, it will.

tatarsky commented 8 years ago

Per this rather long series (below) and a review of the Moab config, the weight assigned to resource requests (RES) in queue priority is small (1) compared to the much larger weight assigned to fairshare (100).

http://www.adaptivecomputing.com/blog-hpc/using-moab-job-priorities-creating-prioritization-strategy/

And in fact I show WALLTIMEWEIGHT is zero, so it is not a factor in queue priority at all. Requested memory is of course a factor in that a node has to have the memory required to run the job.
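For reference, an illustrative moab.cfg fragment matching the weights described above (the parameter names are standard Moab ones; the actual cluster config may differ):

```
# Fairshare dominates queue ordering; resource requests barely matter,
# and requested walltime does not matter at all.
FSWEIGHT         100
RESWEIGHT        1
WALLTIMEWEIGHT   0
```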

As I work on a review of the Moab weights and priorities, I am making notes of items such as this Git issue so we can perhaps discuss other options for tuning queue wait priorities.

But for now I consider this matter closed: the scheduler was doing what it could, but per its rules and the consumed resources, your jobs had to wait for memory.