frazer-lab / cluster

Repo for cluster issues.

unexpected queueing on jobs #188

Closed djakubosky closed 6 years ago

djakubosky commented 7 years ago

Hi Paul, I launched a bunch of jobs to "short" and noticed that some of them instead landed in juplow. I just wanted to make you aware of this; I think they are running fine, but they are not notebook jobs.

tatarsky commented 7 years ago

I am aware of it, but unclear on the precise cause in SGE's brain. I will dig into it a bit more as I can.

djakubosky commented 7 years ago

Sounds good, just wanted to make sure it wasn't going to be a problem

tatarsky commented 7 years ago

I don't think it will be a problem, but I need to figure it out. I've noted a few times over the years that SGE queues with closely equivalent settings sometimes seem to "take" jobs they shouldn't. I've looked for a root cause or bug report but never figured out exactly why. You clearly defined -l short on your qsub.
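
For reference, a submission of the shape being discussed would look roughly like the sketch below (the script name is hypothetical; the -l short and -pe smp 1 pieces match the accounting records shown later in this thread):

```bash
# Rough sketch of the submission pattern under discussion (script name is
# hypothetical). Requesting the "short" resource is what should map the
# job onto short.q rather than one of the jup* queues.
qsub -l short -pe smp 1 -N example_job run_analysis.sh
```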

djakubosky commented 7 years ago

Correct, I used -l short as usual.



tatarsky commented 7 years ago

I made one slight mod to see if the "seq_no" of the queue is involved, making it slightly different from short.q's, based on a "sort of similar" mailing list report. There is also a -hard option to qsub that I've seen referenced but never seemed needed to truly enforce resource -> queue maps (which is what we are doing with that -l short). I'll vary and monitor a bit to see if I can figure out whether this is config vs. bug.
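
A rough sketch of that kind of tweak with qconf (the specific number chosen is illustrative; what matters is only that it differs from short.q's seq_no):

```bash
# Inspect the current sequence number of the queue in question
qconf -sq juplow.q | grep seq_no

# Open the full queue definition in an editor and change seq_no so it no
# longer matches short.q's value
qconf -mq juplow.q
```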

tatarsky commented 7 years ago

Noting only that my monitoring has not caught another instance of this since the mod, but I may need a scripted monitor since I don't check every second. I will also see if qacct can provide the same data.

tatarsky commented 7 years ago

Using qacct data I do not see a repeat since the change to juplow's seq_no. However, qacct does show similar behavior with juphigh, possibly due to the same seq_no there. As of this comment I have adjusted juphigh's seq_no so it no longer matches short.q's. Will review in a week.

djakubosky commented 7 years ago

I noticed some of my jobs going into juplow or juphigh (I can't remember which) again a few days ago; just wanted to report it. Again, they were submitted to the short queue.

tatarsky commented 7 years ago

I have not seen any juplow instances since the change. If you have a job ID that went to juplow, please let me know.

But I did see juphigh.

I will check again, but a job ID would be helpful next time you see it, in case my qacct method of detecting this isn't working.

tatarsky commented 7 years ago

I am going to cron that check, BTW, so it's not urgent if you don't catch it (unless my check is faulty ;)
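
A minimal sketch of what such a cron'd check might look like, assuming qacct detail records of the form shown in the next comment (the owner, day window, and paths are illustrative):

```bash
#!/bin/bash
# Hypothetical queue-hop check: flag recent jobs whose accounting record
# requested "-l short" but which ran in one of the jup* queues.
# Field names (qname, jobnumber, category) follow qacct -j detail output.
qacct -o djakubosky -d 7 -j 2>/dev/null | awk '
  /^qname/     { q = $2 }
  /^jobnumber/ { j = $2 }
  /^category/  { if ($0 ~ /-l short/ && q ~ /^jup/) print j, q }
'

# Example crontab entry (schedule and log path are illustrative):
# 0 * * * * /root/bin/check_queue_hop.sh >> /var/log/queue_hop.log
```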

tatarsky commented 7 years ago

Example what my entries look like:

qname        juphigh.q           
owner        djakubosky          
end_time     Wed Apr 19 02:12:42 2017
category     -l short=TRUE -pe smp 1

That is "pre-change from last night" and I made that change to juphigh based on seeing the above.

(clearly stated "short" but got run by juphigh....)

tatarsky commented 7 years ago

Last juplow I see remains the one I believe you started this ticket on:

qname        juplow.q            
owner        djakubosky          
end_time     Mon Apr 10 19:26:52 2017
category     -l short=TRUE -pe smp 1

Since then all juplow runs contain "juplow=TRUE"...again assuming my check is correct.

djakubosky commented 7 years ago

OK, if I see it again I'll note the job IDs. Sorry about that.

tatarsky commented 7 years ago

No worries. Just trying to validate two things here:

  1. That my "change" is even a factor (I suspect in the end this remains an SGE bug...). If we catch one in the wrong queue after last night, well, it's not a fix.

  2. That my detection of jobs not going into the right queue is working, to make item 1 above possible ;)

tatarsky commented 7 years ago

OK. Noted a 5/1 "wrong queue" incident, so it's not my seq_no theory. Back to the source.

billgreenwald commented 7 years ago

Hi Paul.

Hiroko launched a bunch of jobs without specifying the queue in her job scripts, and a bunch of them (~50-100) ended up in both juphigh and juplow. It's an array job, job ID 2920087.

Not a huge rush on this, but just wanted to let you know that this "queue hopping" happened again.

tatarsky commented 7 years ago

Yep. And I've not found a solution. As part of the C7 efforts I plan to build the latest version of SGE and see if this was a bug. I don't think it's a config error (or if it is... I don't know where).

tatarsky commented 7 years ago

So as I wait for a ruling on moving a fast node to C7 (see #199) I downloaded the latest "Son of SGE" version 8.1.9 and am scanning some of its fixed bugs in the online Trac pages. I've not deployed it yet.

One item that is a long shot seemed to imply a prior version may have been confused by a boolean "true" vs. "1" as a complex value (the way we control queue assignment).

So, as a long shot, I altered juplow.q to state "juplow=true" instead of "juplow=1". Just in case.

I did not alter juphigh as a "control". I think this is unlikely but you never know.
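
A sketch of the edit described, using qconf (this assumes the queue's complex_values entry consists only of the juplow flag; adjust if there are other entries):

```bash
# Before (conceptually): complex_values  juplow=1
# After: spell the boolean out, in case older scheduler code treats "1"
# and "true" differently when matching "-l juplow" requests.
qconf -mattr queue complex_values juplow=true juplow.q

# juphigh.q is deliberately left unchanged as a control.
```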

djakubosky commented 7 years ago

Just wanted to note that about half of my jobs went to juphigh in a batch that had no specified queue.

here are some of the job ids that went to juphigh: 2935839, 2935840, 2935841, 2935842, 2935843, 2935844, 2935845, 2935846, 2935847, 2935848, 2935849

while jobs with the same conditions went into the all queue (example: job ID 2935834).

scripts for qsub located here: /frazer01/projects/CARDIPS/analysis/cardips-cnv-analysis/private_output/twin_concordance_VSQR_thresholds/unfiltered_2/INDEL_99/jobs

Example SGE header:

#!/bin/bash
#$ -N conc_T104_T103_INDEL_99
#$ -pe smp 2
#$ -V
#$ -e /frazer01/projects/CARDIPS/analysis/cardips-cnv-analysis/private_output/twin_concordance_VSQR_thresholds/unfiltered_2/INDEL_99/jobs/out/conc_T104_T103_error
#$ -o /frazer01/projects/CARDIPS/analysis/cardips-cnv-analysis/private_output/twin_concordance_VSQR_thresholds/unfiltered_2/INDEL_99/jobs/out/conc_T104_T103_out

tatarsky commented 7 years ago

Yeah folks, I get the message that this happens. I don't need additional examples; I can get it all from qacct. It's not based on anything you are doing.

tatarsky commented 7 years ago

I am looking at simply removing the hub queues and using qsub priorities and runtime limits in their submits. This has become too complex for what I wish to support, and I'm going to simplify the queue count (as I also don't know what the problem is). I will experiment and then most likely schedule a Hub shutdown/reconfig of the spawner.

tatarsky commented 7 years ago

Also be aware that technically juphigh.q is "less restrictive" than all.q, allowing up to 168 hours of runtime compared to 48, and having higher priority. So while annoying, it's probably going to turn out to be some SGE backfill feature buried somewhere in the code. They do piles of that sort of thing, and the knobs to control it are an area I've never fully grasped. But it's still too many moving pieces for my tastes and I'd like to reduce the complexity of this small environment.

tatarsky commented 7 years ago

Here's my proposal based on the prior definitions of the juplow/juphigh queues.

juplow was defined as: all jobs have no wall time, but priority was desired to get hubs running faster. I propose the use of week.q and qsub -p 0 (currently all user jobs are -100, so that's a bump).

juphigh was defined as: all jobs have a 1-week wall time. I propose simply submitting to week.q and the same use of the priority flag.

Basically, juplow and juphigh would move into week.q and the qsub would continue to set the various h_vmem items. If a week of runtime caused us too much trouble, we'd look for other options, but those would likely involve large hour expenditures.
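
Under that proposal, the hub spawner's submission would look roughly like this sketch (the -l week resource name, memory value, and script name are assumptions following the existing -l short pattern; -p 0 and h_vmem come from the proposal above):

```bash
# Hypothetical spawner submission under the proposal: target week.q,
# bump priority from the default -100 to 0, and keep per-job memory limits.
qsub -l week -p 0 -l h_vmem=8G -pe smp 1 spawn_single_user_server.sh
```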

tatarsky commented 7 years ago

And if the above is not acceptable, I am looking to see if there is any actual impact of a job being assigned to a juplow/juphigh queue, i.e. a job that is given reduced runtime or priority as a result of that assignment. My current belief is that in all cases "it's a faster path". Annoying, but I suspect it is a result of that.

djakubosky commented 7 years ago

That sounds fine to me, and to your earlier point, jobs seem to execute just fine in those queues, so I don't think it's a big deal either way.

tatarsky commented 7 years ago

I'm trying to prove the above statement 100%. I believe that in all cases where a job is seen to hop, it's to a higher priority or less restrictive queue. Whether it's a bug or a feature I am not sure. I've certainly not told it to do this, but SGE does a bunch of stuff on its own by default in this space... it just wants to get your job run.

But if there is overlap in these queues, it would be nice to simplify for accounting/monitoring purposes. There are no signs, however, that there is any user control over this "happening".

tatarsky commented 7 years ago

The case BTW I'm trying to find is a job scheduled for long.q getting placed into juphigh.q.

That would be a runtime degrade (infinite -> 168 hours), which would be ungood.

I can see long.q jobs ending up in juplow.q, but that's not a runtime degrade; both are infinite.

All.q and short.q seem to be the primary items that hop. And they hop into less restrictive queues.

tatarsky commented 7 years ago

Actually, I can't find a long.q job hopping to either, so I'm suspecting this is a "feature".

tatarsky commented 7 years ago

As an experiment, I am moving juphigh.q to have a runtime four hours longer than week.q's, to see if I see such animals hop again. I am holding off on hub changes. I will also grep the source a bit for these concepts and see if I get lucky and find a comment on the topic.

tatarsky commented 7 years ago

So I've talked with a sysadmin friend at NIH with a long history with open source SGE and Univa (the commercial version of it). He has a theory, consistent with what we see in the logs, that this involves "PE" (parallel environments) and may indeed be a feature of the qmaster code. But he can't prove it, as he doesn't have open source SGE running in a similar config anymore.

He has a suggestion to try in which we make a "Jupyter Hub SMP PE" and assign it ONLY to those jup* queues. And change the qsub of the spawner to use it instead of the "smp" PE.

Right now all queues share the same "allow" list of PE environments (a default of smp mpi make)

That would require a hub restart to prove/deny, so advise if that's doable.
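
A rough sketch of that suggestion with qconf (the PE name jup_smp is illustrative):

```bash
# Start from the existing smp PE definition, saved to a file for editing
qconf -sp smp > /tmp/jup_smp.pe
# Edit pe_name (and slots, if desired) in /tmp/jup_smp.pe, then add it
qconf -Ap /tmp/jup_smp.pe

# Attach the new PE only to the hub queues
qconf -aattr queue pe_list jup_smp juplow.q
qconf -aattr queue pe_list jup_smp juphigh.q

# The hub spawner's qsub would then use "-pe jup_smp 1" instead of "-pe smp 1".
```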

tatarsky commented 6 years ago

The item I mentioned above was configured on the new fl-hn2 hub and queues to see if it solves the matter.

tatarsky commented 6 years ago

Not seen with mods. One hub restart still needed to activate the "opt" protection.