cBio / cbio-cluster

MSKCC cBio cluster documentation

Active/Total process count low #412

Closed jchodera closed 8 years ago

jchodera commented 8 years ago

This may be more normal than it seems, but I am noticing that showstats shows an unusually low fraction of thread-slots in use despite a large queue backlog:

Current Active/Total Procs:      1239/3380   (36.657%)

Could this indicate that many jobs are demanding a very large amount of memory and therefore can't get scheduled?

Here's the full showstats output:

[chodera@mskcc-ln1 ~]$ showstats

moab active for   10:12:45:44  stats initialized on Tue Mar  8 12:16:45 2016

Eligible/Idle Jobs:              2252/2252   (100.000%)
Active Jobs:                      438
Successful/Completed Jobs:     312517/312517 (100.000%)
Avg/Max QTime (Hours):           9.25/351.68
Avg/Max XFactor:                 0.16/704.36

Dedicated/Total ProcHours:      1.69M/5.07M  (33.371%)

Current Active/Total Procs:      1239/3380   (36.657%)

Avg WallClock Accuracy:          15.300%
Avg Job Proc Efficiency:         66.884%
Est/Avg Backlog:                12:42:52/1:12:45:11 

We supposedly have something like 2141 idle threadslots.
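
For anyone who wants to reproduce that number, here is a minimal sketch that pulls the active and total counts out of the "Current Active/Total Procs" line and subtracts them. The function name `parse_procs` is hypothetical, and the regular expression assumes the line is formatted exactly as in the output pasted above; other Moab versions may print it differently.

```python
import re


def parse_procs(showstats_text):
    """Return (active, total, idle) from the 'Current Active/Total Procs' line.

    Hypothetical helper: assumes the line layout shown in the pasted output.
    """
    match = re.search(r"Current Active/Total Procs:\s+(\d+)/(\d+)", showstats_text)
    if match is None:
        raise ValueError("no 'Current Active/Total Procs' line found")
    active, total = int(match.group(1)), int(match.group(2))
    # Idle slots are simply the slots not currently in use.
    return active, total, total - active


sample = "Current Active/Total Procs:      1239/3380   (36.657%)"
active, total, idle = parse_procs(sample)
print(f"{idle} idle of {total} ({100 * active / total:.3f}% active)")
# prints: 2141 idle of 3380 (36.657% active)
```

Note that this only counts slots Moab reports as not active; it cannot tell whether those slots are actually available for scheduling.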

tatarsky commented 8 years ago

The idle threadslots above belong to the Fuch/Sbio nodes, which are being added following approval to do so.

mdiag -n shows per-node stats.

jchodera commented 8 years ago

Thanks! Hadn't realized this was the case until @juanperin mentioned this in person.