georgemarselis / openlava-macosx

Automatically exported from code.google.com/p/openlava-macosx
GNU General Public License v2.0

nodes closed after exclusive job not being freed #229

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Dear experts,

My nodes stay closed after I submit jobs to an exclusive queue, even once the jobs have finished.

[root@geomechanics fcanesin]# bjobs -u all
No unfinished job found
[root@geomechanics fcanesin]# bhosts
HOST_NAME          STATUS       JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV
geomechanics       closed          -      0      0      0      0      0      0

[root@geomechanics fcanesin]# bhosts -l n00
HOST  n00
STATUS           CPUF  JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV DISPATCH_WINDOW
closed_Excl      1.00     -     32      0      0      0      0      0      -

 CURRENT LOAD USED FOR SCHEDULING:
              r15s   r1m  r15m    ut    pg    io   ls    it   tmp   swp   mem
 Total         0.0   0.0   0.0    0%   0.0     0    0    64   62G    0M  125G
 Reserved      0.0   0.0   0.0    0%   0.0     0    0     0    0M    0M    0M

 LOAD THRESHOLD USED FOR SCHEDULING:
           r15s   r1m  r15m   ut      pg    io   ls    it    tmp    swp    mem
 loadSched   -     -     -     -       -     -    -     -     -      -      -
 loadStop    -     -     -     -       -     -    -     -     -      -      -

I have tried the following:
badmin hclose all
badmin hopen all
lsadmin ckconfig shows no errors.
Restarting the whole cluster does not help.
Restarting the daemons does not help either.
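
To make the symptom easier to check across many nodes, here is a minimal sketch that lists hosts reported as closed while running no jobs. It assumes the default bhosts column layout shown above (STATUS in column 2, NJOBS in column 5); the function name closed_idle_hosts is hypothetical and not part of openlava.

```shell
#!/bin/sh
# Hypothetical helper: read default `bhosts` output on stdin and print
# the names of hosts whose STATUS starts with "closed" and whose NJOBS is 0.
# Assumed columns: HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV
closed_idle_hosts() {
    # NR > 1 skips the header line.
    awk 'NR > 1 && $2 ~ /^closed/ && $5 == 0 { print $1 }'
}

# Only call bhosts if it is actually installed on this machine.
if command -v bhosts >/dev/null 2>&1; then
    bhosts | closed_idle_hosts
fi
```

Each host it prints can then be inspected with bhosts -l to see the detailed status, such as closed_Excl.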

Any ideas?

https://mail.google.com/mail/u/0/#inbox/13fbe9b3e36a081c

Original issue reported on code.google.com by geo...@marsel.is on 18 Jul 2013 at 6:54