xenon-middleware / xenon-cli

Perform file and job operations with the Xenon library from the command line
http://nlesc.github.io/Xenon/
Apache License 2.0

Add option to set memory requirements per job #58

Closed: arnikz closed this issue 6 years ago

arnikz commented 6 years ago

In addition to --max-run-time, one would like to set memory requirements (MB):

SGE: -l h_vmem
SLURM: --mem
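To make the request concrete, here is a minimal sketch of how a per-job memory value in MB could be translated into the two scheduler options named above. The function name `memory_flags` and the scheduler keys `"sge"` and `"slurm"` are hypothetical and are not part of Xenon or xenon-cli. It assumes the SGE `M` suffix for megabytes and relies on SLURM reading a bare `--mem` number as megabytes.

```python
from typing import List


def memory_flags(scheduler: str, memory_mb: int) -> List[str]:
    """Return scheduler arguments that request memory_mb megabytes.

    Hypothetical helper for illustration only; not Xenon's implementation.
    """
    if memory_mb <= 0:
        raise ValueError("memory_mb must be positive")
    if scheduler == "sge":
        # SGE resource limits go through -l; the M suffix means megabytes.
        return ["-l", f"h_vmem={memory_mb}M"]
    if scheduler == "slurm":
        # SLURM interprets a bare number passed to --mem as megabytes.
        return [f"--mem={memory_mb}"]
    raise ValueError(f"unsupported scheduler: {scheduler}")


if __name__ == "__main__":
    print(memory_flags("sge", 4096))
    print(memory_flags("slurm", 4096))
```

Returning the arguments as a list rather than a single string keeps them safe to append to a `qsub` or `sbatch` command line without extra quoting.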

sverhoeven commented 6 years ago

Must be implemented in Xenon first, see https://github.com/NLeSC/Xenon/issues/562

arnikz commented 6 years ago

One of my jobs keeps failing on SGE due to memory requirements. From the log, it seems that Xenon passes the mem_free parameter instead of h_vmem (as suggested above). The job used about 10 GB (maxvmem) but was cancelled after ~3 h on a node with 32 GB of free memory. Why?

...
qsub_time    Mon Mar  5 16:27:04 2018
start_time   Mon Mar  5 17:43:35 2018
end_time     Mon Mar  5 20:28:45 2018
granted_pe   threaded            
slots        1                   
failed       37  : qmaster enforced h_rt, h_cpu, or h_vmem limit
exit_status  137                  (Killed)
ru_wallclock 9910s
ru_utime     0.081s
ru_stime     0.121s
ru_maxrss    2.246KB
ru_ixrss     0.000B
ru_ismrss    0.000B
ru_idrss     0.000B
ru_isrss     0.000B
ru_minflt    52580               
ru_majflt    0                   
ru_nswap     0                   
ru_inblock   8                   
ru_oublock   176                 
ru_msgsnd    0                   
ru_msgrcv    0                   
ru_nsignals  0                   
ru_nvcsw     680                 
ru_nivcsw    88                  
cpu          15831.750s
mem          32.009KGBs
io           875.730GB
iow          0.000s
maxvmem      10.009GB
arid         undefined
ar_sub_time  undefined
category     -l h_rt=0,mem_free=32768M -pe threaded 1 -P compgen

Oddly, another call to the accounting system shows a different category line for the same job: -l h_rt=172800...
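When comparing accounting records like the two mentioned here, it can help to pull the requested resources out of the category line programmatically. The following is a rough sketch that assumes the accounting output has the whitespace-separated "name value" layout shown in the log above. The function names `parse_accounting` and `requested_resources` are hypothetical, and `SAMPLE` contains only lines copied from that log.

```python
SAMPLE = """\
failed       37  : qmaster enforced h_rt, h_cpu, or h_vmem limit
exit_status  137                  (Killed)
ru_wallclock 9910s
maxvmem      10.009GB
category     -l h_rt=0,mem_free=32768M -pe threaded 1 -P compgen
"""


def parse_accounting(text: str) -> dict:
    """Split each 'name value' line into a dict entry; skip lines without a value."""
    fields = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            fields[parts[0]] = parts[1].strip()
    return fields


def requested_resources(category: str) -> dict:
    """Collect the name=value pairs that follow each -l option in a category string."""
    tokens = category.split()
    resources = {}
    for i, token in enumerate(tokens[:-1]):
        if token == "-l":
            for item in tokens[i + 1].split(","):
                name, _, value = item.partition("=")
                resources[name] = value
    return resources


if __name__ == "__main__":
    record = parse_accounting(SAMPLE)
    print(record["maxvmem"])
    print(requested_resources(record["category"]))
```

Running this on both accounting records for the same job would show side by side which limits (h_rt, mem_free, h_vmem) each category line actually carries.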

arnikz commented 6 years ago

After discussion with @jmaassen, we found that both mem_free and h_vmem must be set to the same value.
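As an illustration of that conclusion, the sketch below builds an SGE `-l` resource request in which mem_free and h_vmem carry the same value. It is not Xenon's actual code: the function name `sge_resource_request` and its parameters are hypothetical, and it assumes memory is given in MB and the run time in seconds.

```python
from typing import List, Optional


def sge_resource_request(memory_mb: int,
                         run_time_seconds: Optional[int] = None) -> List[str]:
    """Build a '-l' argument pair with mem_free and h_vmem set to the same value.

    Hypothetical helper for illustration only; not Xenon's implementation.
    """
    if memory_mb <= 0:
        raise ValueError("memory_mb must be positive")
    resources = []
    if run_time_seconds is not None:
        # h_rt is the hard wall-clock limit, given here in seconds.
        resources.append(f"h_rt={run_time_seconds}")
    # mem_free asks for a node with enough free memory; h_vmem is the hard limit.
    resources.append(f"mem_free={memory_mb}M")
    resources.append(f"h_vmem={memory_mb}M")
    return ["-l", ",".join(resources)]


if __name__ == "__main__":
    print(sge_resource_request(32768, run_time_seconds=172800))
```

Deriving both entries from the single `memory_mb` argument means the two values cannot drift apart.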

arnikz commented 6 years ago

Fixed now in Xenon v2.6