Open HenrikBengtsson opened 1 month ago
Scanning 8,115 jobs currently on the queue using:
for job in "${jobs[@]}"; do
  job_args=$(qstat -j "${job}" | grep -E "^job_args" | sed -E 's/job_args:[[:blank:]]+//')
  if [[ -n ${job_args} ]]; then
    printf "%s: %s\n" "${job}" "${job_args}"
  fi
done
reveals only a few such mistakes, e.g.
...
2162260: h_rt=00:30:00
...
3613495: MSN,-l,h_rt=50:00:00
...
3624537: l,h_rt=00:30:00,mem_free=2G,gpu_mem=1G
...
3624546: l,h_rt=00:30:00,mem_free=2G,gpu_mem=1G
3624547: -l,h_rt=00:30:00,mem_free=2G,gpu_mem=1G
3624548: -l,h_rt=00:30:00,mem_free=2G,gpu_mem=1G
3624555: -l,h_rt=00:30:00,mem_free=2G,gpu_mem=1G
...
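To pick out only the suspicious entries from a scan like the one above, a small filter can help. This is a minimal sketch: `flag_misplaced_resources` is a hypothetical helper name, and the pattern only knows the three resource names that appear in the listing (`h_rt`, `mem_free`, `gpu_mem`), so it is a heuristic rather than a complete check.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads "<job id>: <job_args>" lines on stdin and keeps
# those whose job_args contain something that looks like an SGE resource
# request. Only h_rt, mem_free, and gpu_mem are recognized here (assumption:
# extend the alternation for other resources used at your site).
flag_misplaced_resources() {
  grep -E '(^|[ ,])(h_rt|mem_free|gpu_mem)='
}

# Demo with two lines taken from the listing above.
printf '%s\n' \
  '2162260: h_rt=00:30:00' \
  '3624547: -l,h_rt=00:30:00,mem_free=2G,gpu_mem=1G' \
  | flag_misplaced_resources
```

The output of the scanning loop can be piped into `flag_misplaced_resources` in the same way.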
If qstat -j <job> shows a job_args entry that contains what look like resource requests, as in the listing above, it suggests that an incorrect qsub call was made.

A reproducible example (from 2024-10-09 Slack thread):

I'm trying to reproduce this, and what I think happened is that you specified -l ... after the job script, e.g.
qsub script.sh -l ...
but you need to specify it before the job script, i.e.
qsub -l ... script.sh
The reason is that SGE/qsub stops parsing command-line options as soon as it reaches the job script argument (script.sh). Anything that follows is recorded as job_args and passed to the job script as-is, i.e. your script runs as if you had called it manually as:
script.sh -l ...
That is why -l ... is not used by SGE.
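The effect of the argument order can be sketched without a scheduler. In the sketch below, `script_sh` is a hypothetical stand-in for script.sh that only reports what it receives, and the value `h_rt=00:30:00` is borrowed from the listing earlier in this issue. The qsub calls appear only in comments; the function calls mimic what the job script would see in each case.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for script.sh: prints the arguments it was given.
script_sh() {
  printf 'script.sh received %d argument(s):' "$#"
  if (( $# > 0 )); then
    printf ' [%s]' "$@"
  fi
  printf '\n'
}

# Wrong order:  qsub script.sh -l h_rt=00:30:00
# Everything after script.sh becomes job_args, so the script is run as:
script_sh -l h_rt=00:30:00

# Right order:  qsub -l h_rt=00:30:00 script.sh
# qsub consumes -l itself, so the script receives no arguments:
script_sh
```

In the first case the script sees two arguments, `-l` and `h_rt=00:30:00`, which is consistent with qstat reporting them joined by a comma in job_args.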