PennLINC / babs

BIDS App Bootstrap (BABS)
https://pennlinc-babs.readthedocs.io
MIT License

[FIX] resubmitting jobs that are in queue: use `scancel` to delete the job in queue on Slurm clusters #133

Closed zhao-cy closed 1 year ago

zhao-cy commented 1 year ago

Currently, BABS always uses `qdel` to delete a job, but `qdel` is an SGE command. On Slurm clusters, the command should be `scancel` instead. Otherwise, `babs-status --resubmit` (or `--resubmit-job`) raises an error when it tries to resubmit a pending job.
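As a rough illustration of the fix described above, here is a minimal sketch of picking the cancel command from the cluster system type. This is not BABS's actual code: `CANCEL_COMMANDS`, `build_cancel_command`, and `cancel_job` are hypothetical names, and the `"sge"` / `"slurm"` type strings are assumptions. Only `qdel <job_id>` and `scancel <job_id>` themselves come from the schedulers.

```python
import subprocess

# Hypothetical mapping from scheduler type to its job-cancel command.
CANCEL_COMMANDS = {
    "sge": "qdel",       # SGE: qdel <job_id>
    "slurm": "scancel",  # Slurm: scancel <job_id>
}


def build_cancel_command(system_type, job_id):
    """Return the argv list that cancels `job_id` on the given scheduler."""
    key = system_type.lower()
    if key not in CANCEL_COMMANDS:
        raise ValueError(f"Unsupported cluster system type: {system_type!r}")
    return [CANCEL_COMMANDS[key], str(job_id)]


def cancel_job(system_type, job_id):
    """Run the cancel command; raises CalledProcessError on a non-zero exit."""
    cmd = build_cancel_command(system_type, job_id)
    return subprocess.run(cmd, check=True, capture_output=True, text=True)
```

With this shape, resubmitting a pending job would call something like `cancel_job("slurm", job_id)` before submitting the new job, so the SGE-only `qdel` is never run on a Slurm cluster.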

zhao-cy commented 1 year ago

Hi @djarecka @yibeichan ! On the MIT or Princeton Slurm clusters, when you want to cancel a pending job, does this command work: `scancel <job_id>`?

I know it works on the UMN MSI Slurm cluster; I just want to confirm that it also works on other Slurm clusters.

Thanks, Chenying

yibeichan commented 1 year ago

Hi, on the Princeton Slurm cluster, `scancel <job_id>` works. After `scancel`, `babs-status` shows that the job failed. I'll test it on the OpenMind cluster soon.

yibeichan commented 1 year ago

Hi, `scancel <job_id>` works on the OpenMind cluster too! All good.

zhao-cy commented 1 year ago

Perfect, thank you so much!