PennLINC / babs

BIDS App Bootstrap (BABS)
https://pennlinc-babs.readthedocs.io
MIT License

Make parsing of output more robust #160

Open yarikoptic opened 10 months ago

yarikoptic commented 10 months ago

ATM, whenever something goes wrong in a SLURM command, babs crashes with something like

    df_all_job_status = request_all_job_status(self.type_system)
  File "/home/asmacdo/devel/babs/babs/utils.py", line 1896, in request_all_job_status
    return _request_all_job_status_slurm()
  File "/home/asmacdo/devel/babs/babs/utils.py", line 1937, in _request_all_job_status_slurm
    squeue_out_df = _parsing_squeue_out(std)
  File "/home/asmacdo/devel/babs/babs/utils.py", line 1977, in _parsing_squeue_out
    raise Exception("error in the `squeue` output,"
Exception: error in the `squeue` output, expected jobid and got squeue:

which leaves you guessing what is happening. In this case it was because the original user did not exist inside the SLURM podman container:

[root@slurmctl /]# squeue -u blah
squeue: error: Invalid user: blah

             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
[root@slurmctl /]# squeue -u blah 2>/dev/null
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
[root@slurmctl /]# echo $?
0

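so the error message goes to stderr, not to stdout, and stdout is what we want to parse. Note that `squeue` still exits with 0, so the exit code alone does not reveal the problem. I think it would make more sense to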

@asmacdo might want to prep a quick PR
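
For illustration only, here is a minimal sketch of what more robust handling could look like. It is not the actual babs code: `SqueueError`, `parse_squeue_stdout` and `request_job_rows` are hypothetical names, and treating any stderr output as a failure is an assumption that may be too strict for some clusters. The idea is to capture stdout and stderr separately, and to include both in the exception so the user does not have to guess.

```python
import subprocess


class SqueueError(RuntimeError):
    """Raised when `squeue` output cannot be trusted (hypothetical name)."""


def parse_squeue_stdout(stdout, stderr="", returncode=0):
    """Return the data rows of `squeue` stdout as lists of fields.

    Raises SqueueError carrying the full context if anything looks off.
    """
    context = f"returncode={returncode}\nstdout:\n{stdout}\nstderr:\n{stderr}"
    if returncode != 0:
        raise SqueueError(f"`squeue` exited with an error.\n{context}")
    if stderr.strip():
        # As seen above, squeue can exit with 0 and still complain on stderr.
        raise SqueueError(f"`squeue` wrote to stderr.\n{context}")
    lines = [line for line in stdout.splitlines() if line.strip()]
    if not lines or lines[0].split()[0] != "JOBID":
        raise SqueueError(f"unexpected `squeue` header.\n{context}")
    # Naive whitespace split; assumes no field contains spaces.
    return [line.split() for line in lines[1:]]


def request_job_rows(user):
    """Run `squeue -u USER`, keeping stdout and stderr separate."""
    proc = subprocess.run(
        ["squeue", "-u", user],
        capture_output=True,  # stdout and stderr end up in separate attributes
        text=True,
    )
    return parse_squeue_stdout(proc.stdout, proc.stderr, proc.returncode)
```

With the `Invalid user: blah` case above, this would raise an error whose message contains the stderr text, instead of the bare "expected jobid and got squeue:".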