cookpa closed 2 years ago
Tagging @jeffrey-phillips and @jeffduda who are also looking into this
I'm working on updating the docker image and all the python dependencies, including nipype. If that doesn't fix this we should look into messing around with DISPLAY
@cookpa could you try this with pennbbl/qsiprep:unstable? Lots of updates, maybe this will work now.
@mattcieslak unstable gives me an error. This is on data prepped with 0.14.3 - is unstable compatible with that?
labelconvert: uncompressing image "/tmp/qsirecon_wf/sub-999999_sample_recon/recon_wf/_dwi_file_..data..preprocess..sub-999999..ses-MR1..dwi..sub-999999_ses-MR1_acq-98dir_space-T1w_desc-preproc_dwi.nii.gz/get_atlases/schaefer100x7MNI_lps_to_dwi.nii.gz"... [==================================================]
labelconvert: Verifying parcellation image... [==================================================]
labelconvert: uncompressing image "/tmp/qsirecon_wf/sub-999999_sample_recon/recon_wf/_dwi_file_..data..preprocess..sub-999999..ses-MR1..dwi..sub-999999_ses-MR1_acq-98dir_space-T1w_desc-preproc_dwi.nii.gz/get_atlases/schaefer100x7MNI_lps_to_dwi.nii.gz"... [==================================================]
QSIPrep failed: Traceback (most recent call last):
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/plugins/multiproc.py", line 67, in run_node
result["result"] = node.run(updatehash=updatehash)
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 521, in run
result = self._run_interface(execute=True)
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 639, in _run_interface
return self._run_command(execute)
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 750, in _run_command
raise NodeExecutionError(
nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node create_src.
RuntimeError: subprocess exited with code 127.
Traceback (most recent call last):
File "/usr/local/miniconda/bin/qsiprep", line 8, in <module>
sys.exit(main())
File "/usr/local/miniconda/lib/python3.8/site-packages/qsiprep/cli/run.py", line 647, in main
qsiprep_wf.run(**plugin_settings)
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/workflows.py", line 638, in run
runner.run(execgraph, updatehash=updatehash, config=self.config)
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/plugins/base.py", line 166, in run
self._clean_queue(jobid, graph, result=result)
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/plugins/base.py", line 244, in _clean_queue
raise RuntimeError("".join(result["traceback"]))
RuntimeError: Traceback (most recent call last):
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/plugins/multiproc.py", line 67, in run_node
result["result"] = node.run(updatehash=updatehash)
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 521, in run
result = self._run_interface(execute=True)
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 639, in _run_interface
return self._run_command(execute)
File "/usr/local/miniconda/lib/python3.8/site-packages/nipype/pipeline/engine/nodes.py", line 750, in _run_command
raise NodeExecutionError(
nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node create_src.
RuntimeError: subprocess exited with code 127.
Since the xvfb part seems ok now I'm going to close this issue
The libQT fix made the recon workflow run again, and now the undead xvfb process is back.
Here's an interesting twist. If I replace "singularity run [options] qsiprep.sif [qsiprep args]" with "singularity exec [options] xvfb-run qsiprep [qsiprep args]", the Xvfb process still outlives the call to singularity, but does not block job termination.
I also looked at nipype a little bit. Perhaps calling this function before exiting would fix the issue?
I'm starting to suspect that the xvfb stuff is a result of calling the rendering code within a nipype SimpleInterface (which is python) and not through a CommandLine interface. For the CommandLine interfaces, nipype should start an xvfb for the run and kill it after the run completes.
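To make the "start it for the run, kill it after the run completes" idea concrete, here is a minimal sketch of that pattern in plain Python. This is not nipype's implementation; `run_and_reap` is a hypothetical helper name, and it assumes a POSIX system where the command (for example an `xvfb-run ...` call) keeps its helpers in the same process group.

```python
import os
import signal
import subprocess


def run_and_reap(cmd):
    """Run cmd in its own session, then terminate anything left in its process group.

    Hypothetical helper for illustration only; not part of nipype or qsiprep.
    """
    # start_new_session=True makes the child a session and process-group leader,
    # so its process-group ID equals its PID.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait()
    finally:
        try:
            # Send SIGTERM to every process still in the child's group,
            # e.g. a display server the command started and did not stop.
            os.killpg(proc.pid, signal.SIGTERM)
        except ProcessLookupError:
            pass  # nothing left in the group


# Example (command line is an assumption, adjust to your setup):
# run_and_reap(["xvfb-run", "qsiprep", "..."])
```

A process that moves itself into a new session or process group would escape this cleanup, so this only covers helpers that stay in the group of the command that launched them.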
So I'm making a stand-alone plotting program that nipype will run under xvfb-run. I'm also going to try to reduce the amount of memory this requires, because it's way too much right now.
We've been having an issue where qsirecon jobs run indefinitely, even after qsiprep exits. I ran ps after calling singularity run, and it seems that an xvfb process persists after singularity exits. Example:
Xvfb :1324577801 -screen 0 800x680x24 -nolisten tcp
This happens after running DSI-Studio --recon_only and recon spec
The qsiprep version is 0.14.3 and the system is a Linux HPC (@mattcieslak it's PMACS, happy to share more details and a runnable example if it would help).
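In case it helps others reproduce the check, here is a rough sketch of how the ps step could be scripted at the end of a job. The function names are made up for illustration, and it assumes a ps that accepts `-eo pid,args` (as procps on Linux does).

```python
import subprocess


def find_xvfb_processes(ps_output):
    """Return (pid, command line) pairs for Xvfb processes in `ps -eo pid,args` output."""
    matches = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # Skip the header line and anything that is not "<pid> <command line>".
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, args = int(parts[0]), parts[1]
        # Compare the executable name only, so "grep Xvfb" is not counted.
        exe = args.split()[0].rsplit("/", 1)[-1]
        if exe == "Xvfb":
            matches.append((pid, args))
    return matches


def list_xvfb():
    """Query the running system for Xvfb processes."""
    out = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    ).stdout
    return find_xvfb_processes(out)
```

Calling `list_xvfb()` right after the singularity call returns should show whether an Xvfb like the one above is still around.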
Others investigating this have found that this only happens with bsub jobs, and doesn't happen if the job is run interactively. Therefore I'm thinking something to do with DISPLAY might avoid the issue.
Possibly related to https://github.com/PennLINC/qsiprep/issues/195
Possibly also related: it seems others have had trouble keeping track of xvfb processes, see https://github.com/nipy/nipype/issues/1403