Closed deschsimon closed 5 years ago
Sorry, no expertise. If you want to troubleshoot this yourself, you could try excluding flameo from running in parallel. If you look at the version of fsl_sub distributed here, it excludes any command that includes _gpu. You could add a similar exclusion for flameo. It is possible that FSL 6.0 includes an updated fsl_sub that has other features not in the version distributed here. Alternatively, it is possible that the new version of flameo includes modifications (e.g. OpenMP) that would make you want to exclude it from parallel computations.
Regardless, for a high-end system like the one you are using, you may want to consider investing the time in a conventional approach to using FSL in parallel, e.g. SGE or SLURM. I use the fsl_sub I distribute here on my laptop to test FSL scripts, but use SLURM for the heavy lifting on the campus supercomputers.
if [ "$numCores" -gt 1 ] ; then # disable parallel processing for flameo
    line=$(sed -n '1p' "$taskfile") # first command in the task file
    key="flameo"
    # stripping everything up to $key changes $line only if $key occurs in it
    if [ "${line#*$key}" != "$line" ] ; then
        numCores=1
        echo "Only running single thread: command includes $key" >&2
    fi
fi
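To try the substring check outside of fsl_sub, here is a minimal, self-contained sketch. The helper name `cores_for_taskfile` and the task file contents are made up for illustration; only the first-line test on `flameo` comes from the snippet above.

```shell
#!/bin/sh
# Hypothetical helper (not part of fsl_sub): print the core count to use.
# $1 = task file, $2 = requested cores, $3 = command name to run serially.
cores_for_taskfile() {
    taskfile=$1
    numCores=$2
    key=$3
    # Only the first line is inspected, as in the snippet above.
    line=$(sed -n '1p' "$taskfile")
    # The prefix strip changes $line only when $key occurs in it.
    if [ "$numCores" -gt 1 ] && [ "${line#*"$key"}" != "$line" ]; then
        numCores=1
    fi
    echo "$numCores"
}

# Throwaway task files; the commands and arguments are placeholders.
tmpdir=$(mktemp -d)
printf 'flameo ARGS\n' > "$tmpdir/flame_tasks"
printf 'some_other_tool ARGS\n' > "$tmpdir/other_tasks"

flame_cores=$(cores_for_taskfile "$tmpdir/flame_tasks" 8 flameo)
other_cores=$(cores_for_taskfile "$tmpdir/other_tasks" 8 flameo)
echo "flameo task file: $flame_cores core(s); other task file: $other_cores core(s)"

rm -rf "$tmpdir"
```

Because only the first line is read, a task file that mixes flameo with other commands is serialized only when flameo comes first. That matches the snippet above but may not be what you want in every case.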
Thanks for the quick reply!
You're right, in the long run we'll definitely go for a parallel engine! We've just upgraded the hardware and are setting up the system now, so I thought I could provide an easy interim solution until I find the time to set up the engine.
@deschsimon - I agree that SGE/SLURM would be ideal for your new system. However, for other users it would be great if you could troubleshoot this and tell us whether this is an incompatibility between my fsl_sub and FSL 6.0 or simply an issue of running too many copies of flameo on your computer (e.g. exhausting RAM). You could test this by including the line FSLPARALLEL=4 in your shell startup script (or your fsl.sh/fsl.csh). This setting limits my fsl_sub to 4 threads, rather than all the ones available on your computer. If it works on your computer, it suggests that my fsl_sub is compatible with FSL 6.0, but you need to make sure you do not run too many jobs concurrently.
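For a Bourne-style shell, the setting could look like the sketch below. The `export` is an assumption on my part: a plain assignment stays local to the shell, so the variable has to be exported for a child process such as fsl_sub to see it.

```shell
# Add to ~/.bashrc, ~/.profile, or fsl.sh.
# 4 is the value suggested above; pick whatever your RAM allows.
FSLPARALLEL=4
export FSLPARALLEL

# Quick check that the variable is set in the current shell.
echo "FSLPARALLEL is set to $FSLPARALLEL"
```

In csh/tcsh (e.g. fsl.csh) the equivalent would be `setenv FSLPARALLEL 4`. Open a new terminal, or source the file again, before re-running feat.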
Hi!
I have successfully used your script on my MacBook Pro and I like it!
I have now tried to use it on a Linux workstation (Linux Mint 19). On this workstation I have two versions of FSL installed:
/usr/share/fsl/5.0 (FSL version 5.0.11)
/usr/share/fsl/6.0 (FSL version 6.0)
In both I've replaced fsl_sub with your script. Running the example you provide works fine and shows a substantial decrease in duration, as expected.
However, if I run a feat analysis (which uses FLAME and thus calls fsl_sub) using FSL 6.0, feat gets stuck when performing flameo. htop shows many flameo processes running in parallel. These processes keep running even when I kill feat. All CPUs run at 100%. The only way I can stop this is to kill all flameo processes of the respective user. This does not happen using FSL 5.0.11.
Any ideas on this are highly appreciated! Thank you!
System information: