neurospin / pypreprocess

Preprocessing scripts for neuro imaging

Weird parallel call to newsegment #77

Open AlexandreAbraham opened 9 years ago

AlexandreAbraham commented 9 years ago

I am preprocessing data using n_jobs=32. I have noticed something strange.

First, among all the SPM processes, only two of them are actually running:

[image: pypreprocess1] https://cloud.githubusercontent.com/assets/1647301/6830577/00ea1e18-d31b-11e4-9e08-fff8e1e153e5.png

When I look at the exact call, I realize that it is actually the same script that is launched 32 times:

[image: pypreprocess2] https://cloud.githubusercontent.com/assets/1647301/6830590/19cb1d88-d31b-11e4-88bf-f27c010fb2d9.png

Is that normal?

dohmatob commented 9 years ago

No, this is weird and looks like a bug. But... if there are 32 processes and only 2 are running, are you sure the others are not zombies, etc.? BTW, only the _do_subjects_dartel function executes the NewSegment node, and that function is itself invoked by do_subjects_preproc. So it's not quite possible for all those processes to have been spawned by one and the same call to do_subjects_preproc. Could you please copy-paste the code block which invokes do_subjects_preproc(n_jobs=32, ...)?
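One quick way to check what state those 32 processes are actually in is sketched below. It uses psutil, which is not part of pypreprocess; it simply lists every process whose command line mentions spm or matlab together with its status (running, sleeping, zombie, ...):

import psutil

# list every spm/matlab-looking process with its pid and status
for proc in psutil.process_iter(["pid", "status", "cmdline"]):
    cmdline = " ".join(proc.info["cmdline"] or [])
    if "spm" in cmdline.lower() or "matlab" in cmdline.lower():
        print(proc.info["pid"], proc.info["status"], cmdline[:100])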


AlexandreAbraham commented 9 years ago

Here are the requested files! Yes, I use dartel and n_jobs=32. BTW, despite this weird behavior, newsegment ran fine and I am now at the dartel step.

What I was thinking is that maybe these scripts were run by mistake (because of n_jobs) and that they were blocked on I/O somewhere.

preproc.py:

# standard imports
import sys
import os

# import API for preprocessing business
from pypreprocess.nipype_preproc_spm_utils import do_subjects_preproc

# file containing configuration for preprocessing the data
jobfile = os.path.join("preproc.ini")

# preprocess the data
results = do_subjects_preproc(jobfile, dataset_dir='.')

preproc.ini

######################################################################################
#
# pypreprocess configuration.
#
# Copy this file to the acquisition directory containing the data you wish to
# preprocess. Then, manually edit the values to customize the pipeline to suit your
# needs.
#
# Disable a preprocessing step by setting 'disable = True' under the corresponding
# section, or simply comment the section altogether.
#
# IMPORTANT NOTES
# ===============
# - indexing begins from 1 (matlab style)
# - you can explicitly specify the software to be used for a specific stage of the
#   preprocessing by accordingly setting the 'software' field under the
#   corresponding section (e.g. like so: software = spm)
# - A value of 'auto', 'unspecified', 'none', etc. for a parameter means it should
#   be specified or inferred at run-time
#
# Authored by DOHMATOB Elvis Dopgima <gmdopp@gmail.com> <elvis.dohmatob@inria.fr>
#
######################################################################################

[config]  # DON'T TOUCH THIS LINE !

##########
# INPUT
##########

# Path (relative or full) of directory containing data (if different from directory
# containing this configuration file).
dataset_dir = /storage/data/COBRE

# Brief description of dataset (you can use html formatting)
dataset_description = """
        <p>
          <a href="http://cobre.mrn.org/">The Center for Biomedical Research Excellence (COBRE)</a> is contributing raw anatomical and functional MR data from 72 patients with Schizophrenia and 75 healthy controls (ages ranging from 18 to 65 in each group). All subjects were screened and excluded if they had: history of neurological disorder, history of mental retardation, history of severe head trauma with more than 5 minutes loss of consciousness, history of substance abuse or dependence within the last 12 months. Diagnostic information was collected using the Structured Clinical Interview used for DSM Disorders (SCID).
        </p>
        <p>
          A multi-echo MPRAGE (MEMPR) sequence was used with the following parameters: TR/TE/TI = 2530/[1.64, 3.5, 5.36, 7.22, 9.08]/900 ms, flip angle = 7°, FOV = 256x256 mm, Slab thickness = 176 mm, Matrix = 256x256x176, Voxel size =1x1x1 mm, Number of echos = 5, Pixel bandwidth =650 Hz, Total scan time = 6 min. With 5 echoes, the TR, TI and time to encode partitions for the MEMPR are similar to that of a conventional MPRAGE, resulting in similar GM/WM/CSF contrast.
        </p>
        <p>
          Rest data was collected with single-shot full k-space echo-planar imaging (EPI) with ramp sampling correction using the intercomissural line (AC-PC) as a reference (TR: 2 s, TE: 29 ms, matrix size: 64x64, 32 slices, voxel size: 3x3x4 mm3).
        </p>
        <p>Slice Acquisition Order:<br>
          Rest scan - collected in the Axial plane - series ascending - multi slice mode - interleaved<br>
          MPRAGE - collected in the Sag plane - series interleaved - multi slice mode - single shot<br>
        </p>
        <p>
          The following data are released for every participant:
        </p>
        <ul>
          <li>Resting fMRI</li>
          <li>Anatomical MRI</li>
          <li>Phenotypic data for every participant including: gender, age, handedness and diagnostic information. </li>
        </ul>
        <p>
"""

# The name of the dataset as will be shown in the report pages. Must be an integer
# or auto
dataset_id = auto

# The number of subjects to include; by default all subjects are included.
nsubjects = auto

# List of (or wildcard for) subject ids to be ignored / excluded; must be a
# space-separated list of subject ids.
exclude_these_subject_ids = None

# List of (or wildcard for) the only subjects to be included; must be a
# space-separated list of subject ids.
include_only_these_subject_ids = None

# Wildcard for, or space-separated list of, subject directories relative to the
# acquisition directory
subject_dirs = 0040*

# Path of session-wise functional images, relative to the subject data dir.
# Wildcards are allowed. Each session must be specified in the form
# session_<session number>_func = <path>.
session_1_func = session_1/rest_1/rest.nii.gz

# Path of T1 (anat) image relative to the subject data dir
anat = session_1/anat_1/mprage.nii.gz

# Should caching (nipype, joblib, etc.) be used to save ages of hard-earned computation ?
caching = True

# Number of jobs to be spawned altogether.
n_jobs = 32

# Should orientation meta-data be stripped off image headers ?
deleteorient = False

############################
# Slice-Timing Correction
############################

# Don't you want us to do Slice-Timing Correction (STC) ?
disable_slice_timing = False

# Repetition Time
TR = 2.

# Formula for Acquisition Time for single brain volume.
TA = TR * (1 - 1 / nslices)

# Can be ascending, descending, or an explicitly specified sequence.
slice_order = ascending

# Were the EPI slices interleaved ?
interleaved = True

# Reference slice (indexing begins from 1)
refslice = 1

# software to use for Slice-Timing Correction
slice_timing_software = spm

####################################
# Realignment (Motion Correction)
####################################

# Don't do realignment / motion correction ?
disable_realign = False

# Register all volumes to the mean thereof ?
register_to_mean = True

# Reslice volumes ? 
realign_reslice = False

# Software to use for realignment / motion correction. Can be spm or fsl
realign_software = spm

###################
# Coregistration
###################

# Don't you want us to do coregistration of T1 (anat) and fMRI (func) ?
disable_coregister = False

# During coregistration, do you want us to register func -> anat or anat -> func ?
coreg_func_to_anat = True

# Should we reslice files during coregistration ?
coregister_reslice = False

# Software to use for coregistration
coregister_software = spm

########################
# Tissue Segmentation
########################

# Don't you want us to segment the brain (into gray-matter, white matter, csf, etc.) ?
disable_segment = False

# Software to use for tissue segmentation.
segment_software = spm

# Use spm's NewSegment ?
newsegment = True

##################
# Normalization
##################

# Don't you want us to normalize each subject's brain onto a template (MNI
# for example) ?
disable_normalize = False

# Path to your template image.
template = "MNI"

# Voxel sizes of final func images
func_write_voxel_sizes = [3, 3, 4]

# Voxel sizes of final anat images
anat_write_voxel_size = [1, 1, 1]

# Use dartel for normalization ?
dartel = True

# Software to use for normalization.
normalize_software = spm

##############
# Smoothing
##############

# FWHM (in mm) of smoothing kernel.
fwhm = 0 # [5, 5, 5]

###########
# Output
###########

# Root directory (full path or relative to the directory containing this file) for
# all output files and reports
output_dir = ./pypreprocess_output

# Generate html reports ?
report = True

# Plot coefficient of variation post-preprocessing ?
plot_cv_tv = True

mrahim commented 8 years ago

I don't know why we closed this... I'm experiencing the same behavior when I run NewSegment in do_subjects_preproc. The reason is that all subjects' anats are concatenated as channel_files and processed in a single call. I'm wondering if we can parallelize NewSegment by taking one anat per channel_files.
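A minimal sketch of that idea, assuming nipype's SPM NewSegment interface and joblib (the anat paths below are placeholders, and this is not how pypreprocess currently wires things up):

from joblib import Parallel, delayed
from nipype.interfaces.spm import NewSegment

def segment_one_anat(anat_file):
    # one anat per channel_files, so each subject gets its own SPM job
    seg = NewSegment()
    seg.inputs.channel_files = anat_file
    return seg.run()

# placeholder anat paths; in practice these would be gathered per subject
anat_files = ["0040000/session_1/anat_1/mprage.nii",
              "0040001/session_1/anat_1/mprage.nii"]
results = Parallel(n_jobs=32)(
    delayed(segment_one_anat)(anat) for anat in anat_files)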