nipy / nipype

Workflows and interfaces for neuroimaging packages
https://nipype.readthedocs.org/en/latest/

SpecifySPMModel saves files of zero size #3303

Open JohannesWiesner opened 3 years ago

JohannesWiesner commented 3 years ago

Summary

This issue is possibly related to #3301. All problems discussed there were bypassed by tweaking the script so that it no longer produces the error (specifically, it turned out that SpecifySPMModel cannot handle bids_event_files). The script below works around this by using subject_info instead of bids_event_files. It now runs correctly on a Linux server via Singularity, but not with Docker + WSL2 on Windows 10 (the files are saved on the local hard drive, not inside the WSL2 distro, Ubuntu 18.04 in my case; maybe this causes the problem?).
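For context, this is roughly what the workaround looks like in isolation, a minimal sketch with placeholder condition names, onsets and durations (the real values come from the HCP EVs/ files in the full script below):

from nipype import Node
from nipype.interfaces.base import Bunch
from nipype.algorithms.modelgen import SpecifySPMModel

# build subject_info by hand instead of passing bids_event_files
subject_info = [Bunch(conditions=['cond_a', 'cond_b'],
                      onsets=[[10, 50], [30, 70]],
                      durations=[[2], [2]])]

model_spec = Node(SpecifySPMModel(input_units='secs',
                                  output_units='secs',
                                  time_repetition=0.720,
                                  high_pass_filter_cutoff=200),
                  name='model_specifier')
model_spec.inputs.subject_info = subject_info
# functional_runs would be connected from the preprocessing nodes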

Actual behavior

On my Windows PC: The script runs through 'successfully' without reporting any errors. However, when it reaches the SpecifySPMModel node, that node's folder contains files of zero size (0 KB). These zero-size files apparently still get passed on to the next node, EstimateModel, which also runs through without reporting any errors but outputs no beta images (see the diagnostic sketch after the screenshots).

On Linux: [screenshot: on_linux]

On Windows: [screenshot: on_windows]
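A quick way to confirm the symptom is to scan the workflow's working directory for zero-byte files. This is a minimal sketch, assuming the wf.base_dir used in the script below:

import os

# list every zero-byte file below the workflow's working directory;
# on the affected Windows setup this flags the SpecifySPMModel outputs
wf_dir = '/output/workflow_files/nback_first_level_analysis'
for root, _, files in os.walk(wf_dir):
    for f in files:
        path = os.path.join(root, f)
        if os.path.getsize(path) == 0:
            print(path)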

Script/Workflow details

This is the tweaked code that works on the Linux server but not on my personal Windows 10 machine:

#!/usr/bin/env python
# coding: utf-8

# First Level Analysis Workflow for one HCP functional MRI task

## Import necessary modules
import pandas as pd
from nipype.interfaces.utility import Rename
from nipype.interfaces.base import Bunch
from nipype.interfaces.io import SelectFiles, DataSink
from nipype import Workflow,Node,MapNode
from nipype.interfaces.utility import Function, IdentityInterface
from nipype.algorithms.misc import Gunzip
from os.path import join as opj

from nipype.interfaces.spm import Smooth,Level1Design, EstimateModel, EstimateContrast
from nipype.algorithms.modelgen import SpecifySPMModel

## Global specific settings that apply to all HCP first level analyses

# define a list of subjects
subject_list = ['952863']

# we need to concatenate both the LR and the RL phase encoding files, so we set up a list
# to iterate over the two functional images and their corresponding EVs/ folders where the task information is stored
phase_encodings = ['LR','RL']

# set up a smoothing kernel
smoothing_fwhm = [4,4,4]

## Task specific settings

# set task specific data directory
# NOTE: This has to end with a slash!
data_dir = '/data/'

# set up the names for all different event files
events_txt_list = ['0bk_faces.txt','0bk_places.txt','0bk_body.txt','0bk_tools.txt',
                   '2bk_faces.txt','2bk_places.txt','2bk_body.txt','2bk_tools.txt']

# provide a list of names for all different event files
events_names = ['0bk_faces','0bk_places','0bk_body','0bk_tools',
                '2bk_faces','2bk_places','2bk_body','2bk_tools']

# set a name for this specific workflow
workflow_name = 'nback_first_level_analysis'

# Data Preparation

## Iterate over each subject

# create a node that iterates over the subject(s)
subject_iterator = Node(IdentityInterface(fields=["subject_id"]),name="subject_iterator")
subject_iterator.iterables = [("subject_id",subject_list)]

## Select the LR and the RL files 

# create a MapNode in order to select either a LR or RL file for a subject
templates = {'func':'{subject_id}/MNINonLinear/Results/tfMRI_WM_{phase_encoding}/tfMRI_WM_{phase_encoding}.nii.gz'}

run_selector = MapNode(SelectFiles(templates,base_directory=data_dir),
                        iterfield=['phase_encoding'],
                        name='run_selector')

run_selector.inputs.phase_encoding = phase_encodings

## Unzip the LR and RL files

# create a node that unzips input files (this is necessary for SPM to run)
gunzipper = MapNode(Gunzip(),iterfield=['in_file'],name='gunzipper')

## Standardize the LR and the RL files
# we want to standardize each image individually
def standardize_img(in_file,subject_id,phase_encoding):

    from nilearn.image import clean_img
    import os

    in_file_standardized = clean_img(imgs=in_file,
                                     detrend=False,
                                     standardize=True)

    in_file_standardized.to_filename(f"sub-{subject_id}_tfMRI_WM_{phase_encoding}.nii")
    out_file = os.path.abspath(f"sub-{subject_id}_tfMRI_WM_{phase_encoding}.nii")

    return out_file

standardizer = MapNode(Function(input_names=['in_file','subject_id','phase_encoding'],
                                output_names=['out_file'],
                                function=standardize_img),
                       iterfield=['in_file','phase_encoding'],
                       name='standardizer')

standardizer.inputs.phase_encoding = phase_encodings

## Smooth the LR and RL files using SPM

# we want to smooth each image before passing it to the SPM model
smooth = MapNode(Smooth(),iterfield=['in_files'],name='smooth')
smooth.inputs.fwhm = smoothing_fwhm

## Create a Bunch object from the files in the EVs/ directory for the LR and RL files

# create a function that grabs the event files from a run and returns them as a single dataframe
# NOTE! We have to import Bunch and pandas again because nipype Function nodes run in closed environments
def get_events_bunch(data_dir,subject_id,phase_encoding,events_txt_list,events_names):

    import pandas as pd
    import os
    from nipype.interfaces.base import Bunch

    # create a path to the directory where the event files are stored, based on the input subject id
    subject_dir = f"{data_dir}{subject_id}/MNINonLinear/Results/tfMRI_WM_{phase_encoding}/EVs/"

    # initialize an empty list where the upcoming data frames will be stored
    events_dfs_list = []

    # read in each .txt file as a pandas data frame, add a column 'trial_type' as a description,
    # and append the df to a list of dfs
    for idx,_ in enumerate(events_txt_list):
        events_df = pd.read_table(subject_dir + events_txt_list[idx],names=['onset','duration','amplitude'])
        events_df['trial_type'] = events_names[idx]
        events_dfs_list.append(events_df)

    # concatenate all dfs
    events_df = pd.concat(events_dfs_list,axis=0)

    # the output of our function will be passed over to SpecifyModel
    # (https://nipype.readthedocs.io/en/1.6.0/api/generated/nipype.algorithms.modelgen.html#specifymodel)
    # This class requires the input to be a Bunch-Object so we have to convert the dataframe to Bunch-Type

    conditions = []
    onsets = []
    durations = []

    for name,group in events_df.groupby('trial_type'):
        conditions.append(name)
        onsets.append(group['onset'].tolist()) 
        durations.append(group['duration'].tolist())

    # the Bunch object can contain more fields such as regressors (e.g. motion regressors?) that can be used to specify the model
    events_bunch = Bunch(conditions=conditions,
                          onsets=onsets,
                          durations=durations,
                          #amplitudes=None,
                          #tmod=None,
                          #pmod=None,
                          #regressor_names=None,
                          #regressors=None
                         )

    return events_bunch

# define a MapNode that builds one events Bunch per phase encoding
events_bunch_getter = MapNode(Function(input_names=['data_dir','subject_id','phase_encoding','events_txt_list','events_names'],
                                       output_names=['events_bunch'],
                                       function=get_events_bunch),
                              iterfield=['phase_encoding'],
                              name='events_bunch_getter')

events_bunch_getter.inputs.data_dir = data_dir
events_bunch_getter.inputs.phase_encoding = phase_encodings
events_bunch_getter.inputs.events_txt_list = events_txt_list
events_bunch_getter.inputs.events_names = events_names

## Set up the First-Level-Model for SPM

# SpecifyModel - Generates SPM-specific Model
model_specifier = Node(SpecifySPMModel(concatenate_runs=True,
                                       input_units='secs',
                                       output_units='secs',
                                       time_repetition=0.720,
                                       high_pass_filter_cutoff=200),
                       name='model_specifier')

# Level1Design - Generates an SPM design matrix
first_level_design = Node(Level1Design(bases={'hrf':{'derivs': [1,0]}},
                                       timing_units='secs',
                                       interscan_interval=0.720,
                                       model_serial_correlations='FAST'),
                          name='first_level_design')

first_level_design.inputs.mask_image = '/home/neuro/nipype_tutorial/brainmask_fs.2.nii'

# EstimateModel - estimate the parameters of the model
first_level_estimator = Node(EstimateModel(estimation_method={'Classical': 1}),
                             name='first_level_estimator')

## Define a DataSink Node where outputs should be stored

# define a DataSink node where files should be stored
# NOTE: parameterization=False ensures that the files are saved directly
# in the datasink folder. Otherwise, another folder (e.g. "_subject_123") would be created for each subject.
datasink = Node(DataSink(base_directory='/output',
                         parameterization=False),
                name="datasink")

## Connect all nodes to a workflow
# define workflow
wf = Workflow(name=workflow_name)
wf.base_dir = '/output/workflow_files'

# pass the subject id over to all nodes that need it
wf.connect(subject_iterator,'subject_id',run_selector,'subject_id')
wf.connect(subject_iterator,'subject_id',standardizer,'subject_id')
wf.connect(subject_iterator,'subject_id',events_bunch_getter,'subject_id')
wf.connect(subject_iterator,'subject_id',datasink,'container')

# connect all nodes that deal with the functional images
wf.connect(run_selector,'func',gunzipper,'in_file')
wf.connect(gunzipper,'out_file',standardizer,'in_file')
wf.connect(standardizer,'out_file',smooth,'in_files')

# connect all nodes that deal with SPM
wf.connect(smooth,'smoothed_files',model_specifier,'functional_runs')
wf.connect(events_bunch_getter,'events_bunch',model_specifier,'subject_info')
wf.connect(model_specifier,'session_info',first_level_design,'session_info')
wf.connect(first_level_design,'spm_mat_file',first_level_estimator,'spm_mat_file')

# connect all nodes whose output should be stored to the datasink
wf.connect(first_level_design,'spm_mat_file',datasink,'nback.@spm')

### Visualize the workflow

# Create 1st-level analysis output graph
wf.write_graph(graph2use='flat', format='png', simple_form=True)

from IPython.display import Image
Image(filename='/output/workflow_files/nback_first_level_analysis/graph.png')

## Run the workflow 
wf_results = wf.run()
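As an optional sanity check after the run (not part of the original workflow), one could verify that EstimateModel actually wrote non-empty beta images; beta_*.nii is SPM's naming convention, and the search root assumes the wf.base_dir from above:

import glob
import os

# collect all beta images below the working directory and print their sizes
wf_dir = '/output/workflow_files/nback_first_level_analysis'
betas = glob.glob(os.path.join(wf_dir, '**', 'beta_*.nii'), recursive=True)
print(f'found {len(betas)} beta images')
for b in betas:
    print(b, os.path.getsize(b), 'bytes')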

Platform details:

Same setup as in #3301:

{'commit_hash': '%h',
 'commit_source': 'archive substitution',
 'networkx_version': '2.5',
 'nibabel_version': '3.2.0',
 'nipype_version': '1.6.0-dev',
 'numpy_version': '1.19.2',
 'pkg_path': '/opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/nipype',
 'scipy_version': '1.5.2',
 'sys_executable': '/opt/miniconda-latest/envs/neuro/bin/python',
 'sys_platform': 'linux',
 'sys_version': '3.7.8 | packaged by conda-forge | (default, Jul 31 2020, '
                '02:25:08) \n'
                '[GCC 7.5.0]',
 'traits_version': '6.1.1'}

Execution environment

Same setup as in #3301

Using Michael Notter's nipype_tutorial (most recent version, miykael/nipype_tutorial:2020) running as a Docker container on Windows 10 (+ WSL2, Ubuntu 18.04)

JohannesWiesner commented 3 years ago

Just for sanity-checking I deleted all cache-folders on both the Linux-Server and my Windows 10 PC and reran the script one more time. Interestingly, now I also got the beta-images on my PC (and they seem to be the same as on the server), but still with respect to the SpecifySPMModel node on the server I have files with proper size (1,5GB) on Windows still two files with 0KB