naturalistic-data-analysis / naturalistic_data_analysis

A jupyter book for the OHBM educational workshop on analyzing naturalistic data.
http://naturalistic-data.org/

Issue on page /content/Preprocessing.html #4

Open yibeichan opened 3 years ago

yibeichan commented 3 years ago

Hi there,

We are experiencing an issue when trying to execute the preprocessing pipeline. First, there appears to be a typo when loading the file_list: file_list = [x for x in glob.glob(os.path.join(base_dir, '*/func/*preproc*gz')) if 'denoised' not in x]

Here, "denoised" should be "denoise", correct?

Second, when executing the code below:

import os
import glob
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from nltools.stats import regress, zscore
from nltools.data import Brain_Data, Design_Matrix
from nltools.stats import find_spikes 
from nltools.mask import expand_mask

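# make_motion_covariates expands the 6 realignment parameters into 24 motion
# covariates: the z-scored parameters, their squares, their temporal derivatives,
# and the squared derivatives (NaNs introduced by diff() are zero-filled).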
def make_motion_covariates(mc, tr):
    z_mc = zscore(mc)
    all_mc = pd.concat([z_mc, z_mc**2, z_mc.diff(), z_mc.diff()**2], axis=1)
    all_mc.fillna(value=0, inplace=True)
    return Design_Matrix(all_mc, sampling_freq=1/tr)

base_dir = '/srv/lab/fmri/tutorials/Sherlock/fmriprep'

fwhm = 6              # smoothing kernel FWHM in mm
tr = 1.5              # repetition time in seconds
outlier_cutoff = 3    # spike threshold in standard deviations

file_list = [x for x in glob.glob(os.path.join(base_dir, '*/func/*preproc*gz')) if 'denoise' not in x]
f = file_list[0]
sub = os.path.basename(f).split('_')[0]

data = Brain_Data(f)
smoothed = data.smooth(fwhm=fwhm)

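# Flag volumes whose global signal or frame-to-frame difference exceeds the
# cutoff (in standard deviations); these spike regressors are added to the
# design matrix below.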
spikes = smoothed.find_spikes(global_spike_cutoff=outlier_cutoff, diff_spike_cutoff=outlier_cutoff)
covariates = pd.read_csv(glob.glob(os.path.join(base_dir, sub, 'func', '*tsv'))[0], sep='\t')
mc = covariates[['trans_x','trans_y','trans_z','rot_x', 'rot_y', 'rot_z']]
mc_cov = make_motion_covariates(mc, tr)
csf = covariates['csf'] # Use CSF from fmriprep output
dm = Design_Matrix(pd.concat([csf, mc_cov, spikes.drop(labels='TR', axis=1)], axis=1), sampling_freq=1/tr)
dm = dm.add_poly(order=2, include_lower=True) # Add Intercept, Linear and Quadratic Trends

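# Attach the design matrix to the smoothed data and regress out all nuisance
# covariates; the residuals are kept as the denoised time series.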
smoothed.X = dm
stats = smoothed.regress()
stats['residual'].data = np.float32(stats['residual'].data) # cast as float32 to reduce storage space
stats['residual'].write(os.path.join(base_dir, sub, 'func', f'{sub}_denoise_smooth{fwhm}mm_task-sherlockPart1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz'))

We get the following error when displaying the output of the stats object:

---------------------------------------------------------------------------
RecursionError                            Traceback (most recent call last)
/usr/local/miniconda3/envs/naturalistic/lib/python3.7/site-packages/IPython/core/formatters.py in __call__(self, obj)
    700                 type_pprinters=self.type_printers,
    701                 deferred_pprinters=self.deferred_printers)
--> 702             printer.pretty(obj)
    703             printer.flush()
    704             return stream.getvalue()

/usr/local/miniconda3/envs/naturalistic/lib/python3.7/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    375                 if cls in self.type_pprinters:
    376                     # printer registered in self.type_pprinters
--> 377                     return self.type_pprinters[cls](obj, self, cycle)
    378                 else:
    379                     # deferred printer

/usr/local/miniconda3/envs/naturalistic/lib/python3.7/site-packages/IPython/lib/pretty.py in inner(obj, p, cycle)
    605             p.pretty(key)
    606             p.text(': ')
--> 607             p.pretty(obj[key])
    608         p.end_group(step, end)
    609     return inner

/usr/local/miniconda3/envs/naturalistic/lib/python3.7/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    392                         if cls is not object \
    393                                 and callable(cls.__dict__.get('__repr__')):
--> 394                             return _repr_pprint(obj, self, cycle)
    395 
    396             return _default_pprint(obj, self, cycle)

/usr/local/miniconda3/envs/naturalistic/lib/python3.7/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    698     """A pprint that just redirects to the normal repr function."""
    699     # Find newlines and replace them with p.break_()
--> 700     output = repr(obj)
    701     lines = output.splitlines()
    702     with p.group():

/usr/local/miniconda3/envs/naturalistic/lib/python3.7/site-packages/nltools/data/brain_data.py in __repr__(self)
    233             self.X.shape,
    234             os.path.basename(self.mask.get_filename()),
--> 235             self.file_name,
    236         )
    237 

... last 1 frames repeated, from the frame below ...

/usr/local/miniconda3/envs/naturalistic/lib/python3.7/site-packages/nltools/data/brain_data.py in __repr__(self)
    233             self.X.shape,
    234             os.path.basename(self.mask.get_filename()),
--> 235             self.file_name,
    236         )
    237 

RecursionError: maximum recursion depth exceeded

Any idea or suggestion as to where this error comes from? Thanks much! YC

ljchang commented 3 years ago

Hi @yibeichan, thanks for sharing the bugs you discovered.

Good catch on the first one, we will fix that right away.

The second one is just because there is too much data in the dictionary; nothing should be wrong there. stats.keys() should give you all of the variables stored in the dictionary, and stats['key'] should give you the corresponding values. We are planning on adding a stats results class to make this more straightforward in the future.
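
For example, something along these lines should let you inspect individual results without printing the whole dictionary at once (the exact keys may vary slightly across nltools versions):

stats.keys()                  # e.g. dict_keys(['beta', 't', 'p', 'df', 'sigma', 'residual'])
residual = stats['residual']  # pull out a single Brain_Data object
residual.shape()              # (n_TRs, n_voxels)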

yibeichan commented 3 years ago

Wow, cool, cool, cool. Thank you!

yibeichan commented 3 years ago

Hi @ljchang, I just have one more question here. I'm trying to use the n=50 parcellation, but there is no full information on all of the nodes on Neurosynth. Do you know where I can find it? Thank you! YC