brainhack-school2020 / stephaniealley_bhs2020_project

Effect of preprocessing on prediction performance of machine learning model

Nilearn masker.fit_transform input limitations #6

Open stephaniealley opened 4 years ago

stephaniealley commented 4 years ago

@illdopejake, I have a question regarding the Nilearn masker.fit_transform() function. From what I can see in the documentation, this function accepts NiftiImage objects or filenames as input. If I run my analysis (based on the ML tutorial) using the .tsv confound files provided with the data, for example, everything is fine. If I want to use a different collection of confounds, however, I have to load the .tsv file with pandas, alter the dataframe, and then write the dataframe back to file so that it can be loaded as input to masker.fit_transform. Do you know if there is a better way to do this?
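The roundabout workflow described above might look something like this (filenames and the dropped column are hypothetical, just for illustration):

```python
import pandas as pd

# Hypothetical confounds table standing in for a .tsv provided with the data.
pd.DataFrame({
    "trans_x": [0.1, 0.2], "trans_y": [0.0, 0.1], "global_signal": [5.0, 5.1]
}).to_csv("confounds.tsv", sep="\t", index=False)

# Load the .tsv with pandas, alter the dataframe (here, dropping one
# confound), then write it back to file so that the new file path can be
# passed to masker.fit_transform().
confounds = pd.read_csv("confounds.tsv", sep="\t")
reduced = confounds.drop(columns=["global_signal"])
reduced.to_csv("confounds_reduced.tsv", sep="\t", index=False)
```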

illdopejake commented 4 years ago

Hi Stephanie,

Great question. So, as it turns out, the transform() and fit_transform() functions can take many different input types. It's in the documentation, but I admit it's a bit buried. If you scroll down to the part addressing the parameters of the .transform() function, you'll see the following:

confounds : CSV file or array-like, optional
This parameter is passed to signal.clean. Please see the related documentation for details.
shape: (number of scans, number of confounds)

You can get more details in the nilearn documentation for signal.clean(), but the point is that you can pass an "array-like" object to the confounds argument. If you already have a pandas dataframe (let's call it df), and it is formatted correctly (i.e., a timepoints x confounds matrix), you can easily get it into an "array-like" format by calling df.values.

So, if you run masker.fit_transform(data,confounds=df.values), that should surmount this issue. I hope that all makes sense! Let me know if it doesn't :-)
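A minimal sketch of that suggestion (the dataframe contents here are made up, and the masker call is shown only in a comment since it needs nilearn and real imaging data):

```python
import numpy as np
import pandas as pd

# Hypothetical confounds dataframe, already trimmed to the desired columns
# and shaped timepoints x confounds.
df = pd.DataFrame({"trans_x": [0.1, 0.2, 0.3], "trans_y": [0.0, 0.1, 0.2]})

# df.values is a plain NumPy array, which qualifies as the "array-like"
# input that the confounds argument accepts.
confound_array = df.values
print(confound_array.shape)  # (3, 2): timepoints x confounds

# With a NiftiMasker and a functional image (requires nilearn):
# time_series = masker.fit_transform(func_img, confounds=confound_array)
```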

stephaniealley commented 4 years ago

Ah, I see what you're saying. Thank you so much for the clarification. It works perfectly now.