0todd0000 / spm1d

One-Dimensional Statistical Parametric Mapping in Python
GNU General Public License v3.0

How to deal with missing values in a time series #93

Closed ChrisBuckley12 closed 5 years ago

ChrisBuckley12 commented 5 years ago

I am using SPM to assess physical activity patterns throughout the day. Due to the nature of my data, some values may be missing, and substituting 0 for them would skew my results. I also believe that NaN values are not handled by the available SPM scripts. Could you tell me whether there is a correct way to replace missing timepoints in a time series? I have previously considered filling each gap with the average of the values immediately preceding and following it. However, I am not sure whether this is a valid method for SPM, and it cannot be used when the first or last sample of the time series is missing. Any help would be greatly appreciated.

0todd0000 commented 5 years ago

If just a few points are missing, you might want to try an interpolation algorithm like those in scipy.interpolate: https://docs.scipy.org/doc/scipy/reference/interpolate.html In that case interp1d with kind='cubic' will likely work: https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.interp1d.html#scipy.interpolate.interp1d
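
As a rough illustration of the interp1d suggestion, here is a minimal sketch that fills interior NaN gaps in a single 1D series. The helper name `fill_missing_cubic` and the sine-wave example data are hypothetical and not part of spm1d. The sketch assumes missing samples are marked with NaN, and it leaves a missing first or last sample as NaN, because filling those would require extrapolation rather than interpolation.

```python
import numpy as np
from scipy.interpolate import interp1d


def fill_missing_cubic(y):
    """Fill interior NaN gaps in a 1D series using cubic interpolation.

    Hypothetical helper: leading and trailing NaNs are left untouched,
    since filling them would require extrapolation.
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(y.size)
    known = ~np.isnan(y)
    if known.sum() < 4:
        raise ValueError("cubic interpolation needs at least 4 known samples")
    # Build the interpolant from the known samples only
    f = interp1d(t[known], y[known], kind="cubic")
    filled = y.copy()
    first, last = t[known][0], t[known][-1]
    # Only fill gaps that lie between the first and last known samples
    interior = (~known) & (t >= first) & (t <= last)
    filled[interior] = f(t[interior])
    return filled


# Made-up example: a smooth curve with two interior samples
# and the final sample missing
t = np.linspace(0, 1, 21)
y_true = np.sin(2 * np.pi * t)
y = y_true.copy()
y[[5, 6, 20]] = np.nan
y_filled = fill_missing_cubic(y)
```

For a dataset with one observation per row, the same helper could be applied row by row before passing the array to the analysis. Whether the filled values are acceptable depends on how long the gaps are and how smooth the underlying signal is, so it is worth plotting the result before trusting it.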

If many points in a row are missing and/or if interpolation algorithms produce unsatisfactory results, one option would be to conduct nonparametric inference. Please find a similar discussion here: https://github.com/0todd0000/spm1dmatlab/issues/93