bids-standard / pybids

Python tools for querying and manipulating BIDS datasets.
https://bids-standard.github.io/pybids/
MIT License

Support sparse acquisition #252

Open tyarkoni opened 5 years ago

tyarkoni commented 5 years ago

Currently all resampling proceeds in ignorance of image acquisition parameters. For designs involving sparse acquisition, post-convolution downsampling to the TR needs to integrate over the acquisition time, not the nominal repetition time. The simplest way to handle this is probably to add a sparse_acquisition argument to DenseRunVariable.resample() (True by default) that, in the event that the requested sampling rate is identical to the TR, zeroes out all regressor values outside of the acquisition blocks before resampling. (Note that this could conceivably produce strange results for some interpolation methods, so it may be necessary to either investigate systematically, or restrict to kind='linear' in scipy.interpolate.interp1d.)
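As a rough sketch of that idea (not pybids API; the name sparse_resample, and the assumption that each acquisition block starts at the volume onset, are mine):

import numpy as np
from scipy.interpolate import interp1d

def sparse_resample(values, sr, tr, ta, n_vols):
    # `values` is a dense regressor sampled at `sr` Hz starting at t=0;
    # `tr` and `ta` are the repetition time and acquisition time in seconds.
    times = np.arange(len(values)) / sr
    # Zero out samples that fall outside the acquisition blocks, i.e. more
    # than `ta` seconds after the most recent volume onset.
    in_acquisition = (times % tr) < ta
    masked = np.where(in_acquisition, values, 0.0)
    # Restrict to linear interpolation to avoid surprises from higher-order
    # interpolants on the zeroed-out segments.
    f = interp1d(times, masked, kind='linear', bounds_error=False, fill_value=0.0)
    return f(np.arange(n_vols) * tr)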

effigies commented 5 years ago

Fortunately this already exists, so I'm going to go ahead and tag in @satra and @mgxd.

From BIDS (see functional metadata), there are four ways to specify a sparse acquisition:

  1. RepetitionTime + DelayTime
  2. RepetitionTime + SliceTiming
  3. VolumeTiming + SliceTiming
  4. VolumeTiming + AcquisitionDuration

For the moment, we will only concern ourselves with (1) and (2), and assume a constant-interval series.

I think perhaps the way to attack this is to go back to the run_info object and, while saving the TR, also calculate the TA as RepetitionTime - DelayTime, or as max(SliceTiming) + diff(SliceTiming) (where diff is the interval between consecutive slice onsets). (NB: On the whole, I think it will be very rare that we need to concern ourselves with SliceTiming, as the data should be at least minimally preprocessed and SliceTiming converted to DelayTime by the preprocessing tool.)
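For example, a sketch of how TA might be derived from the sidecar metadata for cases (1) and (2), assuming a dict `meta` of the functional JSON fields and uniform slice spacing (illustrative only, not pybids code):

import numpy as np

def acquisition_time(meta):
    tr = meta['RepetitionTime']
    if 'DelayTime' in meta:
        # Case (1): TA = RepetitionTime - DelayTime
        return tr - meta['DelayTime']
    if 'SliceTiming' in meta and len(meta['SliceTiming']) > 1:
        # Case (2): last slice onset plus one inter-slice interval
        st = np.sort(np.asarray(meta['SliceTiming'], dtype=float))
        return float(st[-1] + np.diff(st).mean())
    # No sparse metadata: integrate over the full TR (continuous acquisition)
    return tr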

Then get_design_matrix(sampling_rate='TR') should take the TA into account. It may just be a matter of passing TA all the way down to DenseRunVariable.to_df(), or we might need some other trick.

satra commented 5 years ago

For designs involving sparse acquisition, post-convolution downsampling to the TR needs to integrate over the acquisition time, not the nominal repetition time.

indeed, the sampling rate is still TR, just the integration window is TA
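In other words (an illustrative sketch, not pybids code): instead of point-sampling the convolved regressor at each volume onset, average it over the TA window that follows each onset.

import numpy as np

def integrate_over_ta(dense, sr, tr, ta, n_vols):
    # `dense` is a convolved regressor sampled at `sr` Hz starting at t=0.
    times = np.arange(len(dense)) / sr
    out = np.zeros(n_vols)
    for i in range(n_vols):
        onset = i * tr
        # The sampling grid stays at the TR; the integration window is the TA.
        window = (times >= onset) & (times < onset + ta)
        if window.any():
            out[i] = dense[window].mean()
    return out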

effigies commented 5 years ago

Okay, I think this is probably going to require a complete rethinking of transforms.

satra commented 5 years ago

Okay, I think this is probably going to require a complete rethinking of transforms.

i hope not.

perhaps it would be good for us to have a quick chat online tomorrow? (can't do today).

effigies commented 5 years ago

Sure.

tyarkoni commented 5 years ago

Following up on our conversation yesterday, I think a reasonable implementation approach is something like the following:

satra commented 5 years ago

@tyarkoni and @effigies - i do not think this is at the level of X variables, but we can provide hints to the transforms that operate on them. let me see if i can think aloud about the different pieces of this puzzle.

we are concerned with the space of single-volume delayed acquisitions, where each sample has an onset and a duration, and onsets repeat at a period that is greater than the duration of acquisition: A[silence]A[silence]

there are two other situations we won't presently cover in addressing this, but may be worth keeping in mind:

  1. clustered sparse acquisitions: a set of volumes acquired in a row: ABC[silence]ABC[silence]
  2. cardiac-gated fMRI: here the sampling is non-uniform, since it is triggered in relation to the cardiac cycle.

in both of these cases, the models would have to take the non-uniformity of onsets into account and, for analysis, the physics of spin relaxation.

resampling to reflect Y applies to every variable in X. for example, if GSR is sampled at 1234 Hz, i would still need to think about how to resample it into the space of the acquisition. in this discussion, sparse is an acquisition-related construct, so it should apply to every variable when resampling to acquisition space; or, if you want to pre-empt the process, whenever any resampling occurs it can take hints (and possibly ignore them). this is not about sparse variables but about sparse acquisitions.

in the X space (at least for our current discussion), let's consider three kinds of variables:

  Var1. variables acquired alongside the acquisition with uniform sampling (e.g., physio, EEG, GSR)
  Var2. variables acquired using non-uniform sampling (e.g., behavioral responses, stimulus-related events)
  Var3. variables derived from the MR imaging data (e.g., motion, noise estimates, outliers)

the key with Var2 is that these are not really sampled; they are essentially analog events. they have an onset and a duration, not a sampling rate. this is what allows us to do certain operations like convolution, where we can represent the information at whatever sampling rate we want, and hence passing hints (such as "set your sampling rate to X") to these variables (i think this is what you are calling a sparse variable in pybids) makes sense. but effectively this is still somewhat in analog space.
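To make the "analog events" point concrete, a toy example (not the pybids convolution code; the two-gamma HRF below is a crude stand-in for the canonical response) showing that an onset/duration variable can be rendered and convolved at whatever sampling rate we choose:

import numpy as np
from scipy.stats import gamma

def convolve_events(onsets, durations, amplitudes, sr, total_duration):
    # Render the "analog" events onto a regular grid at the chosen rate.
    t = np.arange(0, total_duration, 1.0 / sr)
    signal = np.zeros_like(t)
    for onset, dur, amp in zip(onsets, durations, amplitudes):
        signal[(t >= onset) & (t < onset + dur)] += amp
    # Crude two-gamma HRF sampled at the same rate.
    hrf_t = np.arange(0, 32, 1.0 / sr)
    hrf = gamma.pdf(hrf_t, 6) - gamma.pdf(hrf_t, 16) / 6.0
    return np.convolve(signal, hrf)[:len(t)] / sr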

Var3 likely reflects the acquisition scheme of the fMRI data. here we can have imaging-volume-related information at one end of the spectrum (sampled at the TR) or slice-related information (sampled at slice onset times).

Most common use cases generally use volume-related information (e.g., motion, CompCor, outliers, FD). currently fMRIPrep does not produce any slice-based information, but one can imagine that this may come up at some point.

now let's consider two transform operations:

  1. convolution: for convolution of Var2-like variables, we are effectively giving it the optimal sampling rate by setting dt (and this is what @effigies implemented).

  2. orthogonalization: this should still work fine for Var3 variables; those are already sampled at the TR, and one would not need to know the acquisition scheme. for Var2 variables, one may need to take the integrator into account, but i think the delta in orthogonalization error would be minimal. if we consider what it means to orthogonalize two variables at different sampling rates, it's better to upsample (via interpolation) the lower-sampling-rate variable and then orthogonalize, rather than downsample and then orthogonalize (see the sketch after this list).
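A rough numpy sketch of the upsample-then-orthogonalize idea (illustrative only; pybids' Orthogonalize operates on its variable objects, not raw arrays):

import numpy as np

def orthogonalize(y, y_times, x, x_times):
    # Interpolate whichever series is sampled more coarsely onto the denser grid.
    if len(y_times) < len(x_times):
        y = np.interp(x_times, y_times, y)
        times = x_times
    else:
        x = np.interp(y_times, x_times, x)
        times = y_times
    # Regress y on x (plus an intercept) and keep the residuals.
    X = np.column_stack([x, np.ones(len(times))])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta, times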

finally, resampling for downsampling should consider the effects of aliasing.
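(For reference, an anti-aliased downsample would look something like the following, using scipy's decimate, which low-pass filters before discarding samples; as discussed below, this is not something pybids currently does.)

import numpy as np
from scipy.signal import decimate

def downsample_antialiased(values, sr, target_sr):
    factor = int(round(sr / target_sr))
    # decimate applies an order-8 Chebyshev low-pass filter before keeping
    # every `factor`-th sample; zero_phase avoids introducing a group delay.
    # (For very large factors, scipy recommends decimating in stages.)
    return decimate(np.asarray(values, dtype=float), factor, zero_phase=True)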

tyarkoni commented 5 years ago

@satra—I think the current implementation handles everything you mention, except for the transformation from X space to Y space in the case of sparse acquisition (which is what #376 will do) and anti-aliasing.

In general, I don't see a good reason to treat the X --> Y conversion as anything except a last step before design matrix output. There will of course be edge cases of the kind we've discussed, when, e.g., a user wants to orthogonalize a design variable with respect to motion confounds, but (a) this seems pretty niche; (b) it's arguably not an entirely sensible thing to do (the results you get from operations performed on discrete windows of a continuous time-series are not guaranteed to provide a good approximation of the continuous analysis, and often won't); and (c) technically none of this is actually supported in the BIDS-StatsModel spec, which says nothing (so far) about sparse acquisitions.

If we ignore operations that require both sparsely-acquired and densely-acquired inputs, and add some anti-aliasing functionality (whether explicit or implicit—see the other thread) is there anything else we're missing?

satra commented 5 years ago

In general, I don't see a good reason to treat the X --> Y conversion as anything except a last step before design matrix output.

i agree. the only other time this may become relevant is as a hint to the variables, actually more to the transforms, such that they can take the acquisition side into account.

If we ignore operations that require both sparsely-acquired and densely-acquired inputs, and add some anti-aliasing functionality (whether explicit or implicit—see the other thread) is there anything else we're missing?

i don't think so, at present. we will get a chance to do some parametric modulations (i.e., orth) on our sparse dataset in the coming week, so that will help us see if all of the pieces have come together.

and regarding anti-aliasing, i think we should just let the users know that currently this is not handled. filtering is a whole can of worms.

tyarkoni commented 5 years ago

i agree. the only other time this may become relevant is as a hint to the variables, actually more to the transforms, such that they can take the acquisition side into account.

This is already handled implicitly, so I don't think it's a problem. E.g., if you look at the Orthogonalize transformation, you find these class annotations:

class Orthogonalize(Transformation):

    _variables_used = ('variables', 'other')
    _densify = ('variables', 'other')
    _align = ('other')

The _densify and _align variables are base class directives that control how variables are preprocessed before being handed to the internal _transform call. In this case, alignment of onsets and durations is enforced (in the case of sparse events), and if a dense variable is involved, then all variables are converted to dense before orthogonalization. So I don't see a problem handling sparsely acquired variables—they would be dealt with in the same way (i.e., in addition to checking that sparse variables are all compatible, we would now have to check that dense variables are compatible, and convert from sparsely acquired dense to "real" dense if not).
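As a loose illustration of that directive pattern (toy classes and names; this is not the pybids base class, whose real logic is more involved):

class ToySparse:
    # Stand-in for a sparse (onset/duration) variable.
    def __init__(self, onsets, durations, values):
        self.onsets, self.durations, self.values = onsets, durations, values
    def to_dense(self, sampling_rate):
        # Render onset/duration events onto a regular grid (omitted here).
        raise NotImplementedError

class ToyTransformation:
    _densify = ('variables', 'other')  # densify these if any input is dense
    _align = ('other',)                # enforce matching onsets/durations

    def transform(self, sampling_rate=10, **inputs):
        # Subclasses implement _transform; the base class preprocesses inputs.
        named = {k: inputs[k] for k in self._densify if k in inputs}
        # If any named input is already dense, convert the sparse ones too,
        # so _transform sees a consistent representation.
        if any(not isinstance(v, ToySparse) for v in named.values()):
            for k, v in named.items():
                if isinstance(v, ToySparse):
                    inputs[k] = v.to_dense(sampling_rate)
        return self._transform(**inputs)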

(That said, the current transform logic in the base class is starting to look really convoluted, and is in need of some refactoring.)