tyarkoni opened 6 years ago
Fortunately this already exists, so I'm going to go ahead and tag in @satra and @mgxd.
From BIDS (see functional metadata), there are four ways to specify a sparse acquisition:

1) `RepetitionTime` + `DelayTime`
2) `RepetitionTime` + `SliceTiming`
3) `VolumeTiming` + `SliceTiming`
4) `VolumeTiming` + `AcquisitionDuration`
For the moment, we will only concern ourselves with (1) and (2), and assume a constant-interval series.
I think that perhaps the way to attack this is to go back to the `run_info` object and, while saving the TR, also calculate the TA as `RepetitionTime - DelayTime` or `max(SliceTiming) + diff(SliceTiming)` (where `diff` gives the interval between consecutive slice onsets). (NB: On the whole, I think it is very rare that we'll need to concern ourselves with `SliceTiming`, as the data should be at least minimally preprocessed and `SliceTiming` converted to `DelayTime` by the preprocessing tool.)
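For concreteness, here is a minimal sketch of that TA calculation, assuming the sidecar metadata has already been loaded into a Python dict (the helper name and the example values are hypothetical, not pybids API):

```python
import numpy as np

def acquisition_time(metadata):
    """Derive TA from BIDS sidecar metadata (cases 1 and 2 above).
    Hypothetical helper, not part of pybids."""
    tr = metadata['RepetitionTime']
    if 'DelayTime' in metadata:
        # Case 1: TA = RepetitionTime - DelayTime
        return tr - metadata['DelayTime']
    if 'SliceTiming' in metadata:
        # Case 2: TA = last slice onset + inter-slice interval
        st = np.sort(metadata['SliceTiming'])
        return st.max() + np.diff(st).mean()
    # No sparse-acquisition fields: treat the acquisition as continuous
    return tr

# Illustrative sidecar values only
print(acquisition_time({'RepetitionTime': 3.0, 'DelayTime': 1.0}))        # 2.0
print(acquisition_time({'RepetitionTime': 3.0,
                        'SliceTiming': [0.0, 0.5, 1.0, 1.5]}))            # 2.0
```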
Then `get_design_matrix(sampling_rate='TR')` should take the TA into account. It may just be a matter of passing TA through all the way down to `DenseRunVariable.to_df()`, or we might need some other trick.
> For designs involving sparse acquisition, post-convolution downsampling to the TR needs to integrate over the acquisition time, not the nominal repetition time.
indeed, the sampling rate is still TR, just the integration window is TA
Okay, I think this is probably going to require a complete rethinking of transforms.
> Okay, I think this is probably going to require a complete rethinking of transforms.
i hope not.
perhaps it would be good for us to have a quick chat online tomorrow? (can't do today).
Sure.
Following up on our conversation yesterday, I think a reasonable implementation approach is something like the following:
- Add a `SparselyAcquiredDenseRunVariable` to the `Variable` hierarchy. This inherits from `DenseRunVariable` but adds attributes for timing parameters and/or helper methods needed to manipulate the data and/or convert it to a standard `DenseRunVariable` (a rough sketch of this idea follows below).
- Implement the sparse acquisition --> full dense representation conversion with its own `Transformation`. This would be undocumented and not included in the spec, and the expectation is that users would rarely if ever call it directly.
- When `get_design_matrix` is called, we add a type check on all variables, and any `SparselyAcquiredDenseRunVariable` instances detected are first converted to ordinary `DenseRunVariable` via the new transformation. Thereafter, everything proceeds as normal.
- We will not (at least for now) support edge case transformations like orthogonalizing a `DenseRunVariable` with respect to a `SparselyAcquiredDenseRunVariable`. But we should try to generate a comprehensible exception message if users attempt this.
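To make the first bullet a bit more tangible, here is a rough, self-contained sketch of what such a subclass could look like. This is not the pybids implementation; the class bodies, constructor signatures, and the zeroing-based conversion are all assumptions for illustration (the zeroing follows the resampling idea proposed later in this thread):

```python
import numpy as np

class DenseRunVariable:
    """Simplified stand-in for the pybids class of the same name."""
    def __init__(self, name, values, sampling_rate):
        self.name = name
        self.values = np.asarray(values, dtype=float)
        self.sampling_rate = sampling_rate  # samples per second

class SparselyAcquiredDenseRunVariable(DenseRunVariable):
    """Proposed subclass: same dense data, plus acquisition timing (TR/TA)."""
    def __init__(self, name, values, sampling_rate, tr, ta):
        super().__init__(name, values, sampling_rate)
        self.tr = tr  # nominal repetition time, in seconds
        self.ta = ta  # acquisition time, in seconds (TA <= TR)

    def to_dense(self):
        """One possible conversion to an ordinary DenseRunVariable: zero out
        samples that fall outside each volume's acquisition window."""
        t = np.arange(len(self.values)) / self.sampling_rate
        in_acquisition = (t % self.tr) < self.ta
        return DenseRunVariable(self.name,
                                np.where(in_acquisition, self.values, 0.0),
                                self.sampling_rate)
```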
@tyarkoni and @effigies - i do not think this is at the level of `X` variables, but we can provide hints to the transforms on them. let me see if i can think aloud the different pieces of this puzzle.
we are concerned with the space of single-volume delayed acquisitions, where each sample has an onset and a duration, and where onsets repeat at a periodicity that is greater than the duration of the acquisition: A [silence] A [silence] ...
there are two other situations we won't presently cover in addressing this, but they may be worth keeping in mind:
resampling to reflect Y applies to every variable in X. for example, if GSR is sampled at 1234 Hz, i would still need to think about how to resample that into the space of the acquisition. In this discussion, sparse is an acquisition-related construct, so it should apply to every variable when resampling to acquisition space; or, if you want to pre-empt the process, whenever any resampling is occurring it can take hints (and possibly ignore them). this is not about sparse variables but about sparse acquisitions.
in the X space (at least for our current discussion), let's consider three kinds of variables:

- Var1: variables acquired alongside the acquisition with uniform sampling (e.g., physio, EEG, GSR)
- Var2: variables acquired using non-uniform sampling (e.g., behavioral responses, stimulus-related variables)
- Var3: variables derived from the MR imaging data (e.g., motion, noise estimates, outliers)
the key with Var2 is that these are not really sampled; they are almost analog events: they have an onset and a duration, not a sampling rate. this is what allows us to then do certain operations like convolution, where we can represent the information at whatever sampling rate we want, and hence passing hints (such as "set your sample rate to X") to these variables (i think this is what you are calling a sparse variable in pybids) makes sense. but effectively this is still somewhat in analog space.
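As a rough illustration of this "almost analog" point, the sketch below renders onset/duration events onto a regular grid at an arbitrary sampling rate; the helper and the example values are made up for illustration, not pybids API:

```python
import numpy as np

def events_to_dense(onsets, durations, amplitudes, sampling_rate, total_duration):
    """Render onset/duration events on a regular grid at the requested rate."""
    n = int(np.ceil(total_duration * sampling_rate))
    dense = np.zeros(n)
    for onset, dur, amp in zip(onsets, durations, amplitudes):
        start = int(round(onset * sampling_rate))
        stop = int(round((onset + dur) * sampling_rate))
        dense[start:stop] = amp
    return dense

# Three stimuli of varying duration, rendered at 50 Hz (illustrative values)
reg = events_to_dense(onsets=[2.0, 10.5, 20.0], durations=[1.0, 2.5, 0.5],
                      amplitudes=[1, 1, 1], sampling_rate=50.0, total_duration=30.0)
```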
Var3 likely reflects the acquisition schemes of fMRI. Here we can have imaging volume related information at one end of the spectrum (sampled at TR) or slice related information (sampled at slice onset times).
Most common use cases generally use volume-related information (e.g., motion, compcor, outliers, FD). currently fMRIPrep does not produce any slice-based information, but one can imagine that this may come up at some point.
now let's consider two transform operations:
convolution: for convolution of Var2-like things, we are effectively giving it the optimal sampling rate via setting dt (and this is what @effigies implemented).
orthogonalization: this should still work fine for Var3 variables. those are already sampled at the TR, and one would not need to know the acquisition scheme. for Var2 variables, one may need to take the integrator into account, but i think the delta in orthogonalization error may be minimal. if we consider what it means to orthogonalize two variables at different sampling rates, it's better to upsample (via interpolation) the lower-sampling-rate variable and then orthogonalize, rather than downsample and then orthogonalize (a small sketch of this follows below).
finally, resampling for downsampling should consider effects of aliasing.
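To make the upsample-then-orthogonalize point above concrete, here is a minimal numpy sketch; the signals, sampling rates, and the least-squares residualization are illustrative assumptions, not pybids code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: y sampled at 10 Hz, a confound x sampled at 1 Hz
t_hi = np.arange(0, 60, 0.1)   # 10 Hz grid
t_lo = np.arange(0, 60, 1.0)   # 1 Hz grid
y = np.sin(2 * np.pi * 0.05 * t_hi) + 0.1 * rng.standard_normal(t_hi.size)
x_lo = np.cos(2 * np.pi * 0.02 * t_lo)

# Upsample the low-rate variable onto the high-rate grid (linear interpolation),
# then orthogonalize y with respect to it via least squares
x_hi = np.interp(t_hi, t_lo, x_lo)
X = np.column_stack([x_hi, np.ones_like(x_hi)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_orth = y - X @ beta  # residual: the component of y not explained by x
```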
@satra—I think the current implementation handles everything you mention, except for the transformation from `X` space to `Y` space in the case of sparse acquisition (which is what #376 will do) and anti-aliasing.
In general, I don't see a good reason to treat the X --> Y conversion as anything except a last step before design matrix output. There will of course be edge cases of the kind we've discussed, when, e.g., a user wants to orthogonalize a design variable with respect to motion confounds, but (a) this seems pretty niche; (b) it's arguably not an entirely sensible thing to do (the results you get from operations performed on discrete windows of a continuous time-series are not guaranteed to provide a good approximation of the continuous analysis, and often won't); and (c) technically none of this is actually supported in the BIDS-StatsModel spec, which says nothing (so far) about sparse acquisitions.
If we ignore operations that require both sparsely-acquired and densely-acquired inputs, and add some anti-aliasing functionality (whether explicit or implicit—see the other thread) is there anything else we're missing?
> In general, I don't see a good reason to treat the X --> Y conversion as anything except a last step before design matrix output.
i agree. the only other time this may become relevant is as a hint to the variables, actually more to the transforms, such that they can take the acquisition side into account.
> If we ignore operations that require both sparsely-acquired and densely-acquired inputs, and add some anti-aliasing functionality (whether explicit or implicit—see the other thread) is there anything else we're missing?
i don't think so at present. we will get a chance to do some parametric modulations (i.e., orth) on our sparse dataset in the coming week, so that will help us see if all of the pieces have come together.
and regarding anti-aliasing, i think we should just let the users know that currently this is not handled. filtering is a whole can of worms.
> i agree. the only other time this may become relevant is as a hint to the variables, actually more to the transforms, such that they can take the acquisition side into account.
This is already handled implicitly, so I don't think it's a problem. E.g., if you look at the `Orthogonalize` transformation, you find these class annotations:
```python
class Orthogonalize(Transformation):

    _variables_used = ('variables', 'other')
    _densify = ('variables', 'other')
    _align = ('other')
```
The `_densify` and `_align` variables are base class directives that control how variables are preprocessed before being handed to the internal `_transform` call. In this case, alignment of onsets and durations is enforced (in the case of sparse events), and if a dense variable is involved, then all variables are converted to dense before orthogonalization. So I don't see a problem handling sparsely acquired variables—they would be dealt with in the same way (i.e., in addition to checking that sparse variables are all compatible, we would now have to check that dense variables are compatible, and convert from sparsely-acquired dense to "real" dense if not).
(That said, the current `transform` logic in the base class is starting to look really convoluted, and is in need of some refactoring.)
Currently all resampling proceeds in ignorance of image acquisition parameters. For designs involving sparse acquisition, post-convolution downsampling to the TR needs to integrate over the acquisition time, not the nominal repetition time. The simplest way to handle this is probably to add a `sparse_acquisition` argument to `DenseRunVariable.resample()` (`True` by default) that, in the event that the requested sampling rate is identical to the TR, zeroes out all regressor values outside of the acquisition blocks before resampling. (Note that this could conceivably produce strange results for some interpolation methods, so it may be necessary to either investigate systematically, or restrict to `kind='linear'` in `scipy.interpolate.interp1d`.)
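As a concrete illustration of the "integrate over the acquisition time" idea, here is a rough sketch of a TA-aware downsampling step. The function, its parameters, and the block-averaging strategy are assumptions for illustration; it averages over each volume's acquisition window directly, rather than zeroing and interpolating as proposed above:

```python
import numpy as np

def downsample_to_tr(values, sampling_rate, tr, ta):
    """Average a dense (e.g., HRF-convolved) regressor over each volume's
    acquisition window [k*TR, k*TR + TA), ignoring the silent gap."""
    values = np.asarray(values, dtype=float)
    t = np.arange(len(values)) / sampling_rate
    n_vols = int(np.floor(t[-1] / tr)) + 1
    out = np.zeros(n_vols)
    for vol in range(n_vols):
        start = vol * tr
        mask = (t >= start) & (t < start + ta)  # acquisition block only
        out[vol] = values[mask].mean() if mask.any() else 0.0
    return out

# Illustrative use: a 10 Hz dense regressor, TR = 3 s, TA = 1 s (made-up values)
sr = 10.0
dense = np.random.default_rng(1).standard_normal(int(sr * 30))
per_volume = downsample_to_tr(dense, sampling_rate=sr, tr=3.0, ta=1.0)
```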