Closed dengemann closed 8 years ago
Is this with the goal of being able to compute TFR and retain individual trial traces? Very much so +1 to this :) for ECoG anyway, it'd be nice to be able to just calculate a full TFR and save it do disk for multiple uses later.
tfrs = [tfr_XXX(epochs[i]) for i in range(len(epochs))] then you need a write tfr that can write multiple ones...
but maybe it's not so easy...
I agree that it's doable, though seems strange to get single trial tfr
functionality by using the AverageTFR
in a non-intuitive way. Either way,
for me a question is how to handle the representation of the result. E.g.,
one way to save would be to combine the results into a big dataframe and
save it to HDF5, but maybe there are better ways of handling the output of
the data (it seems like using fif format wouldn't make sense anymore)
On Fri, May 8, 2015 at 1:32 AM, Alexandre Gramfort <notifications@github.com
wrote:
tfrs = [tfr_XXX(epochs[i]) for i in range(len(epochs))] then you need a write tfr that can write multiple ones...
but maybe it's not so easy...
— Reply to this email directly or view it on GitHub https://github.com/mne-tools/mne-python/issues/2024#issuecomment-100152526 .
I was thinking about something as boring as a designated function returning a designated object, some kind of EpochsTFR with .average method.
Saving the data to HDF5 and loading it will be almost trivial once the object is defined. You just dump the object properties to a dict
and save, and repopulate on load.
Out of curiosity, why wasn't this the original implementation instead of
creating an AverageTFR
class. It seems like it would have been more in
line with MNE structures (e.g., create an Epochs object, and then give an
average
function to create an Evoked object). Are there issues with
memory or something?
On Fri, May 8, 2015 at 9:09 AM, Eric Larson notifications@github.com wrote:
Saving the data to HDF5 and loading it will be almost trivial once the object is defined. You just dump the object properties to a dict and save, and repopulate on load.
— Reply to this email directly or view it on GitHub https://github.com/mne-tools/mne-python/issues/2024#issuecomment-100282966 .
yes it's a memory issue. In TF you blow up the dimension of the data
yeah that's the same problem I've ran into before. I guess the upside is that you can often decimate TFR representations though, no? (I've never used these methods for things like coherence or phase estimate though...maybe that'd mess up?)
On Fri, May 8, 2015 at 10:48 AM, Alexandre Gramfort < notifications@github.com> wrote:
yes it's a memory issue. In TF you blow up the dimension of the data
— Reply to this email directly or view it on GitHub https://github.com/mne-tools/mne-python/issues/2024#issuecomment-100310142 .
decimation helps and most people need the average that's why we chose this design. But we can always improve it....
yeah that definitely makes sense. all the current behavior could remain the
same if need be anyway, no? You could just use "average_epochs" as a flag
to the various tfr_XXX
functions and it could return an AverageTFR object
in that case. Otherwise, it would return an EpochsTFR. Or maybe it would
make more sense to call AverageTFR
something like EvokedTFR
to keep it
in line with the current epochs naming.
On Fri, May 8, 2015 at 12:59 PM, Alexandre Gramfort < notifications@github.com> wrote:
decimation helps and most people need the average that's why we chose this design. But we can always improve it....
— Reply to this email directly or view it on GitHub https://github.com/mne-tools/mne-python/issues/2024#issuecomment-100341570 .
an average param in the tfr_XXX function would do it indeed. EpochsTFR sounds good
EpochsTFR
That would be important to do deconding and stats across trials.
Yeah, I first thought about this when I was looking into the time generalization stuff as well as from this example [1] where single_trial_power
is used to get the full TFR for one electrode.
It seems that the code in single_trial_power is close to what would be needed anyway, just would need to let it use more than morlet wavelets, and to output the EpochsTFR class, no?
[1] http://martinos.org/mne/stable/auto_examples/stats/plot_cluster_1samp_test_time_frequency.html
just would need to let it use more than morlet wavelets, and to output the EpochsTFR class, no?
yes, basically that's it.
Hey all - just a quick list of some important stuff to consider from a chat with me and @dengemann:
Epochs
objectsRandom update - running CWT_wavelet on individual epochs and storing them as a complex128 array is taking up ~8GB of disk space for me. This is with roughly 200 trials, 60 channels, 50 freqs, and 600 timepoints. I just wrote a simple class to handle I/O for me right now, which basically just does this:
class EpochTFR(object):
def __init__(self, data, freqs, events, times, info):
self.data = data
self.freqs = freqs
self.events = events
self.times = times
self.info = info
def write_hdf(self, fname):
print('Writing TFR data to {0}'.format(fname))
write_dict = {'data': self.data, 'freqs': self.freqs,
'events': self.events, 'times': self.times,
'info': self.info}
mne._hdf5.write_hdf5(fname, write_dict, overwrite=True)
@staticmethod
def from_hdf(fname):
out = mne._hdf5.read_hdf5(fname)
etfr = self.__init__(out['data'], out['freqs'], out['events'],
out['times'], out['info'])
return etfr
This is after downsampling the signal from 1000Hz to 100Hz (post-TFR decomp). I haven't really found any good information about how much it's "ok" to downsample after you've got the TFR though.
looks useful to me
I should say, the morlet wavelet CWT finished in about 10 minutes, so that's not a trivial amount of time, though not a huge amount of time either.
Personally it would be useful to be able to just load full epoched TFR information into memory so I can do branching-off analyses from there. Certainly saves time as you have more and more datasets (especially since you currently can't use joblib for any of this due to memory constraints).
What other functionality should be in a class like this? I can imagine:
crop
, pick_channels
, etc)I was thinking it would be useful to just have this as a way of representing the data, and then use derivation classes to do more computational stuff (e.g., expect the user to turn this into an Epochs object, then run analysis, rather than having analysis code operate directly on EpochTFR). What do people think?
Here's what I have now...worth opening a PR?
class EpochTFR(object):
"""
Create an EpochTFR object for single-trial TFR representation.
This allows you to store TFR information without averaging across
trials. Be careful with memory usage, as it can get quite large.
Consider decimating before this step.
Parameters
----------
data : ndarray, shape(n_epochs, n_chans, n_freqs, n_times)
The TFR data you wish to construct.
freqs : ndarray, shape(n_freqs)
The frequency labels during TFR construction
times : ndarray, shape(n_times)
The time labels during epochs creation
events : ndarray, shape (n_epochs, 3)
An MNE events array, see documentation for details
event_id : dict
A mapping from event names to integer event IDs
info : mne Info object
The info object from the original Epochs instance.
May be unnecessary but keep it just in case.
method : string
Information describing how this TFR data was created.
Attributes
----------
ch_names : list
The names of the channels
"""
def __init__(self, data, freqs, times, events, event_id, info, method=None):
if data.ndim != 4:
raise ValueError('data should be 4d. Got %d.' % data.ndim)
n_epochs, n_channels, n_freqs, n_times = data.shape
if n_channels != len(info['chs']):
raise ValueError("Number of channels and data size don't match"
" (%d != %d)." % (n_channels, len(info['chs'])))
if n_freqs != len(freqs):
raise ValueError("Number of frequencies and data size don't match"
" (%d != %d)." % (n_freqs, len(freqs)))
if n_times != len(times):
raise ValueError("Number of times and data size don't match"
" (%d != %d)." % (n_times, len(times)))
if n_epochs != events.shape[0]:
raise ValueError("Number of epochs and data size don't match"
" (%d != %d)." % (n_epochs, events.shape[0]))
self.data = data
self.freqs = freqs
self.events = events
self.event_id = event_id
self.times = times
self.info = info
self.method = method
@property
def ch_names(self):
return self.info['ch_names']
def write_hdf(self, fname, **kwargs):
"""
Write the object to an HDF file.
This bottles up all object attributes into a dictionary,
and sends it to an HDF file via the h5py library.
Parameters
----------
fname : str
The file name of the hdf5 file.
**kwargs : dict
Arguments are passed to write_hdf5
"""
print('Writing TFR data to {0}'.format(fname))
write_dict = {'data': self.data, 'freqs': self.freqs,
'events': self.events, 'times': self.times,
'info': self.info, 'event_id': self.event_id}
mne._hdf5.write_hdf5(fname, write_dict, **kwargs)
def subset_freq(self, fmin, fmax):
"""
Return the mean of a subset of frequencies as an Epochs object.
Parameters
----------
fmin : float
The minimum frequency to keep.
fmax : float
The maximum frequency to keep.
"""
mask_freq = (self.freqs > fmin) * (self.freqs < fmax)
data_freq = self.data[:, :, mask_freq, :].mean(2)
epoch_info = self.info.copy()
epoch_info['description'] = "{'kind': 'TFRFreqSubset', 'freqs': '{0}-{1}'.format(fmin, fmax)}"
return mne.EpochsArray(data_freq, epoch_info, self.events,
tmin=self.times.min(), event_id=self.event_id)
def subset_epoch(self, epoch_ixs):
"""
Return the mean of a subset of epochs.
Parameters
----------
epoch_ixs : list, ndarray
The epoch indices to keep
Returns
----------
obj : AverageTFR
Instance of AverageTFR with epochs averaged.
"""
data_epoch = self.data[epoch_ixs, ...].mean(0)
epoch_info = self.info.copy()
return mne.time_frequency.AverageTFR(epoch_info, data_epoch, self.times,
self.freqs, len(epoch_ixs), comment='TFREpochSubset',
method=self.method)
def crop(self, tmin=None, tmax=None, copy=False):
"""Crop data to a given time interval
Parameters
----------
tmin : float | None
Start time of selection in seconds.
tmax : float | None
End time of selection in seconds.
copy : bool
If False epochs is cropped in place.
"""
inst = self if not copy else self.copy()
mask_time = mne.utils._time_mask(self.times, tmin, tmax)
inst.times = inst.times[mask_time]
inst.data = inst.data[..., mask_time]
return inst
def __repr__(self):
s = "time : [%f, %f]" % (self.times[0], self.times[-1])
s += ', epochs: {0}'.format(self.events.shape[0])
s += ", freq : [%f, %f]" % (self.freqs[0], self.freqs[-1])
s += ', channels : %d' % self.data.shape[0]
return "<EpochTFR | %s>" % s
@staticmethod
def from_hdf(fname):
"""
Read an EpochsTFR from an HDF5 file.
This expects an HDF5 file that was created with
write_hdf5. It auto-populates a new EpochsTFR object.
Be careful with memory consumption!
Parameters
----------
fname : str
The path to the HDF5 file you wish to import.
Returns
-------
etfr : EpochsTFR
The EpochsTFR object contained in the HDF5 file.
"""
params = mne._hdf5.read_hdf5(fname)
etfr = EpochTFR(**params)
return etfr
what is needed is a way to generate such objects. The API of this object is next.
Cool - should this kind of code be roped into a PR that includes functions that return single-trial TFR? Seems like it could get confusing to have them separate. I don't have time to work on changing the TFR functions in earnest right now...maybe something I can work on while in France?
Otherwise, somebody else is more than welcome to take the code above and roll it into their own PR. I'm just going to use this locally for data I/O and creating TFR by iterating through epochs[i] with cwt_morlet, which is kinda hacky but still works.
let's keep this for the sprint in Paris this summer
That still happening? Last I heard you guys were waiting on funding. Either way I've booked my flights: into paris on the 6th, out on the 28th.
On Wed, Jun 3, 2015 at 12:55 PM, Alexandre Gramfort < notifications@github.com> wrote:
let's keep this for the sprint in Paris this summer
— Reply to this email directly or view it on GitHub https://github.com/mne-tools/mne-python/issues/2024#issuecomment-108596062 .
we have the funding now :)
Wohoo! :moneybag:
Is it possible to get single trial phase as well? Does something like 'tfrs = [tfr_XXX(epochs[i]) for i in range(len(epochs))]' exists for phase that I could use in the sort term? Could this kind of feature be added to the single trial TFR code if nothing exists?
Right now the way I'm using it on my own is to store the values as complex 64 bit numbers, that way you can calculate either the phase or amplitude. That said, right now there isn't any kind of tangible PR so that's just my own anecdotal use :)
On Thu, Jul 23, 2015 at 11:28 PM, Matt Erhart notifications@github.com wrote:
Is it possible to get single trial phase as well? Does something like 'tfrs = [tfr_XXX(epochs[i]) for i in range(len(epochs))]' exists for phase that I could use in the sort term? Could this kind of feature be added to the single trial TFR code if nothing exists?
— Reply to this email directly or view it on GitHub https://github.com/mne-tools/mne-python/issues/2024#issuecomment-124244819 .
Are you going into the various TF functions and changing a line like 'TFR_abs = np.abs(TFR)' so you can do abs or angle later? I could manage that for now if that's all. 'Complex', 'Angle', or 'Phase' might make for a useful flag. Might be relevant to the CFC branch as well.
Right now I'm just pulling stuff from the data
attribute, which is a
matrix of shape(n_epochs, n_channels, n_freqs, n_times). I'll just do
something like:
amp = np.abs(tfr.data[:, 10, ...])
or
phase = np.angle(tfr.data[:, 10, ...])
That said, I think that if this were implemented it'd warrant some discussion about how to handle memory usage etc.
On Fri, Jul 24, 2015 at 1:04 AM, Matt Erhart notifications@github.com wrote:
Are you going into the various TF functions and changing a line like 'TFR_abs = np.abs(TFR)' so you can do abs or angle later? I could manage that for now if that's all. 'Complex', 'Angle', or 'Phase' might make for a useful flag. Might be relevant to the CFC branch as well.
— Reply to this email directly or view it on GitHub https://github.com/mne-tools/mne-python/issues/2024#issuecomment-124262070 .
Thanks for your help. The tfr you're referring to comes from your EpochTFR class above and not something currently in mnepy?
Yup, it's just my own code for personal uses right now. You can find it all here...it's relatively straightforward but has a lot of improvements to be made if it were to be an official addition to MNE.
Hey all - reviving this because I'm working through my own EpochTFR code again and thought it might be worth revisiting.
Would this still be useful for MNE? Right now, I have a function that extracts the TFR of each trial and puts it into an EpochTFR
object that keeps it all together and reads/writes from HDF5.
def tfr_epochs(epoch, freqs_tfr, n_decim=10, n_cycles=4):
"""Extract a TFR representation from epochs w/o averaging.
This will extract the TFR for the given frequencies, returning
an EpochsTFR object with data shape (n_epochs, n_chans, n_freqs, n _times).
TFR is performed with a morlet wavelet.
Parameters
----------
epoch : mne.Epochs instance
The epochs for which to extract the TFR
freqs_tfr : array-like
The frequencies to extract w/ wavelets
n_decim : int
The factor to decimate the time-dimension. 1=no decimation
n_cycles : int
The number of cycles for the wavelets
"""
# Preallocate space
new_times = epoch.times[::n_decim]
new_info = epoch.info.copy()
new_info['sfreq'] /= n_decim
new_epoch = epoch.copy()
n_epoch = len(new_epoch)
out_shape = (n_epoch, len(epoch.ch_names),
freqs_tfr.shape[0], len(new_times))
tfr = np.zeros(out_shape, dtype=np.float32)
print('Preallocating space\nshape: {0}\nmbytes:{1}'.format(
out_shape, tfr.nbytes/1000000))
# Do TFR decomp
print('Extracting TFR information...')
pbar = ProgressBar(n_epoch)
for i in range(n_epoch):
iepoch = new_epoch[i]
idata = iepoch._data.squeeze()
itfr = mne.time_frequency.cwt_morlet(idata, iepoch.info['sfreq'],
freqs_tfr, n_cycles=n_cycles)
itfr = itfr[:, :, ::n_decim]
tfr[i] = np.abs(itfr)
pbar.update_with_increment_value(1)
etfr = EpochTFR(tfr, freqs_tfr, new_times, new_epoch.events,
new_epoch.event_id, new_info)
return etfr
I'm currently throwing away phase information just because it takes up a lot of RAM / disk space, and IO was becoming a problem on our cluster. It'd be easy to add back in though.
Just pinging this again to see if there's still interest. I was looking at the time_frequency code and the three functions (stockwell, morlet, and multitaper) seem to be doing pretty much the same thing. Though for some reason stockwell does it differently. The other two call a function _time_frequency, but stockwell seems to be using its own version of this (maybe for good reason?).
If all these functions used the same underlying code, I don't know that it'd be too tough to modify things to create functions that return 4D arrays. Basically just modify _time_frequency
to output either averages over epochs or the full 4D array. Something like:
def _time_frequency(X, Ws, use_fft, decim, average=True):
"""Aux of time_frequency for parallel computing over channels
"""
n_epochs, n_times = X.shape
n_times = n_times // decim + bool(n_times % decim)
n_frequencies = len(Ws)
mode = 'same'
if use_fft:
tfrs = _cwt_fft(X, Ws, mode)
else:
tfrs = _cwt_convolve(X, Ws, mode)
tfrs = tfrs[..., ::decim]
if average is True:
psd = np.zeros((n_frequencies, n_times)) # PSD
plf = np.zeros((n_frequencies, n_times), np.complex) # phase lock
for tfr in tfrs:
tfr_abs = np.abs(tfr)
psd += tfr_abs ** 2
plf += tfr / tfr_abs
psd /= n_epochs
plf = np.abs(plf) / n_epochs
else:
tfrs = tfrs[..., ::decim]
tfr = np.abs(tfrs) ** 2
plf = tfrs / np.abs(tfrs)
return psd, plf
Then, you could write some functions like _tfr_power_morlet
, _tfr_power_multitaper
etc, that would just compute the windows and pass the data to _time_frequency
. It wouldn't be so different from what single_trial_power
does, but it seems like it would set the stage for more extendable/modular functions, and would make it easier to build on top of it if a full EpochsTFR
object were created. Anyone have thoughts on that?
@choldgraf just to summarize the discussion in two lines, you want to have an EpochsTFR
class which gives you AverageTFR
when you do tfr.average()
? And you want some way to unify single_trial_power
with tfr_xxx
function?
I think I will veto any progress here until there is better documentation for time_frequency ;) We need something that explains what we have currently. Also, common idioms such as tfrs = [tfr_XXX(epochs[i]) for i in range(len(epochs))]
should go in there. It will give everyone a clearer picture of what we need.
that's fine with me, I need to spend time on other things than MNE developemnt :)
I totally agree with documentation, I think it's pretty confusing. What did you have in mind? Just docstrings, or examples? And just using the tfr_ functions or also things like cwt_morlet?
On Thu, Dec 31, 2015 at 3:58 AM, Mainak Jas notifications@github.com wrote:
@choldgraf https://github.com/choldgraf just to summarize the discussion in two lines, you want to have an EpochsTFR class which gives you AverageTFR when you do tfr.average() ? And you want some way to unify single_trial_power with tfr_xxx function?
I think I will veto any progress here until there is better documentation for time_frequency ;) We need something that explains what we have currently. Also, common idioms such as tfrs = [tfr_XXX(epochs[i]) for i in range(len(epochs))] should go in there. It will give everyone a clearer picture of what we need.
— Reply to this email directly or view it on GitHub https://github.com/mne-tools/mne-python/issues/2024#issuecomment-168159912 .
I think a manual page. That's what I had in mind. Take a look at this for example: http://scikit-learn.org/stable/modules/linear_model.html. We need more like these. Something that gives the user a little background on the method (with a few equations if applicable), some nice plots, code snippets on how to generate them in python along with warnings / notes + references. Basically, a complete overview. I think you have a pretty good understanding of the time-frequency stuff. So, it would be great if you can contribute.
Cool - thanks for the reference links. I'll put this on my to-do list though it'll be a little bit because that PSD PR kind of sucked away all of my non-thesis working time for a little while...
On Fri, Jan 1, 2016 at 2:52 AM, Mainak Jas notifications@github.com wrote:
I think a manual page. That's what I had in mind. Take a look at this for example: http://scikit-learn.org/stable/modules/linear_model.html. We need more like these. Something that gives the user a little background on the method (with a few equations if applicable), some nice plots, code snippets on how to generate them in python along with warnings / notes + references. Basically, a complete overview. I think you have a pretty good understanding of the time-frequency stuff. So, it would be great if you can contribute.
— Reply to this email directly or view it on GitHub https://github.com/mne-tools/mne-python/issues/2024#issuecomment-168294341 .
Extending the @agramfort's efforts in #3476 : would it make sense to extend to slightly modify EpochsTFR
to be more similar to Epochs
?
events
infrastructure
events
fieldevents_id
indexing mechanismget_data()
method and ditch the data
field?For (2): This would make EpochsTFR
differ from AverageTFR
, but Epochs
differs from Evoked
in exactly the same way, so I'm not sure if that's really a problem.
@palday +1 for both 1) and 2). Can you open a separate issue/PR?
@dengemann can you close this issue? I think we solved it with #3454 and #3476?
Done.
+1 for both as well - and isn't there a PR w a more general get_data
On Thu, Sep 1, 2016 at 7:59 AM Phillip Alday notifications@github.com wrote:
Done.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/mne-tools/mne-python/issues/2024#issuecomment-244106623, or mute the thread https://github.com/notifications/unsubscribe-auth/ABwSHVkyl2mjSTo2C-xdFvSV33mZZPjQks5qluhlgaJpZM4EJHBu .
+1 too
the generic get_data
is _get_inst_data
in mne.utils
- but it is not public.
Looks like remaining issues have been moved to a new issue, closing
Continues discussion from #1780.