mne-tools / mne-python

MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python
https://mne.tools
BSD 3-Clause "New" or "Revised" License
2.71k stars 1.32k forks source link

Inconsistencies with tmin/tmax #2787

Closed choldgraf closed 8 years ago

choldgraf commented 8 years ago

There are some inconsistencies in inclusiveness for tmin/tmax in Epochs. When you create an epochs object using tmin/tmax, there's one extra index (on the tmin side of things) when you compare this with doing epochs.crop. E.g.:

import mne
from mne import io
from mne.datasets import sample

# Set parameters
data_path = sample.data_path()
raw_fname = data_path + '/MEG/sample/sample_audvis_raw.fif'
event_fname = data_path + '/MEG/sample/sample_audvis_raw-eve.fif'

# Setup for reading the raw data
raw = io.Raw(raw_fname)
events = mne.read_events(event_fname)
tmin, event_id = -1., 1
tmax = tmin + 1000. / raw.info['sfreq']
epochs = mne.Epochs(raw, events, event_id, tmin, tmax, baseline=None, preload=True)
print('{}...{} | {}'.format(epochs.times[:5], epochs.times[-5:], epochs.times.shape))
epochs.crop(tmin, tmax)
print('{}...{} | {}'.format(epochs.times[:5], epochs.times[-5:], epochs.times.shape))

gives

[-1.00064103 -0.99897607 -0.99731111 -0.99564615 -0.99398119]...[ 0.65765924  0.6593242   0.66098916  0.66265412  0.66431908] | (1001,)
[-0.99897607 -0.99731111 -0.99564615 -0.99398119 -0.99231623]...[ 0.65765924  0.6593242   0.66098916  0.66265412  0.66431908] | (1000,)

(+ a warning about tmax being too large)

I feel that tmin/tmax should behave the same in both cases. This probably isn't usually an issue unless the user needs to have index-level specificity (e.g., if they need a length 2**N signal)

agramfort commented 8 years ago

I would consider this a bug of the crop method....

choldgraf commented 8 years ago

OK - I think it comes down to whether you want behavior A or B:

tmin = 0
tmax = 1000 / sfreq
epochs.crop(tmin, tmax)

A: len(epochs.times) = 1000 (this is the current behavior of crop) B: len(epochs.times) = 1001 (this is the current behavior of the init method

agramfort commented 8 years ago

A makes more sense...

dengemann commented 8 years ago

Also would vote for A.

On Fri, Jan 15, 2016 at 1:19 PM, Alexandre Gramfort < notifications@github.com> wrote:

A makes more sense...

— Reply to this email directly or view it on GitHub https://github.com/mne-tools/mne-python/issues/2787#issuecomment-171946634 .

jasmainak commented 8 years ago

epochs.crop appears to have a bug. If you look at epochs.times it removes the first index, that is epochs.times[0] is removed even though tmin=None. To me, this is buggy. So, A is buggy as well.

jaeilepp commented 8 years ago

I think is somewhat related to https://github.com/mne-tools/mne-python/pull/2657. I'm worried changing this might break some stuff (or at least be not as trivial to change as it seems), since stuff like first = int(round(epochs.tmin * info['sfreq'])) is done around the code. In this particular case:

tmin = -1.0
first = int(round(tmin * info['sfreq']))
print first

prints -601, whereas

first = int(round(epochs.tmin * epochs.info['sfreq']))
print first

prints -600 because epochs.tmin == -0.99897606579193932

larsoner commented 8 years ago

epochs.crop appears to have a bug. If you look at epochs.times it removes the first index, that is epochs.times[0] is removed even though tmin=None

In the first (top) snippet posted, he crops with tmin=-1 not tmin=None, and it removes the time point -1.00064103, which seems reasonable to me. Or did you mean something else?

larsoner commented 8 years ago

I think is somewhat related to #2657. I'm worried changing this might break some stuff

Yeah in some places we use floor-like behavior, and as you point out in others we use round-like behavior. For cropping I think we do greater-or-equal-to-lower-bound and less-than-or-equal-to-upper-bound, with some slop built in for floating point using _time_mask. I expect any changes we make to make these things more consistent will end up changing lots of little behaviors. But maybe it's worth it...?

jasmainak commented 8 years ago

okay, I see. So, it seems this happens because tmin is not equal to epochs.tmin. What do you think about adding an option to crop by samples? That way, these corner cases will be taken care of internally.

larsoner commented 8 years ago

What do you think about adding an option to crop by samples?

We could do that, but we would still want to address/fix why doing Epochs(tmin=a, tmax=b) != Epochs(tmin=a, tmin=b).crop(a, b) currently. So I think option A) takes care of that, so it gets my +1. Having those be consistent might obviate the need for having such sample-based cropping.

Sounds like we have at least three +1 for (A), any volunteer to try implementing it to see how much stuff it breaks?

jasmainak commented 8 years ago

We could do that, but we would still want to address/fix why doing Epochs(tmin=a, tmax=b) != Epochs(tmin=a, tmin=b).crop(a, b) currently.

But isn't this expected as you already pointed out because epochs.tmin != tmin. That is what is causing the offset. A is already what we have ... no?

larsoner commented 8 years ago

We don't have A) currently. A) is a proposed solution to the bug reported in the top comment. Basically right now we have Epochs(tmin=a, tmax=b) != Epochs(tmin=a, tmin=b).crop(a, b) and this can reasonably be considered a bug. Doing A) -- namely, setting the tmin in the Epochs constructor the same way it's set in the .crop() method -- will fix it.

jasmainak commented 8 years ago

You should do Epochs(tmin=a, tmax=b).crop() and that will be equal to Epochs(tmin=a, tmax=b).

larsoner commented 8 years ago

The argument is that we should have this behavior, and we don't:

Epochs(tmin=a, tmax=b) == Epochs(tmin=a, tmax=b).crop() == Epochs(tmin=a, tmin=b).crop(a, b)
jasmainak commented 8 years ago

I'm -1 on changing how it's done in the constructor. It will massively break things.

jasmainak commented 8 years ago

The argument is that we should have this behavior, and we don't:

Epochs(tmin=a, tmax=b) == Epochs(tmin=a, tmax=b).crop() == Epochs(tmin=a, tmin=b).crop(a, b)

I see that more as an issue with the crop method than the constructor ...

larsoner commented 8 years ago

It will massively break things.

I think we need to see how badly it will break things, and weigh that against the improved consistency. FWIW for data I analyze I expect it will have no effect because I generally use tmin/tmax that are round multiples of my sample rate, so the behavior should not change. It can cause 1-sample changes for other sample rates (e.g., 600.blah Hz that MGH uses). Hopefully no code depends critically on the inclusion of this single sample, or if it does, people can rectify it easily enough in their code with minor tweaks. So I'm +1 on trying it to see the impact.

I see that more as an issue with the crop method than the constructor ...

I think the way crop does it is better, though. It's also more consistent with how we treat times everywhere else I can think of in the package, too, since they mostly use _time_mask IIRC.

jasmainak commented 8 years ago

okay, let's wait for 3rd opinions then. I think the crop method is not consistent itself ... so that needs to be fixed in any case

larsoner commented 8 years ago

@agramfort @dengemann any change of opinion on if A) is still the way to go?

agramfort commented 8 years ago

yes A) is the way to go but I agree with @jasmainak that I'd rather change crop that the behavior of the constructor. The constructor is consistent with the behavior of the C code to AFAIK

larsoner commented 8 years ago

I think changing the behavior of crop is option B, so you're in favor of that?

larsoner commented 8 years ago

And conversely, changing the constructor is option A.

choldgraf commented 8 years ago

FWIW you can also get the crop method to cut off the tmax limit as well, with the right fudging of sfreq etc.

agramfort commented 8 years ago

I'd try B first then

larsoner commented 8 years ago

Yeah the problem is one (constructor) uses rounding and the other (_time_mask under the hood) doesn't, so you can get pathological behavior from either method (at either end) depending on what behavior you expect. The constructor effectively uses "within half a sample or greater", and time mask uses "within numerical precision or greater", at least for the tmin case. To make them consistent we have to decide which code to break.

larsoner commented 8 years ago

@dengemann are you also persuaded to use B now?

agramfort commented 8 years ago

so if crop crops at the nearest sample we recover the behavior of the init?

If so I feel it's a less invasive change to modify the behavior of crop.

larsoner commented 8 years ago

so if crop crops at the nearest sample we recover the behavior of the init?

Yes, we should.

If so I feel it's a less invasive change to modify the behavior of crop.

If we want to maintain consistency (i.e., all places that use tmin/tmax will behave similarly), then changing crop really means changing how _time_mask works.

It looks like the int(round()) method for determining time bounds is used in:

In other words, it's mostly related to Epochs.

The _time_mask method is used in:

I'm not really convinced from this that changing the behavior of crop / _time_mask is less invasive than changing the behavior of Epochs. But if maintaining compatibility with C is really paramount, and changing the behavior of a fundamental class constructor seems too drastic, then maybe it's worth changing _time_mask to act like Epochs.__init__. Or maybe we live with the inconsistency and do nothing...

FWIW, I don't think MNE-C is internally consistent about tmin/tmax all the time either -- I've had a single sample dropped off the end during dipole fitting when not providing tmin/tmax, probably due to this same issue.

larsoner commented 8 years ago

I'm not really convinced from this that changing the behavior of crop / _time_mask is less invasive than changing the behavior of Epochs

Thinking about it a bit further, there are often a number of steps that often happen after Epochs creation, so function use counts isn't a great metric. It might indeed end up being less intrusive to change _time_mask even though it touches all these functions. You win @jasmainak @agramfort I'll live with it :)

agramfort commented 8 years ago

yeah :)

you volunteer for the PR?

jasmainak commented 8 years ago

:) I think this is related to how the tmin is stored in the epochs object. If epochs.tmin was the same as the user-supplied tmin in the constructor, there would be no surprises. We don't store the user-supplied tmin anywhere do we? I would be happy with just a warning if the tmin for the crop method was the same as the tmin in the constructor of Epochs asking the user to supply None instead.

larsoner commented 8 years ago

We don't store the user-supplied tmin anywhere do we?

No, we don't store it anywhere.

I might have time to try a PR this week.

jasmainak commented 8 years ago

cool :)