BaseRaw + io module - Githubissues

agramfort commented 10 years ago

I'd like to summarize what I feel needs to be done on the BaseRaw class + io module. Here are the few steps I would take

rename mne.fiff to mne.io
mv mne/fiff/evoked.py to mne/evoked.py. It's much more natural to type mne.read_evoked etc...
mv mne/fiff/pick.py to mne/pick.py
add all the necessary deprecation warnings
make sure all examples still run
update all examples
make sure the doc still builds

then introduce a BaseRaw class in mne/io/base.py , define the interface and update the base class in every object in the Raw family

do I forget something?

any volunteer? :)

dengemann commented 10 years ago

@agramfort I'll take care of it over the holidays.

dengemann commented 10 years ago

I'm on it now

dengemann commented 10 years ago

@agramfort now that I'm in the middle of my edits it occurs to me how inconsistent it feels to have mne.Evoked, mne.Epochs but not mne.Raw., it really looks stupid next to each other in any example. Also I think it would be cool to be able to say from mne import Raw, Epochs, Evoked I think io should really be about converting, low level reading and writing. My 2 cents.

agramfort commented 10 years ago

well the meaning of io is raw io in my mind.

the io module should contain only the BaseRaw and Raw* classes.

larsoner commented 10 years ago

@agramfort it would be strange to me to have a submodule named mne.io, but not have things like read_evoked, read_source_estimate, etc. not also be under it. That is what I had been assuming would happen -- all our I/O-related functions would be grouped under that module. Sorry I didn't bring it up earlier when you first proposed this...

dengemann commented 10 years ago

@agramfort I think we agree on this. We simply should expose the most frequently used callables in mne. To me this would be Raw but not Base classes or conversion classes. It really looks kind of unpractical in an example to import io only to access the Raw constructor while mne.Epochs and mne.Evoked works. Same for pick_types ... cleanup is overdue

agramfort commented 10 years ago

@Eric89GXL if we move all the read_* functions in io then we move many many things there and need to split existing files / classes. Would you move the source_estimate.py file there and remove the SourceEstimate class from mne namespace? to me it would be too much.

agramfort commented 10 years ago

I have in mind to type:

mne.io.read_raw_fif mne.io.read_raw_edf mne.io.read_raw_kit

and

picks = mne.pick_types(...)

On Tue, Dec 24, 2013 at 5:55 PM, Denis A. Engemann <notifications@github.com

wrote:

@agramfort https://github.com/agramfort I think we agree on this. We simply should expose the most frequently used callables in mne. To me this would be Raw but not Base classes or conversion classes. It really looks kind of unpractical in an example to import io only to access the Raw constructor while mne.Epochs and mne.Evoked works. Same for pick_types... cleanup is overdue

— Reply to this email directly or view it on GitHubhttps://github.com/mne-tools/mne-python/issues/1003#issuecomment-31178748 .

dengemann commented 10 years ago

I know that I suggested this. But thinking about it it while editing this I think a data object constructors should have a consistent top-level API. flat is better than nested :-) (don't reply with namespaces are a honking great idea).

On Dec 24, 2013, at 5:59 PM, Alexandre Gramfort notifications@github.com wrote:

I have in mind to type:

mne.io.read_raw_fif mne.io.read_raw_edf mne.io.read_raw_kit

and

picks = mne.pick_types(...)

On Tue, Dec 24, 2013 at 5:55 PM, Denis A. Engemann <notifications@github.com

wrote:

@agramfort https://github.com/agramfort I think we agree on this. We simply should expose the most frequently used callables in mne. To me this would be Raw but not Base classes or conversion classes. It really looks kind of unpractical in an example to import io only to access the Raw constructor while mne.Epochs and mne.Evoked works. Same for pick_types... cleanup is overdue

— Reply to this email directly or view it on GitHubhttps://github.com/mne-tools/mne-python/issues/1003#issuecomment-31178748 .

— Reply to this email directly or view it on GitHub.

larsoner commented 10 years ago

@agramfort that makes sense. Do you think we should not expose Raw as a way of reading raw files, but instead have people use read_raw_fif (e.g., for the FIF case)? @dengemann one disadvantage to having mne.Raw is that it would break this pattern, and make FIF-raw reading "exceptional" in how it's accessed/used.

larsoner commented 10 years ago

I think it makes the most sense to deprecate direct instantiation of raw using Raw(...) and instead have people use read_raw_fif, since it's more explicit. And we can do the same thing with Evoked(...) versus read_evoked. This seems the cleanest, most consistent way to do it to me. I'm fine with having these be under mne or mne.io.

dengemann commented 10 years ago

i'm fine with functions, but it really should be consistent. no historical excuses ;-) also I don't see strong reasons to withhold the most commonly used functions from being exposed as top-level functions. usability over taxonomic desires, please.

dengemann commented 10 years ago

I won't continue here before we reach consensus, this PR is too fidgety otherwise. Let's shoot for a post-holiday / xmas hangout if necessary.

larsoner commented 10 years ago

The only remaining debate is over whether to put the read_raw_* functions under mne or mne.io, right?

dengemann commented 10 years ago

The only remaining debate is over whether to put the readraw* functions under mne or mne.io, right?

and how many of all other read_* functions should follow, e.g. read_ica, etc.

likewise, how much of pick to expose in mne name space.

dengemann commented 10 years ago

I'd prefer mne to io (read already makes the purpose explicit) but consistency would even be more important to me. Now that we have the chance to change it we should stop scattering io functions across subpackages ...

mluessi commented 10 years ago

I think it would be good to have all file-reading functions in a separate sub-package, so I vote for either using mne.io.read_raw_fif etc. or mne.io.fiff.read_raw. The former would be a flatter hierarchy, the later would be better if we ever decide to be able to things other than raw from non-fif files. E.g., we would use mne.io.fiff.read_epochs() or mne.io.h5.read_epochs(). Having a separate sub-package for each file type will make it more obvious what read/write operations are supported (esp. when e.g. using auto-completion in IPython).

Regarding having a BaseRaw class in mne/io/base.py. I think we should put it in mne/raw.py, this will be more consistent with other types (Evoked etc.).

dengemann commented 10 years ago

I'm confused. Let me try to do some score-keeping.

so I vote for either using mne.io.read_raw_fif etc. or mne.io.fiff.read_raw

currently the thing fiff is contrasted with is not hdf5, etc., but formats from other MEG vendors using a suffix notation, not a general prefix notation. I'm not sure what the solution might be here. Maybe:

mne.io.fiff.read_raw
mne.io.kit.read_raw
mne.io.hdf5.read_raw
mne.io.hdf5_ctf_from_my_lab.read_raw

# and so on?

The former would be a flatter hierarchy, the later would be better if we ever decide to be able to things other than raw from non-fif files.

I agree. but currently this is not in sight. hdf5 might be the next thing cf. NEO support. We should definitely chose an API that does not block such future developments. on the other hand side we should not give too much weight to thing things we (Y)AGN(I). Admittedly, in this case is a bit different in that we try to find an API that is both maximum general and simple ....

Having a separate sub-package for each file type will make it more obvious what read/write operations are supported (esp. when e.g. using auto-completion in IPython).

well auto-completion works equally well with underscores ;-)

Regarding having a BaseRaw class in mne/io/base.py. I think we should put it in mne/raw.py, this will be more consistent with other types (Evoked etc.).

I'm trying to wrap up. So you (@mluessi) would like to have a base class in mne but all readers spreaded across format specific modules (prefix notation). @agramfort and @Eric89GXL want do deprecate constructors but basically converge on io semantics using a suffix notation. I would like to have either functions or constructors (preferably functions) in way that is as consistent and flat as possible without excuses (read_ica, read_stc, read_raw in one place OR in their dedicated packages, but not one top-level, the other nested). But I would actually have a strong preference for top level readers because read already indicates the purpose ( io ) while implementations and low level things should live in a dedicated io module. All great packages have general readers (numpy, scipy, pandas) while special tools live in nested io modules --- stuff you only want to get in, mostly read only. I'm really afraid that we'r about to do something really bureaucratic or too smart.

I guess we need postpone this to a point where all of us can meet in a hangout.

mluessi commented 10 years ago

I don't feel very strongly about it, I'm fine with either mne.io.read_raw_fif etc. or mne.io.fiff.read_raw. I just wanted to point out that the latter has potential advantages if we decide to support non-fiff formats for writing at some point.

agramfort commented 10 years ago

It looks like we really need to hangout to converge... let's plan this for after the holidays. I'm back at work on jan 6

larsoner commented 10 years ago

Jan 6th or later works for me, but I am not too passionate about it so I'm okay if you can work it out without me (esp. since I am the limiting time factor).

maedoc commented 10 years ago

@dengemann pointed me here to discuss how to create bipolar montages for stereotactic EEG, for which I'm currently manipulating the info and _data attributes on a Raw instance directly. He pointed out that the idiomatic way to do this is with an appropriate subclass.

While we are simply recomputing channel data and changing labels, bipolar sEEG is, technically, the use of adjacent channels as references, but I didn't see any way to do that in MNE currently.

Does this fit somewhere into the above discussion?

agramfort commented 10 years ago

can you share a gist of the "hack" you do with the info and the _data field? we recently added a method to set the EEG reference in Raw object, maybe we can do something for sEEG too.

dengemann commented 10 years ago

@mmwoodman indeed not all aspects of the above discussion match your use case. But the general topic BaseRaw is essentially on how to support custom data in MNE-Python objects that don't match the existing schemes. A gist would be helpful to see how you read in you sEEG data and how you do the referencing.

maedoc commented 10 years ago

I have four steps from the raw bti data import

l. 16 pick EEG channels
l. 58 pick ['chs'][:]['kind']==2 as after picking EEG, some MEG chs remained
l. 93 sort channels by name
l. 151 bipolarize channels

Sorry for the excessive length, it's exported from a notebook I'm working on.

dengemann commented 10 years ago

@mmwoodman that's indeed very helpful. It helps me understanding that the io is not the crucial part of your problem since you apparently record the sEEG with the 4D Magnes system. Maybe it's a separate issue but we could probably add parts of your code to the BTi converter and / or add a designated sEEG montage function ... to be discussed.

mluessi commented 10 years ago

Any chance the bipolar montage could be computed using an SSP projector? This would be more efficient and convenient.

maedoc commented 10 years ago

I'm not familiar with SSP, but a (sparse) projection can easily be constructed to compute the bipolar data.

mluessi commented 10 years ago

@mmwoodman if you can write it as a projection it may work. Have a look here for the EEG average reference projection:

https://github.com/mne-tools/mne-python/blob/master/mne/fiff/proj.py#L524

Using an SSP would have the advantage that you wouldn't need to use preloading and you can switch on/off the projection later (e.g. when plotting evoked responses).

maedoc commented 10 years ago

@mluessi thanks for the tip. Are these the projections in raw.info['projs']? Can they be sparse? I had to work with this by hand today because the picks= functionality didn't manipulate the projections correctly and the arrays (at least for the EEG average) are dense.

The preloading and switching on/off is indeed useful as we often see bipolar/monopolar as complementary, not exclusive, so I'll look into writing an appropriate function.

larsoner commented 10 years ago

Why do you need them to be sparse? I don't anticipate hitting memory problems, and it may or may not be faster depending on whether sparse matrix multiplication in scipy is actually faster than MKL + multithreaded BLAS in numpy...

It seems like if all you're doing is subtracting pairs of channels, it should be possible with a single matrix. Whether or not this can be represented as a projection vector, I'm not sure.

maedoc commented 10 years ago

They don't need to be sparse; 'twas but a curiosity. The number of sEEG electrodes doesn't exceed ~60, at least in our case due to the amplifier.

It does require a matrix, e.g. for a single implantation it would be (diag(ones((n, ))) - diag(ones((n-1, )), -1))[:-1].

agramfort commented 10 years ago

using the SSP mechanism seem complicated

I see a few things that could be done:

add an sEEG channel type in constants.FIFF and add the corresponding seeg param to the pick_types functions in pick.py
create a function like apply_montage (for which we would define a good API to be defined)

dengemann commented 10 years ago

In addition we could add a bti function for renaming / setting channel types.

maedoc commented 10 years ago

I opened #1070 to continue sEEG discussion and avoid derailing the original discussion here.

dengemann commented 10 years ago

@agramfort where would you put the RawBase?

agramfort commented 10 years ago

In fiff/base.py

On 27 janv. 2014, at 09:50, "Denis A. Engemann" notifications@github.com wrote:

@agramfort where would you put the RawBase?

— Reply to this email directly or view it on GitHub.

dengemann commented 10 years ago

@agramfort I'll open a PR later such that we can discuss the interface the ABCMeta is supposed to dictate.

mluessi commented 10 years ago

Regarding being able to read non-fiff formats for things other than Raw, we are currently discussing read/write functions for FieldTrip data (Epochs, Evoked, etc.), which btw is the reason for https://github.com/mne-tools/mne-matlab/pull/9 which will allow FT to read Epochs. So we should keep this in mind when designing the new io module.

dengemann commented 10 years ago

@mluessi I was also thinking about about .to_fieldtrip etc methods ...

dengemann commented 10 years ago

@agramfort @Eric89GXL @mluessi ...

To simplify convergence, could we please vote what to include in the interface definition:

`print '\n - [ ] '.join(dir(raw))``

[ ] "contains"
[ ] "del"
[ ] "delattr"
[ ] "dict"
[ ] "doc"
[ ] "enter"
[ ] "exit"
[ ] "format"
[ ] "getattribute"
[x] "getitem"
[ ] "hash"
[ ] "init"
[ ] "len"
[ ] "module"
[ ] "new"
[ ] "reduce"
[ ] "reduce_ex"
[ ] "repr"
[ ] "setattr"
[ ] "setitem"
[ ] "sizeof"
[ ] "str"
[ ] "subclasshook"
[ ] "weakref"
[ ] "_add_eeg_ref"
[ ] "_first_samps"
[ ] "_initialize_fids"
[ ] "_last_samps"
[x] "_orig_comp_grade"
[ ] "_parse_get_set_params"
[ ] "_preload_data"
[x] "_preloaded"
[x] "_projector"
[ ] "_raw_lengths"
[ ] "_read_raw_file"
[ ] "_read_segment"
[x] add_proj
[x] append
[x] apply_function
[x] apply_hilbert
[x] apply_proj
[x] as_data_frame
[x] cals
[x] ch_names
[x] close
[x] comp
[x] copy
[x] crop
[x] del_proj
[x] drop_channels
[ ] estimate_rank
[x] fids
[x] filter
[x] first_samp
[x] index_as_time
[x] info
[x] last_samp
[ ] load_bad_channels
[x] n_times
[x] notch_filter
[ ] orig_format
[x] plot
[x] plot_psds
[x] proj
[ ] rawdirs
[x] resample
[x] save
[x] time_as_index
[x] to_nitime
[x] verbose

mluessi commented 10 years ago

@dengemann I think for saving, maybe we could have a format kwarg in the save methods.

mluessi commented 10 years ago

@dengemann ?? I don't get it..

dengemann commented 10 years ago

@mluessi vote for the ABCMeta methods ;-)

dengemann commented 10 years ago

@agramfort @mluessi a few of the attributes are computed during runtime. We would need to make them abc.abstractproperties to make them serve as interface definitions. Another question is which things shall already be implemented in RawBase as concrete methods ...

Any thoughts?

mluessi commented 10 years ago

@agramfort why do you suggest to put the RawBase class into fiff/base.py? Maybe I'm misunderstanding something, but isn't the point of this PR to make IO less fiff-centric? I think it would make more sense to have the base class in mne/raw.py and format specific readers in the .py file for the respective file format.

@dengemann are you trying to decide which methods/attributes from Raw should go into BaseRaw? I think almost everything except the things that are fiff-specific (read block, etc). You want as many methods as possible in the base class so all the features are available in all derived classes (which implement few format specific methods for reading and writing).

agramfort commented 10 years ago

whatever method is not specific to fiff files :)

agramfort commented 10 years ago

first we create the base class then we refactor the io module to move stuff from mne/fiff to mne/ ok?

mluessi commented 10 years ago

@agramfort +1 let's create the base class and the derived class for fiff first.

Given that we are now planing to implement readers for FT, do you guys think it is still the best idea to implement a flat hierarchy or would hierarchical be better? I.e., should we use mne.io.read_epochs_fieldtrip etc or put things for each file format in a separate sub-package, i.e., mne.io.fieldtrip.read_epochs. IMO, the later would allow for a cleaner organization and one could use from mne.io.fieldtrip import * for less typing.

agramfort commented 10 years ago

fine with mne.io.fieldtrip

mne-tools / mne-python

BaseRaw + io module #1003