mne-tools / mne-python

MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python
https://mne.tools
BSD 3-Clause "New" or "Revised" License
2.71k stars 1.32k forks source link

Improvements to the ICA API #10601

Open mscheltienne opened 2 years ago

mscheltienne commented 2 years ago

@adam2392 pinging as discussed.

Hello! Following the release of mne-icalabel, I'd like to propose a couple of changes to the ICA API. In mne-icalabel, we used label_components as the main entry-point:

from mne_icalabel import label_components

raw = ...
ica = ...
label_components(raw, ica, method='iclabel')

The name was chosen to be consistent with the methods in an ICA instance: get_components and plot_components. For now, the only available method argument is 'iclabel', but the goal is to add more.

-> Proposed change: add a label_components method to the ICA instance that takes as input self, a raw or epochs instance, and the name of the method to use to label components.


But instead of only stopping here and clogging a bit more the MNE API, I'd like to go a bit further. For now, the ICA instance has build-in simple methods to label cardiac, ocular, MEG ref, and muscular-related components: find_bads_eog, find_bads_ecg, find_bads_ref and the newly added find_bads_muscle. Each of those methods' outputs is very similar to the output of label_components: labels and scores/prediction probabilities.

-> Proposed change: move those simple labeling methods to mne-icalabel, deprecate the find_bads_ methods, and group all labeling methods under a simple ica.label_components method.

IMO, this change would simplify the API and move the maintenance of the related code and documentation to the mne-icalabel repository.


And finally, alongside those 2 changes, labels and scores/prediction probabilities could be stored in the ICA instance. For now, only the labels are stored in a dictionary self.labels_ where the key is the method/component type and the value is a list of components IDx that were labelled with this type. Related issue: https://github.com/mne-tools/mne-python/issues/9846

WDYT?

cbrnr commented 2 years ago

I like it a lot!

larsoner commented 2 years ago

Proposed change: move those simple labeling methods to mne-icalabel, deprecate the findbads methods, and group all labeling methods under a simple ica.label_components method.

I'm fine with moving this functionality to a unified ica.label_components function, but I'd actually like to 1) keep one-line wrappers for find_bads_*, and 2) advertise that new code should use the unified function. Every deprecation means pain for our users, and I think we can avoid it easily here, especially for code that has been around for 8 years. The maintenance overhead for us in keeping it is low, and I think if we adapt tutorials + docstrings to advertise the new label_components function, users won't be too confused by our the violation of "one way" principle.

hoechenberger commented 2 years ago

Yes please no API changes here. Moving the code outside of MNE is a great idea though.

agramfort commented 2 years ago

let's not rush here. Let mne-icalabel get a bit of visibility and user feedback and then let's reevaluate in a few months.

Message ID: @.***>

larsoner commented 2 years ago

Moving the code outside of MNE is a great idea though.

Does MNE-ICLabel require PyTorch? If that's the case, given @hoechenberger's difficulties with installing PyTorch in https://github.com/mne-tools/mne-installers/pull/123 as it's quite a big dependency, I'm -1 on depending on MNE-ICLabel for core ICA functionality in MNE-Python. We should only depend on it for ICLabel functionality.

In other words, let's add ica.label_components in MNE-Python, keep our find_bads_* one-line wrappers for trivial backward compat (no need for deprecation), and just add a small shim in label_components to call to mne-iclabel when a user requests that mode.

mscheltienne commented 2 years ago
  1. Yes of course let's not rush here, the goal is to get opinions for now.
  2. Also yes, let's definitely not have PyTorch set as a hard dependency, especially for only one function. When we move the code from find_bads_* to mne-icalabel, we'll set pytorch as a soft dependency for ICLabel, with a warning if it's not installed.
larsoner commented 2 years ago

When we move the code from findbads* to mne-icalabel

My proposal above is actually not to do this, but to have as much ica.label_components logic in MNE-Python as possible/reasonable. But maybe I don't understand the end goal...

I am thinking that if a user uses the new API to do ica.label_components('ecg', ...) this will be 100% equivalent to ica.find_bads_ecg(...), and thus not need any iclabel-specific stuff. In other words, it will use the ECG channel in the data to find components as it does now, and does not require any mne-iclabel to be installed.

Then if a user does ica.label_components('ecg', method='iclabel', ...) then it tries to import mne-iclabel and use it to label.

mscheltienne commented 2 years ago

Oh, yes we can do it this way as well. I was proposing to move all IC-labelling stuff to mne-icalabel while removing the hard-dependency for torch from mne-icalabel. This way it can become a core dependency of mne, while not needing any additional dependencies to run.

So compare to your proposal, the idea is to import mne-icalabel (same dep. as mne-python) to use ica.label_components(), and in mne-icalabel, import optional dependencies as torch (or any other extra dependency specific to this model), if needed only.

larsoner commented 2 years ago

I was proposing to move all IC-labelling stuff to mne-icalabel while removing the hard-dependency for torch from mne-icalabel. This way it can become a core dependency of mne, while not needing any additional dependencies to run.

I like this proposal a bit less because if a user does pip install --upgrade mne now ICA is broken for them unless they install mne-iclabel as well. I also like the idea of making the scope of mne-iclabel to do labeling using iclabel, not all forms of labeling. It seems more appropriate, especially since find_bads_ecg and such have been around for so long. Then mne-iclabel can focus on just iclabel functionality, and not have to test all the find_bads_* functions that MNE-Python currently does already.

adam2392 commented 2 years ago

So it sounds like the only change we should do is https://github.com/mne-tools/mne-python/issues/9846?

Is there a consensus there? What are the next steps to enable storage of "scores" within the ICA labels?

mscheltienne commented 2 years ago

And add the label_components and the one-line wrappers. I'm surprised pip install --upgrade pkg doesn't install new dependencies for the updated version, but if that is the case, then yes moving the existing code and tests to mne-icalabel only brings downsides.

larsoner commented 2 years ago

I'm surprised pip install --upgrade pkg doesn't install new dependencies for the updated version

It would, but then we'd have to add another hard dependency.

Dependencies aside, though, I like the idea of keeping mne-iclabel's functionality focused on iclabel stuff as opposed to things like find_bads_* that MNE-Python already provides. It makes it clear that core ICA functionality lives in MNE, including label_components, and that the scope of mne-iclabel is iclabel-specific stuff.

mscheltienne commented 2 years ago

So, what is 'iclabel-specific'? We also want to include different models to label components, not just ICLabel. That's why I proposed to move the code from the find_bads_* to the other repo, as in the end they can be considered as simple models to label components.

But anyway, that was just an additional proposal to move some of the code from mne-python to a sub-repo. It's not that important, the core is really label_components and regrouping all labeling methods (inc. find_bads_*) in label_components, no matter where the code of find_bads_* lives.

hoechenberger commented 2 years ago

So, what is 'iclabel-specific'? We also want to include different models to label components, not just ICLabel. That's why I proposed to move the code from the find_bads_* to the other repo, as in the end they can be considered as simple models to label components.

Let's do it the other way around, then: ICALabel could internally call the MNE find_bads_* methods and expose this as, say, method='mne'.

The package depends on MNE already, so all would be good, I suppose!

larsoner commented 2 years ago

We also want to include different models to label components, not just ICLabel

label_components(..., method='iclabel') should definitely be in mne-iclabel. I'm not sure what other methods there are, but they could live in iclabel or MNE-Python, either way is fine by me. If they're already in mne-iclabel, they could stay.

To me the default method='channels' (or whatever we think is a good name for what find_bads_* does currently) should live in MNE-Python and continue to work with no other dependencies.

Let's do it the other way around, then:

This could work, too, I suppose. But it sounds like you're saying that we shouldn't add ica.label_components, but that seemed like a nice API unification point.

agramfort commented 2 years ago

Installing pytorch with installer seems ambitious unless you accept to only install a CPU version. but then you'll have people complaining to have a slow pytorch if they do anything else. I don't know if the installer can detect at runtime if cuda is available and if so install a GPU version. It seems a bit of work...

Message ID: @.***>

agramfort commented 2 years ago

yes for icalabel cpu is just what you need but if folks want to train deep net in the MNE environment it would be bad. Not sure what to recommend.

Message ID: @.***>

mscheltienne commented 2 years ago

yes.. thought about it after, you are right.

hoechenberger commented 2 years ago

yes for icalabel cpu is just what you need but if folks want to train deep net in the MNE environment it would be bad. Not sure what to recommend.

If we include ICALabel, I wouldn't mention the availability of PyTorch as long as we only have CPU support.

Capability detection & package selection at installation time is possible (CPU vs CUDA); the auto-selection is already part of the conda-forge package; the tricky (?) part is to fire up the package installation in a post-installation script after the installer has extracted everything else. It's all doable but I expect this to take a while until we'll get it to work well on all platforms. But this effort would come with the bonus point that once we have such a mechanism in place, we can in theory also ship packages from PyPI etc… I cannot work on this for the time being, though, as I'm busy with other stuff I'm afraid