mne-tools / mne-bids-pipeline

Automatically process entire electrophysiological datasets using MNE-Python.
https://mne.tools/mne-bids-pipeline/
BSD 3-Clause "New" or "Revised" License
139 stars 66 forks source link

BUG: ICA unsafely overwrites components tsv file #881

Open SophieHerbst opened 7 months ago

SophieHerbst commented 7 months ago

This used to be a config option, but I cannot find it. In the code it still says:

"""Apply ICA.

!! If you manually add components to remove, make sure you did not re-run the ICA in the meantime. Otherwise (especially if the random state was not set, or you used a different machine) the component order might differ. """

hoechenberger commented 7 months ago

yes

the ICA artifact detection step produces a TSV file where you can mark mark components as good or bad

this will be respected when applying ICA

SophieHerbst commented 7 months ago

Thank you! I just figured it out, too. Can we document this somewhere?

hoechenberger commented 6 months ago

Thank you! I just figured it out, too. Can we document this somewhere?

We emit a log message:

https://github.com/mne-tools/mne-bids-pipeline/blob/b52742e0b1a08deac963f59dafff4ad2d0578a62/mne_bids_pipeline/steps/preprocessing/_06a2_find_ica_artifacts.py#L322-L328

Additional documentation would be great, I'm just not sure where exactly to put it 🤔

SophieHerbst commented 6 months ago

Ok, thank you. I overlooked that one. I was searching for info on the website, I thought it would be a config option first.

SophieHerbst commented 6 months ago

Additional documentation would be great, I'm just not sure where exactly to put it 🤔

Can we mention it in the configuration options of apply_ica even though it is not a configuration option?

hoechenberger commented 6 months ago

Additional documentation would be great, I'm just not sure where exactly to put it 🤔

Can we mention it in the configuration options of apply_ica even though it is not a configuration option?

Yeah not sure tbh … Maybe here at the top?

https://mne.tools/mne-bids-pipeline/stable/settings/preprocessing/ssp_ica.html

larsoner commented 6 months ago

I'll document momentarily, but I think there is a bigger issue (which I'm hijacking this one to bring up since the PR I'm about to open will fix the doc issue anyway!). Currently on main we are in a bit of a precarious situation because a user could:

  1. Run the pipeline
  2. Mark some components in the TSV
  3. Change some param of the pipeline that affects ICA artifact detection or anything before it (like a filter setting or MF setting or whatever)
  4. Have their TSV changes silently get wiped, but not realize it

There are ways we could avoid this. For example we could add a new ica_require_inspection: bool = True that checks the .tsv to see if it has been validated by requiring user input, for example we could add a final row:

image

And emit an error in the apply_ica step if the last "component" hasn't been set to "good". This would ensure for example if the automated detection step or fitting step is re-run -- which overrwrites the file! -- that a user has noticed and revalidated which components are bad.

I could do that as part of this PR or a follow-up if it seems reasonable.

SophieHerbst commented 6 months ago

@larsoner does that mean that the user has to validate the ICA components manually? I am in favor of this, but for large datasets it might be tedious?

larsoner commented 6 months ago

@larsoner does that mean that the user has to validate the ICA components manually?

It was already suggested in some messages that users do it, though I've streamlined them in #899:

image

I am in favor of this, but for large datasets it might be tedious?

Agreed but in those cases I think people can do ica_require_inspection = False in their configs. The choice to make the new default True is because -- if manual inspection or marking of other artifacts should be required -- I think it's likely users are already getting burned by this overwriting, so it's more of a bugfix to make the default be "show you've validated it". And people who don't want to can easily add this new param with True to skip it.

SophieHerbst commented 6 months ago

sounds good to me!