Closed SophieHerbst closed 5 months ago
This first cached line is for the epochs
┌────────┬ preprocessing/_08a_apply_ica ────────────────────────────────────────────────────────────────────────
│13:51:36│ ✅ sub-155 Computation unnecessary (cached) …
The thresholds affect fitting ICA, so you need to rerun that step and the applying step
Thanks for the reply @larsoner. I don't understand why the thresholds affect the fitting. I meant those parameters: ica_ctps_ecg_threshold = 0.05 ica_eog_threshold = 2.0
The thresholds affect fitting ICA, so you need to rerun that step and the applying step
Isn't that a bug then? Shouldn't we auto-rerun ICA fitting if config values related to the ICA step change?
I don't understand why the thresholds affect the fitting.
They don't, but fitting and ExG artifact detection are performed in the same step (run-ica), hence that step needs to be re-run. Maybe it should be split into two steps, a) fitting and b) artifact detection
I thought that the whole point of having two separate files for run and apply ICA was to run the ICA once and be able to adjust the thresholds for eog/ecg rejection without recomputing ICA
that is how it was in 'the old pipeline': https://github.com/brainthemind/CogBrainDyn_MEG_Pipeline/blob/master/06a-apply_ica.py Did it get lost?
Isn't that a bug then? Shouldn't we auto-rerun ICA fitting if config values related to the ICA step change?
To detect this in the general case this would require traversing the whole processing tree. In other words, if a user selects a single step, don't treat it like "just run this" but instead like a stopping step, "run everything up to this". Like if a user changed their filter settings but only asks to run source estimation, quite a few preceding steps actually need to be rerun.
I thought that the whole point of having two separate files for run and apply ICA was to run the ICA once and be able to adjust the thresholds for eog/ecg rejection without recomputing ICA
The report generated through run-ica is supposed to help you determine which components to remove in the next step (apply-ica)
To assist you, we run the automated artifact detection
The results end up together with all components in the ica+components report, e.g. http://mne.tools/mne-bids-pipeline/stable/examples/ERP_CORE/sub-015_ses-ERN_task-ERN_proc-ica+components_report.html
The apply-ica step then looks into the ICA component CSV file to figure which components you would like removed
Like I said above, I think your point of separating the artifact detection step from the fitting step is fair. I think we should do, in three steps:
so you could fiddle with artifact detection settings (step 2) without having to re-fit ICA (step 1)
To detect this in the general case this would require traversing the whole processing tree. In other words, if a user selects a single step, don't treat it like "just run this" but instead like a stopping step, "run everything up to this".
So would would have been a viable solution for @SophieHerbst now?
Simply re-run the entire preprocessing
pipeline and rely on caching to skip e.g. filtering etc?
So would would have been a viable solution for @SophieHerbst now? Simply re-run the entire
preprocessing
pipeline and rely on caching to skip e.g. filtering etc?
We don't have a "please run everything up to this step" functionality right now, do we? Do you think we could implement this somehow?
And why is the run-ica step before make_epochs? As I see from the code it creates epochs and runs ICA on them.
And why is the run-ica step before make_epochs? As I see from the code it creates epochs and runs ICA on them.
ICA creates an entirely separate set of epochs to allow for applying different filters before fitting (e.g., 1 Hz high-pass for ICA while you'll maybe want 0.1 Hz high-pass for your "actual" data)
I was just confused to see that it appears before epochs in the steps and that the output talks about the raw runs. But I get it now.
I was just confused to see that it appears before epochs in the steps and that the output talks about the raw runs. But I get it now.
Yes so it iterates over the raw files in order to create epochs
it's actually a good sign this caught your attention – Eric just recently changed the order of things to what it is now to highlight the fact that ICA is not fitted to your "regular" epochs
but it appears this needs better documentation and / or logging output
but it appears this needs better documentation and / or logging output
And the steps in the documentation need to be renamed accordingly
Like I said above, I think your point of separating the artifact detection step from the fitting step is fair. I think we should do, in three steps:
Yes this is doable. We could probably consider renaming the steps a bit so we don't have to keep renaming the steps that follow if possible (like with 05a1, 05a2 or something instead of 05a, 06a but only a 05a for SSP and no 06a at all)
I was just confused to see that it appears before epochs in the steps and that the output talks about the raw runs. But I get it now.
Yes so it iterates over the raw files in order to create epochs... it's actually a good sign this caught your attention...
Yes this is exactly why I reordered it, I initially also thought ICA was fit on the task epochs but it's not :slightly_smiling_face:
So would would have been a viable solution for @SophieHerbst now? Simply re-run the entire preprocessing pipeline and rely on caching to skip e.g. filtering etc?
Yes personally nowadays I always just mne_bids_pipeline config.py
anytime I change anything and rely on caching to skip stuff. Data on SSDs this only has ~10s overhead for ~20 subjects data. Maybe it's not as viable on NFS-stored data or something -- @SophieHerbst can you give some information on how long it takes mne_bids_pipeline config.py
to run on your data without changing anything, i.e., to be a no-op that just says "cached" for every step? Maybe the mtime
checks are expensive or something.
We don't have a "please run everything up to this step" functionality right now, do we? Do you think we could implement this somehow?
Yeah it would be possible in principle to implement this, I'll put this in a new issue
And the steps in the documentation need to be renamed accordingly
@SophieHerbst what would you like to work on, if anything? Looks like we have
I'm happy to do any of these
@larsoner I am happy to give both a try but likely will need some supervision to get it right. Also, I might not get to it before next week, so if you have time to start go ahead. For the documentation, can you point me to how the website is updated?
Docs are all built from https://github.com/mne-tools/mne-bids-pipeline/tree/main/docs/source and also the contents of https://github.com/mne-tools/mne-bids-pipeline/blob/main/mne_bids_pipeline/_config.py , so depends on what you need to update. I would use git grep
with a few words from the offending text to find the right file
Upon reflecting a bit... I think we should probably change both docs/source/features/steps.md
and docs/source/features/overview.md
to be generated using Python code. We can use the module __doc__
(ideally the title, but could be some other private var). @SophieHerbst okay for me to do this? That way you don't have to manually update anything and we'll automatically stay up-to-date...
Hah, looks like we already do that with steps.md
. So we need to do it with features.md
as well...
Since the doc part is in #860 now (I hope), I re-titled this issue to be about splitting ICA into three steps:
ICA.fit
(which is done on task epochs, but they're separate from the ones eventually used because filtering differs)ICA.apply
ICA.fit (which is done on task epochs, but they're separate from the ones eventually used because filtering differs)
yes this is what I meant to say, sorry if there was any confusion!
@hoechenberger @larsoner do you think a new release could be done with the changes in ICA? Then I could run with the stable version once more, hand over to the student, and hopefully not touch preprocessing for this dataset again.
I'd like to get #863 in, then I don't see why not :+1:
somebody would need to implement the "ICA splitting" first 😅
I can tackle that this week I think, I don't think it'll be too bad
sorry, I didn't have the time to contribute to this. let me know I can test anything
I wanted to re-run the apply_ICA step after changing the thresholds for automatic component rejection in my config.
I had three surprises/ observations:
@hoechenberger how do I rerun this step? If the other points are just a matter of changing the doc, I can do it with a few how-to pointers from you
(versions: latest main of mne, pipeline, and mne-bids)