Closed by SophieHerbst 3 months ago
The group averaging should be re-run, CSP should not, yes
So there's a bug somewhere
@hoechenberger can you give me a hint on where to look for the config parameters that are checked for changes (just tell me which file)?
@SophieHerbst there is some magic that happens with
To make calls like this magically get checked
Part of the magic is that this `failsafe_run` checks the `out_files` returned by the decorated function, hashes them, and adds them to the set of things hashed. But if there are no output files, I suspect:
Assuming this is the problem, we could somehow try to invalidate the cache of call A or something, but that would be hard to get right. An easier approach would be to make sure that every step produces a non-empty `out_files` (keeping in mind that the report is never in that set of files). For something like `average_evokeds` this already happens, but perhaps for the decoding step it does not. If we simply write out an extra `.csv` -- or even a trivial hidden file `._99_group_average_run` with something that will change based on the `cfg` parameters (like the parameters used for clustering or something) -- then the mtime-or-hash check on this file will cause a cache miss for the `failsafe_run` the second time you run with parameter set A.
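To make the suspected failure mode concrete, here is a minimal toy sketch in Python. It is not the pipeline's actual caching code: `ToyStepCache`, `run_sequence`, the `report.html` file name, and the threshold values are all hypothetical, and the real `failsafe_run` machinery may differ in many details. The sketch only assumes what is described above: results are remembered per parameter set, and a remembered result is reused as long as the hashes of its recorded `out_files` still match.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_hash(path):
    """Return a content hash for one output file."""
    return hashlib.sha1(Path(path).read_bytes()).hexdigest()


class ToyStepCache:
    """Toy cache (hypothetical, not the pipeline's actual implementation)."""

    def __init__(self):
        # Maps a serialized cfg to {out_file: hash recorded after that run}.
        self._records = {}

    def run(self, step, cfg):
        key = json.dumps(cfg, sort_keys=True)
        recorded = self._records.get(key)
        if recorded is not None:
            # Cache hit only if every recorded output is still unchanged.
            # With no recorded outputs, all() over an empty dict is True,
            # so the step is skipped no matter what happened in between.
            unchanged = all(
                Path(name).exists() and file_hash(name) == digest
                for name, digest in recorded.items()
            )
            if unchanged:
                return "cached"
        out_files = step(cfg)
        self._records[key] = {str(f): file_hash(f) for f in out_files}
        return "ran"


def run_sequence(with_marker):
    """Run the step with parameter sets A, B, A and report what happened."""
    with tempfile.TemporaryDirectory() as tmp:
        out_dir = Path(tmp)

        def group_average(cfg):
            # The report is rewritten but never listed in out_files.
            (out_dir / "report.html").write_text(json.dumps(cfg))
            if not with_marker:
                return []
            # Marker file whose content depends on the cfg parameters.
            marker = out_dir / "._99_group_average_run"
            marker.write_text(json.dumps(cfg, sort_keys=True))
            return [marker]

        cache = ToyStepCache()
        param_a = {"cluster_forming_t_threshold": 2.0}  # illustrative value
        param_b = {"cluster_forming_t_threshold": 3.0}  # illustrative value
        return [cache.run(group_average, cfg) for cfg in (param_a, param_b, param_a)]


print(run_sequence(with_marker=False))  # ['ran', 'ran', 'cached']
print(run_sequence(with_marker=True))   # ['ran', 'ran', 'ran']
```

Without the marker file, the third call (parameter set A again) is skipped even though the report now reflects parameter set B. With the marker file, the stored hash for A no longer matches the file that B overwrote, so the step runs again.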
@SophieHerbst do you want to try fixing this or prefer I give it a shot? Not sure how involved it will be -- might just be one step that needs fixing, or several. (And I might be wrong about this explanation!)
... and for this explanation note that the subject reports are (rightly) never considered in the set of `out_files` to hash, since the reports are updated almost every step.
Hi @larsoner, thanks for the helpful explanation. I should be in your case 2, as I did change the parameters `cluster_forming_t_threshold` and `cluster_permutation_p_threshold`.
- Calling with parameter set A will cause a re-run
- Calling with parameter set B (!=A) will cause a re-run
- Calling with parameter set A a second time will not cause a re-run, because the output of the callable will be unchanged
If you can give it a try, I think it would be more efficient, as this whole magic business is still somewhat obscure to me.
@larsoner @hoechenberger the rerun magic is killing me (when trying to change things in the report). Is there a way to force a complete rerun for a certain step, e.g. sensor/99_group_average?
Yes, in principle `--no-cache` plus `--steps` specifying that one step should do it for you:
```
$ mne_bids_pipeline --help
usage: mne_bids_pipeline [-h] [--version] [--config FILE] [--create-config FILE] [--steps STEPS] [--root-dir ROOT_DIR] [--
...
--steps STEPS The processing steps to run. Can either be one of the processing groups 'preprocessing', sensor',
...
--no-cache Disable caching of intermediate results.
```
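If you want to script this, a small Python sketch could assemble the call from the flags shown in the help output (`--config`, `--steps`, `--no-cache`). The helper `build_rerun_command`, the config file name `config.py`, and the step string `sensor/group_average` are assumptions for illustration; check `--help` for the exact step spelling your version accepts.

```python
import shlex


def build_rerun_command(config_path, step):
    """Build a mne_bids_pipeline call that reruns one step without the cache."""
    return [
        "mne_bids_pipeline",
        "--config", config_path,
        "--steps", step,
        "--no-cache",
    ]


# "config.py" and the step name are placeholders; adjust them to your setup.
cmd = build_rerun_command("config.py", "sensor/group_average")
print(shlex.join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list and only printing it keeps the sketch side-effect free; nothing is executed until you pass `cmd` to `subprocess.run` yourself.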
There really is a bug with the CSP code: even completely changing the parameters does not trigger a rerun. Only deleting the output xlsx files does.
Agreed, I think it's outlined in https://github.com/mne-tools/mne-bids-pipeline/issues/901#issuecomment-2015147155. I hope to fix it soon.
I expected that a change in these parameters would trigger the re-computation of the group statistics, but it did not. sensor/_05_decoding_csp is also not rerun (but I didn't expect it to be).