Closed SophieHerbst closed 6 months ago
In the new complete run of the pipeline, I started using find_flat_channels_meg = True and find_noisy_channels_meg = True. Do these modify the empty room information in a later step, which triggers the re-run?
Without looking at the documentation or code, I would say yes, because this can change the information about which channels are to be marked as bad before running Maxwell-filter… and we try to keep the bad channels in sync between experimental runs and empty-room recordings
Hm ok. So no way to prevent the lengthy recomputation?
Ah wait. You first finished a complete pipeline run, then adjusted the ECG threshold, and when you re-run now, some earlier step is being re-run? Which one is that, preprocessing/_01_data_quality? That should not happen, no. And it only appears for the empty-room recording?
yep, it happens in preprocessing/_01_data_quality and only for empty room, yes. also, it happens only the first time; when I re-run it, it does not happen anymore
this shouldn't happen… I don't have time to reproduce or look into this now, though, sorry
No problem, I just wait for it to be finished once, I wouldn't want to use the development version anyways for this project. But it would be good to fix it in the future.
Can you upload one subject's raw bids data plus your config.py? I can look
@SophieHerbst given this is an issue with the empty-room data, can you upload sub-emptyroom/ses-19230318 (not the derivatives one but the bids_root / original one)?
Okay I think I see how this can happen. If two subjects A and B match to the same empty room recording, you can run the bad channel finding for that file twice, first for A then for B (assuming n_jobs=1). Then when you re-run the pipeline, a problem will be detected with the output file modified time, because both A and B will have written to the same output files, e.g.:
$ ls -l ~/mne_data/derivatives/mne-bids-pipeline/ds000117/sub-emptyroom/ses-20090409/meg/
total 176
-rw-rw-r-- 1 larsoner larsoner 12 Mar 14 15:15 sub-emptyroom_ses-20090409_task-noise_bads.tsv
-rw-rw-r-- 1 larsoner larsoner 174558 Mar 14 15:15 sub-emptyroom_ses-20090409_task-noise_scores.json
Although it will cause redundant calculations, the cleanest solution here is probably to save the _bads.tsv in subject A and B's derivatives folders separately. This is what ends up happening in the maxwell filter step anyway, since it can use different sets of bads for the two subjects.
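To make the mechanism concrete, here is a minimal, self-contained sketch of the collision and of the proposed fix. It is not the pipeline's actual caching code: `run_step`, `is_stale`, the file name `noise_bads.tsv`, and the channel names are hypothetical. It also assumes the cache records a content hash per output and that the two subjects end up with different bad-channel lists (with a modified-time check, even identical content would trigger the same effect).

```python
import hashlib
import tempfile
from pathlib import Path


def file_hash(path: Path) -> str:
    """Content hash used as the cache fingerprint in this sketch."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def run_step(subject: str, bads: list, out_path: Path, cache: dict) -> None:
    """Hypothetical stand-in for bad-channel detection on the empty-room file.

    Writes the bads list to out_path and remembers the hash of what
    this subject wrote.
    """
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text("\n".join(bads) + "\n")
    cache[subject] = (out_path, file_hash(out_path))


def is_stale(subject: str, cache: dict) -> bool:
    """True if the file on disk no longer matches what this subject wrote."""
    out_path, recorded = cache[subject]
    return file_hash(out_path) != recorded


def demo() -> dict:
    # Hypothetical bad-channel lists for two subjects sharing one empty-room file.
    subject_bads = {"A": ["MEG0111"], "B": ["MEG0111", "MEG0222"]}
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)

        # Shared output: B overwrites the file A wrote, so A's record is stale.
        shared_cache = {}
        shared_path = root / "sub-emptyroom" / "noise_bads.tsv"
        for subject, bads in subject_bads.items():
            run_step(subject, bads, shared_path, shared_cache)

        # Separate outputs: each subject writes into its own folder.
        separate_cache = {}
        for subject, bads in subject_bads.items():
            own_path = root / f"sub-{subject}" / "noise_bads.tsv"
            run_step(subject, bads, own_path, separate_cache)

        return {
            "shared": {s: is_stale(s, shared_cache) for s in subject_bads},
            "separate": {s: is_stale(s, separate_cache) for s in subject_bads},
        }


if __name__ == "__main__":
    print(demo())
```

With the shared path, the subject processed first is flagged for recomputation on the next run; with per-subject paths neither is, at the cost of running the detection once per subject.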
sorry, I was completely offline for some days. will try the fixes now!
I just finished a complete pipeline run (1.6), and now wanted to improve ica_cleaning. The only parameter I changed is ica_ctps_ecg_threshold, so I did not expect any steps before that to be rerun, but I receive:
│11:31:44│ 🚫 sub-215 run-noise Output file hash mismatch for /neurospin/meg/meg_tmp/TimeInWM_Izem_2019/BIDS_anonymized/derivatives/sub-emptyroom/ses-19230318/meg/sub-emptyroom_ses-19230318_task-noise_scores.json, will recompute …
This takes a lot more time and happens for every participant.
I never observed this behavior before. In the new complete run of the pipeline, I started using find_flat_channels_meg = True and find_noisy_channels_meg = True. Do these modify the empty room information in a later step, which triggers the re-run?
Happy about any insights on whether it is possible to avoid the re-run.