Closed drammock closed 1 month ago
I've also run into this issue... I remember that BIDS was considering this as problematic datasets but it's such a prevalent usecase...
Message ID: @.***>
@sappelhoff Do you know if and how BIDS supports data with missing sessions?
Reading the specs: https://bids-specification.readthedocs.io/en/stable/longitudinal-and-multi-site-studies.html https://bids-specification.readthedocs.io/en/stable/modality-agnostic-files.html#sessions-file
I don't see anything that would disallow missing sessions. But this use case isn't explicitly mentioned anywhere, either…
To the best of my knowledge, it is not disallowed. I also remember preparing a mixed design experiment once, where half of participants had a session that the other half did not have, and that led to a warning (to check whether this was intended), but not an error.
5. But we could do that check, but then store in a private config location the subjectXsession combos that are actually present, and use that info to get subjects/sessions to loop over, throughout the codebase.
Yep that would be my proposal. However we also have runs within sessions and then stuff gets really complicated, at least I remember seeing some rather convoluted code handling runs within subjects and sessions … maybe we should re-think all of this and create a clean, clear lookup-table early on in the pipeline run and then just refer to this downstream.
+1
it's great if it's just a warning. I have definitely seen this warning in the past.
Message ID: @.***>
To the best of my knowledge, it is not disallowed. I also remember preparing a mixed design experiment once, where half of participants had a session that the other half did not have, and that led to a warning (to check whether this was intended), but not an error.
was that warning from the BIDS validator, or from MNE-BIDS-Pipeline though? If the standard allows it, IMO the pipeline shouldn't error out.
It was a warning from the BIDS validator.
If the standard allows it, IMO the pipeline shouldn't error out.
agreed
maybe we should re-think all of this and create a clean, clear lookup-table early on in the pipeline run and then just refer to this downstream.
Sure this would work. Or (equivalently) something like get_subjects_sessions
that returns list[tuple[str, str]]
of the subject-session pairs on the fly just like our get_runs_tasks
that returns a list[tuple[str, str]]
of the run-task pairs. This would be a bit of an intermediate solution but I think it would work?
And I think it would be good to have a config.allow_missing_sessions = False
(by default) that you would have to set to True
to allow it to just skip missing sessions when not present. That way people won't get any unpleasant surprises thinking they're analyzing a complete dataset when really they're missing some stuff.
another "I have a longitudinal study" question here. I have some subjects with missing timepoints (AKA missing sessions). I want the pipeline to look at the config file (which says
subjects = "all"
andsessions = ["a", "b", "c"]
) and then look at the bids data tree and process all the subjectXsession combinations that are present, without crashing because subject 101 is missing session "b". In practice, such errors take the form of:which is confusing; it errors when looking for a split partial file, when the real problem is that there isn't even a
.../sub-101/ses-b/
directory.Proposed solutions
just make the error message better, and do something complicated in the config where I figure out which subject-session combos I have, and run the pipeline multiple times (e.g. once for each session, with different lists of subjects each time)
blithely add an
allow_missing=True
here, and hope that it doesn't have catastrophic downstream effects:https://github.com/mne-tools/mne-bids-pipeline/blob/61f29616f5e3c2e246365f9568ae2c168616617c/mne_bids_pipeline/steps/init/_02_find_empty_room.py#L45
it seems like there are a lot of places where we do
for subject in subjects
andfor session in sessions
wheresubjects
andsessions
are taken straight from the config file each time. So it won't work to do a simplePath(...).exists()
type of check to see if the entire session folder is missing, right about here:https://github.com/mne-tools/mne-bids-pipeline/blob/61f29616f5e3c2e246365f9568ae2c168616617c/mne_bids_pipeline/steps/init/_01_init_derivatives_dir.py#L79-L81
But we could do that check, but then store in a private config location the subjectXsession combos that are actually present, and use that info to get subjects/sessions to loop over, throughout the codebase.
Open to other ideas!