neurobagel / bagel-cli

Command line tool for Neurobagel data parsing and annotation
https://neurobagel.org/cli/
MIT License

Support multiple participant ID columns in tabular data #286

Open alyssadai opened 6 months ago

alyssadai commented 6 months ago

We keep encountering TSVs whose participant ID column lacks the sub- prefix required for BIDS subject directories. This makes sense: for most datasets, the phenotypic/tabular data most likely predates the BIDS conversion step.

Right now, the only way for this kind of tabular data to work with the Neurobagel CLI is for the data owner to add another column containing the BIDS participant IDs, or to modify their existing participant ID column. This is not ideal, and it creates extra work specifically for data owners who have imaging data in addition to phenotypic data.

We should consider having the CLI handle this work by automatically prepending sub- to any value in a given participant ID column that doesn't already have the prefix, before doing the imaging-tabular subject matching check.
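
To make the proposal concrete, here is a minimal sketch in plain Python, not the CLI's actual implementation. The function name `ensure_bids_prefix` and the sample IDs are hypothetical, and the direction of the matching check (every BIDS subject must appear in the tabular file) is an assumption made for illustration.

```python
def ensure_bids_prefix(participant_ids, prefix="sub-"):
    """Return the IDs with `prefix` prepended wherever it is missing.

    Hypothetical helper: IDs that already start with the prefix are
    left unchanged, so mixed columns are handled as well.
    """
    normalized = []
    for pid in participant_ids:
        pid = str(pid).strip()
        normalized.append(pid if pid.startswith(prefix) else prefix + pid)
    return normalized


# Made-up example data: a tabular ID column and BIDS subject directory names.
tabular_ids = ["01", "sub-02", "03"]
bids_ids = {"sub-01", "sub-02", "sub-03"}

normalized_ids = ensure_bids_prefix(tabular_ids)

# Assumed form of the matching check: BIDS subjects absent from the tabular file.
unmatched = bids_ids - set(normalized_ids)
```

With this normalization applied first, `unmatched` is empty for the sample data, whereas comparing the raw column against the BIDS IDs would flag `sub-01` and `sub-03`.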

alyssadai commented 6 months ago

We need to think about the scenario where some subjects in the tabular file have BIDS data, but not all.
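
As a rough illustration of that scenario (a sketch only, with a hypothetical function name and made-up IDs), the tabular IDs could be split into those that match a BIDS subject after prefixing and those that have phenotypic data only:

```python
def split_by_bids_availability(tabular_ids, bids_ids, prefix="sub-"):
    """Split tabular IDs by whether a matching BIDS subject exists.

    Hypothetical helper: an ID matches if it equals a BIDS subject ID
    once the prefix has been prepended (when missing). The original,
    unmodified IDs are returned in both lists.
    """
    bids_ids = set(bids_ids)
    with_imaging, phenotypic_only = [], []
    for pid in tabular_ids:
        pid = str(pid).strip()
        candidate = pid if pid.startswith(prefix) else prefix + pid
        if candidate in bids_ids:
            with_imaging.append(pid)
        else:
            phenotypic_only.append(pid)
    return with_imaging, phenotypic_only


# Made-up example: subject 03 has tabular data but no BIDS directory.
with_imaging, phenotypic_only = split_by_bids_availability(
    ["01", "sub-02", "03"], {"sub-01", "sub-02"}
)
```

Whether subjects in `phenotypic_only` should be accepted silently, reported, or rejected is the open design question here.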

github-actions[bot] commented 3 months ago

We want to keep our issues up to date and active. This issue hasn't seen any activity in the last 75 days. We have applied the _flag:stale label to indicate that this issue should be reviewed again. When you review, please reread the spec and then apply one of these three options: