neurobagel / bagel-cli

Command line tool for Neurobagel data parsing and annotation
https://neurobagel.org/cli/
MIT License

[ENH] Update `example_synthetic.json` to be a comprehensive reference example #144

Closed alyssadai closed 1 year ago

alyssadai commented 1 year ago

Steps to implement

alyssadai commented 1 year ago

@surchs, since the example_synthetic.tsv also needs to be updated alongside the data dict to have comprehensive examples of missing values (for age, sex, and diagnosis), and this tsv technically corresponds to the bids-examples synthetic dataset (https://github.com/bids-standard/bids-examples/tree/d8455af1def5e7401f212c0c7c98524776035005/synthetic), do you agree that it's fine if we deviate from the age/sex values in the original participants.tsv?

The bagel bids command currently doesn't check for alignment with the participants.tsv in a given BIDS dataset at all, I don't think (so the change shouldn't break anything), but just thought I'd double check!

surchs commented 1 year ago

> do you agree that it's fine if we deviate from the age/sex values in the original participants.tsv

Yes! And we have already done so. Our participants.tsv for the synthetic dataset already has different values (e.g. for age) and additional columns.

> The bagel bids command currently doesn't check for alignment with the participants.tsv in a given BIDS dataset at all

That's on purpose, at least to me. For example, in mr_proc all the relevant phenotypic information will be in the bagel.tsv file, which may contain many more subjects than the corresponding BIDS participants.tsv file. That's also why the user needs to tell us which file to parse; we won't just assume it's the participants.tsv file.

Thanks for checking! Important questions.
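To make the point about mismatched subject lists concrete, here is a minimal sketch of how one could compare the subject IDs in a phenotypic TSV against those in a BIDS participants.tsv. It is not part of bagel-cli, which (as noted above) deliberately does no such check. The file contents, the `subject_ids` helper, and the `participant_id` column name used for the phenotypic file are all assumptions made up for illustration.

```python
import csv
import io

# Hypothetical file contents, invented for illustration only.
# The phenotypic TSV has one more subject than the BIDS participants.tsv,
# and some empty cells standing in for missing age/sex values.
PHENOTYPIC_TSV = (
    "participant_id\tage\tsex\n"
    "sub-01\t34\tF\n"
    "sub-02\t\tM\n"
    "sub-03\t29\t\n"
)
BIDS_PARTICIPANTS_TSV = (
    "participant_id\n"
    "sub-01\n"
    "sub-02\n"
)


def subject_ids(tsv_text, id_column="participant_id"):
    """Return the set of subject IDs found in a TSV string."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row[id_column] for row in reader}


pheno_ids = subject_ids(PHENOTYPIC_TSV)
bids_ids = subject_ids(BIDS_PARTICIPANTS_TSV)

# Subjects present in one file but not the other.
only_in_pheno = sorted(pheno_ids - bids_ids)
only_in_bids = sorted(bids_ids - pheno_ids)

print(only_in_pheno)  # ['sub-03']
print(only_in_bids)   # []
```

In this made-up case the phenotypic file is a strict superset of the BIDS subject list, which is the situation described for mr_proc.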

Remi-Gau commented 1 year ago

> That's on purpose, at least to me. For example, in mr_proc all the relevant phenotypic information will be in the bagel.tsv file that may contain many more subjects than the corresponding BIDS-participants.tsv file. That's also why the user needs to tell us which file to parse, we won't just assume it's the participants.tsv file.

Note that you may wanna capture that somewhere in the user-facing docs.

(Not the fact about mr_proc, but the fact that the CLI tool is more flexible about its input than "vanilla" BIDS would imply.)