neurobagel / bagel-cli

Command line tool for Neurobagel data parsing and annotation
https://neurobagel.org/cli/
MIT License

unhandled error: when participant_id column has missing values, bagel pheno crashes #91

Closed surchs closed 1 year ago

surchs commented 1 year ago

If the input pheno.tsv file has missing values / empty strings / nan in one of the participant_id columns, things break in unexpected and unhandled ways.

Something like this:

participant_id  other_val  some_val
sub-01          3          10
                4          11
sub-03          5          6

This shouldn't happen if we're handling a regular BIDS participants.tsv file, but it probably does happen in practice. And it will certainly happen once we handle multiple participant_id columns, not all of which will have values.

We have to decide what to do about this, because at the moment we are iterating over the unique elements in the participant_id column, and nan would be one of these.
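To illustrate the point about unique elements, here is a minimal pandas sketch. It is not the bagel code itself: the inline TSV is a hypothetical stand-in for a pheno.tsv with an empty participant_id, and it assumes the file is loaded with the default `pd.read_csv` settings, which parse empty fields as NaN.

```python
import io

import pandas as pd

# Hypothetical pheno.tsv content; the second data row has an empty participant_id.
PHENO_TSV = (
    "participant_id\tother_val\tsome_val\n"
    "sub-01\t3\t10\n"
    "\t4\t11\n"
    "sub-03\t5\t6\n"
)

# With default settings, read_csv turns the empty field into NaN.
pheno = pd.read_csv(io.StringIO(PHENO_TSV), sep="\t")

# unique() keeps the missing value, so a loop over it would visit NaN
# as if it were a participant.
unique_ids = pheno["participant_id"].unique()
missing_ids = [pid for pid in unique_ids if pd.isna(pid)]

print(f"{len(unique_ids)} unique values, {len(missing_ids)} of them missing")
```

Any loop over `unique_ids` therefore has to either skip the missing entry or fail early, which is the decision this issue is about.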

My view is that we should have some warnings for missing values in any column as part of a phenotypic validation step (see also #90), and then we should raise some exception if the participant column has a missing value / nan.
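A rough sketch of what such a check could look like, assuming the phenotypic table is already loaded into a pandas DataFrame. The function name `check_missing_values` and its `id_columns` parameter are hypothetical and not part of bagel-cli; the check only catches values that pandas already treats as missing (NaN), which covers empty fields read with the default `pd.read_csv` settings.

```python
import io
import warnings

import pandas as pd


def check_missing_values(pheno: pd.DataFrame, id_columns=("participant_id",)) -> None:
    """Warn about missing values in regular columns, raise for ID columns.

    Hypothetical helper for illustration only.
    """
    # Step 1: warn about missing values in every non-ID column.
    other_columns = [col for col in pheno.columns if col not in id_columns]
    n_missing = pheno[other_columns].isna().sum()
    for column, count in n_missing[n_missing > 0].items():
        warnings.warn(f"Column '{column}' has {count} missing value(s).")

    # Step 2: refuse to continue if an ID column has any missing value.
    for column in id_columns:
        missing_rows = pheno.index[pheno[column].isna()].tolist()
        if missing_rows:
            raise ValueError(
                f"Column '{column}' has missing values in rows: {missing_rows}"
            )


# Usage with a hypothetical table whose second row lacks a participant_id.
example = pd.read_csv(
    io.StringIO("participant_id\tother_val\nsub-01\t3\n\t4\n"), sep="\t"
)
try:
    check_missing_values(example)
except ValueError as err:
    print(err)
```

Passing further column names in `id_columns` would apply the same strict check to other ID-like columns.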

While we're at it, we might as well just do this for session ids as well. So tasks are:
