Open alyssadai opened 10 months ago
We want to keep our issues up to date and active. This issue hasn't seen any activity in the last 75 days.
We have applied the _flag:stale label to indicate that this issue should be reviewed again.
When you review, please reread the spec and then apply one of these three options:
flag:schedule label to suggest moving this issue into the backlog now
someday label to show that this won't be prioritized. The stalebot will ignore issues with this label in the future. Use sparingly!
The missing datasets (in the open_neuro graph but not in the query tool results) are:
What's the problem?
Based on inspecting these datasets in https://github.com/OpenNeuroDatasets, the problem is that the imaging data available in these datasets cannot be modeled by the CLI and thus the subjects have no (imaging) session info in the resulting JSONLD. Since the query template used in the API assumes that all subjects have at least one session (see https://github.com/neurobagel/api/blob/18d5d95ecf8ae6c2ee4b56cbe7a279ef684a8498/app/api/utility.py#L185-L188), the above datasets are never matched by any query sent using the API/query tool.
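To illustrate the matching behaviour described above, here is a minimal sketch in plain Python. It is not the actual query template from the API. The dataset names and subject records are made up, and the two functions only mimic the difference between a pattern that requires a session for every subject and one that treats the session as optional.

```python
# Toy data: dataset names and subject records are hypothetical.
datasets = {
    "dataset_with_sessions": [
        {"label": "sub-01", "hasSession": ["ses-01"]},
    ],
    "dataset_without_sessions": [
        {"label": "sub-01", "hasSession": []},
    ],
}


def match_required(datasets):
    """Mimic a query pattern that requires every subject to have a session."""
    rows = []
    for name, subjects in datasets.items():
        for subject in subjects:
            # No sessions means the inner loop never runs, so no row is produced.
            for session in subject["hasSession"]:
                rows.append((name, subject["label"], session))
    return rows


def match_optional(datasets):
    """Mimic the same pattern with the session part made optional."""
    rows = []
    for name, subjects in datasets.items():
        for subject in subjects:
            # Fall back to a single "unbound" session so the subject still matches.
            sessions = subject["hasSession"] or [None]
            for session in sessions:
                rows.append((name, subject["label"], session))
    return rows


required_names = {row[0] for row in match_required(datasets)}
optional_names = {row[0] for row in match_optional(datasets)}
print(sorted(required_names))
print(sorted(optional_names))
```

With the required pattern, a dataset whose subjects all have an empty session list contributes no rows at all, which is consistent with these datasets never showing up in query results. Whether making the session part optional is the right fix for the real query is a separate design question.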
More details on the datasets
Here's a gist table https://gist.github.com/alyssadai/40c170e7f79117a276dc1586a1ebf344 with the missing dataset names, URLs, and specific observations on the BIDS data.
Of these, the only dataset where it is not immediately obvious why the CLI cannot model the BIDS data is ds003082. This one seems to have some problems with how its session directories are named, in addition to containing some imaging files we don't support yet.
More info
In the resulting JSONLD, the subjects in these datasets have "hasSession": []. This looks to be because the bagel bids command assigns an empty list to this attribute by default if there are no imaging sessions: https://github.com/neurobagel/bagel-cli/blob/4da00b6db4cce30d40f101c0c4e17be25db3828f/bagel/cli.py#L241-L244. However, the data model for an nb:Subject actually states that the hasSession value is optional (https://github.com/neurobagel/bagel-cli/blob/4da00b6db4cce30d40f101c0c4e17be25db3828f/bagel/models.py#L61), meaning that we probably don't even need to assign a value if there are no sessions in the first place.
Next steps
We should:
ds003082)
Originally posted by @alyssadai in https://github.com/neurobagel/planning/issues/54#issuecomment-1813750886