datalad / datalad-neuroimaging

DataLad extension for neuroimaging research
http://datalad.org

NF: decouple BIDS dataset from DataLad dataset with proper UX #126

Open jsheunis opened 1 year ago

jsheunis commented 1 year ago
  • as you mentioned, a DataLad dataset can contain multiple BIDS datasets -- ideally we would need to stick the nested datasets' metadata description somewhere too, and then traverse the files there. Corner case -- the top level of the DataLad dataset is not a BIDS dataset (e.g. consider YODA-style results but without a dedicated subdataset for rawdata/), and the BIDS dataset is nested within e.g. rawdata/

Exactly. Ideally an extractor would be able to figure out what to extract, and from where, automatically. But given the combination of (1) DataLad dataset nesting and (2) the flexibility BIDS allows in where the dataset directory is located, I think we cannot leave it up to the extractor to decide. This would need some extra user input via extraction parameters.
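To make the idea of extra user input concrete, here is a minimal sketch of how an extractor could resolve the BIDS root. It assumes a hypothetical extraction parameter named `bids_root` (a path relative to the DataLad dataset) and falls back to auto-detection only when exactly one candidate exists. The function names are illustrative and not part of any existing DataLad or MetaLad API; the only BIDS fact relied on is that a BIDS dataset has a `dataset_description.json` at its top level.

```python
from pathlib import Path
from typing import List, Optional

BIDS_MARKER = "dataset_description.json"


def find_bids_roots(dataset_path: Path) -> List[Path]:
    """Return every directory under dataset_path that contains the BIDS marker file."""
    roots = []
    for marker in sorted(dataset_path.rglob(BIDS_MARKER)):
        rel_parts = marker.relative_to(dataset_path).parts
        # Ignore anything that lives inside repository internals.
        if any(part in (".git", ".datalad") for part in rel_parts):
            continue
        roots.append(marker.parent)
    return roots


def resolve_bids_root(dataset_path: Path, bids_root: Optional[str] = None) -> Path:
    """Pick the BIDS root, preferring the (hypothetical) user-supplied parameter."""
    if bids_root is not None:
        candidate = dataset_path / bids_root
        if not (candidate / BIDS_MARKER).is_file():
            raise ValueError(f"{candidate} has no {BIDS_MARKER}")
        return candidate
    roots = find_bids_roots(dataset_path)
    if len(roots) == 1:
        return roots[0]
    # Zero or several candidates: refuse to guess and ask the user instead.
    raise ValueError(
        f"found {len(roots)} candidate BIDS roots under {dataset_path}; "
        "pass bids_root explicitly"
    )
```

With a layout like the corner case above, `resolve_bids_root(ds_path)` would return `ds_path / "rawdata"` when that is the only BIDS dataset, and would raise as soon as a second one (for example a BIDS-derivatives directory with its own `dataset_description.json`) appears, which is exactly when the user has to decide.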

  • and a BIDS dataset can be represented by multiple DataLad datasets (e.g. one per subject) -- dataset-level metadata is kinda easy, but then where do we stick per-file metadata: into the superdataset or into the individual ones (which cannot even extract it on their own, since they rely on the hierarchy)?

Good point, I didn't consider this before. It could be useful to look at the updated genericjson_file extractor and see whether the BIDS file-level extractor can inherit from it (see issue: https://github.com/datalad/datalad-neuroimaging/issues/120). That extractor lets the user specify a sidecar file pattern as the source of the metadata to be extracted. Combining this with an extractor command argument that instructs it to traverse into subdatasets could be a good direction to investigate.
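As a rough illustration of that direction, the sketch below combines a sidecar file pattern with an opt-in traversal into nested datasets. It is plain Python and does not reproduce the genericjson_file extractor's actual interface: the `pattern` placeholder syntax, the `recurse_into_subdatasets` flag, and all function names are assumptions made for this example. Nested datasets are detected with a simple heuristic (a subdirectory that has its own `.git` entry), and the BIDS inheritance principle is deliberately ignored, so only a sidecar sitting next to the file is considered.

```python
import json
from pathlib import Path
from typing import Dict, Iterator


def is_nested_dataset(directory: Path) -> bool:
    # Heuristic (assumption): an installed subdataset has its own .git entry.
    return (directory / ".git").exists()


def iter_files(root: Path, recurse_into_subdatasets: bool) -> Iterator[Path]:
    """Yield files under root, optionally descending into nested datasets."""
    for entry in sorted(root.iterdir()):
        if entry.name.startswith("."):
            continue  # skips .git, .datalad and other hidden entries
        if entry.is_dir():
            if is_nested_dataset(entry) and not recurse_into_subdatasets:
                continue
            yield from iter_files(entry, recurse_into_subdatasets)
        else:
            # Also covers annexed files whose content is not present locally.
            yield entry


def sidecar_for(data_file: Path, pattern: str = "{stem}.json") -> Path:
    # Strip all suffixes so "sub-01_T1w.nii.gz" maps to "sub-01_T1w.json".
    stem = data_file.name.split(".", 1)[0]
    return data_file.with_name(pattern.format(stem=stem))


def collect_file_metadata(
    root: Path,
    pattern: str = "{stem}.json",
    recurse_into_subdatasets: bool = False,
) -> Dict[str, dict]:
    """Map each data file (relative path) to the content of its sidecar, if any."""
    records: Dict[str, dict] = {}
    for path in iter_files(root, recurse_into_subdatasets):
        if path.suffix == ".json":
            continue  # sidecars are metadata sources, not extraction targets
        sidecar = sidecar_for(path, pattern)
        if sidecar.is_file():
            records[path.relative_to(root).as_posix()] = json.loads(sidecar.read_text())
    return records
```

Run from the superdataset with `recurse_into_subdatasets=True`, this keeps all per-file records keyed by paths relative to the superdataset, which is one possible answer to the question of where per-file metadata should live when each subject is its own DataLad dataset.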

Originally posted by @jsheunis in https://github.com/datalad/datalad-neuroimaging/issues/123#issuecomment-1490337763