CPernet opened 2 months ago
Not exactly, no. In the past I have used the bids-dataset extractor from datalad-neuroimaging, together with datalad-metalad, to extract dataset-level information from a BIDS-compliant dataset.
For file-level metadata, I have used datalad-metalad with the metalad_core extractor at the file level to get an array of file-specific metadata objects, together with the meta-conduct command to run that extraction over all files in a DataLad dataset.
Also, there would be essentially no difference between getting file-level metadata for a BIDS dataset vs any other collection of files.
More recently, we have developed a set of iterators in datalad-next, as well as custom scripts that serve as helpers in scenarios like yours. Here's an example script: https://github.com/abcd-j/data-catalog/blob/filelist/code/create_tabby_filelist.py
I suspect that script can do what you want with just a few changes, since you need the following output format per file:
```json
{
  "type": "file",
  "dataset_id": "1234",
  "dataset_version": "abcd",
  "path": "file/path/relative/to/dataset/root",
  "url": "download-url-of-file-if-available",
  "contentbytesize": "size-of-file-if-available",
  "metadata_sources": {
    "sources": [
      {
        "source_name": "custom-source-name",
        "source_version": "custom-source-version"
      }
    ]
  }
}
```
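A minimal sketch of such a mapping, in case it helps: this walks a directory tree with a plain `os.walk` rather than any datalad command, and emits one JSON object per file in the schema above. The `dataset_id`, `dataset_version`, and source name/version values are placeholders you would supply yourself, and the optional `url` field is omitted since a plain directory walk has no download URL available.

```python
import json
import os

def file_records(root, dataset_id, dataset_version):
    """Yield one schema-conformant dict per file under `root`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            yield {
                "type": "file",
                "dataset_id": dataset_id,
                "dataset_version": dataset_version,
                # path must be relative to the dataset root, POSIX-style
                "path": os.path.relpath(full, root).replace(os.sep, "/"),
                "contentbytesize": os.path.getsize(full),
                "metadata_sources": {
                    "sources": [
                        {
                            # placeholder source identification
                            "source_name": "custom-walk-script",
                            "source_version": "0.1",
                        }
                    ]
                },
            }

if __name__ == "__main__":
    # Write one JSON object per line (JSON Lines), ready for catalog import.
    for rec in file_records("bids_root", "1234", "abcd"):
        print(json.dumps(rec))
```

The linked create_tabby_filelist.py does essentially this, plus annex/URL handling, so adapting it is probably less work than starting from scratch.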
nice! I'll get onto that next, then - mapping all the files from the phantom dataset 🙏 every command runs smoothly and returns good error messages .. it's just me being slow to catch up
@jsheunis is there a magic datalad tool to map files from a BIDS dir to the JSON file schema, allowing one to 'automatically' (or almost) add all the JSON lines to the dataset-level one?