childmindresearch / bids2table

Efficiently index large-scale BIDS neuroimaging datasets and derivatives
https://childmindresearch.github.io/bids2table/
MIT License
13 stars, 5 forks

Make extracting sidecar metadata optional #33

Closed clane9 closed 3 months ago

clane9 commented 3 months ago

In datasets with large sidecar JSON metadata, extracting the metadata can take up more than 90% of the run time. Add an option to skip metadata extraction when it isn't needed, which gives a significant speedup. The 'meta__json' column remains in the table, but its values are null.
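To illustrate the idea, here is a minimal standalone sketch of an indexer with an opt-out switch for sidecar parsing. It is not bids2table's implementation: `index_files`, the `with_meta` flag, and the 'file_path' column are hypothetical names chosen for this example, and the sidecar lookup only checks for a JSON file with the same stem (it ignores the BIDS inheritance principle).

```python
import json
from pathlib import Path


def index_files(root, with_meta=True):
    """Return one row per .nii.gz file under root.

    `with_meta` is a hypothetical flag name. When it is False, no sidecar
    file is opened and 'meta__json' stays None, so the column is still
    present in every row.
    """
    rows = []
    for path in sorted(Path(root).rglob("*.nii.gz")):
        meta = None
        if with_meta:
            # Simplified lookup: same file name with .json instead of .nii.gz.
            sidecar = path.with_name(path.name.replace(".nii.gz", ".json"))
            if sidecar.exists():
                meta = json.loads(sidecar.read_text())
        rows.append({"file_path": str(path), "meta__json": meta})
    return rows
```

With `with_meta=False` the loop never touches the JSON files, which is where the speedup would come from when sidecars are large.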

effigies commented 3 months ago

Would it be possible to load the metadata afterwards? Consider the case where you need metadata, but only on a few files. You could filter the table first, and then load.

clane9 commented 3 months ago

Hey @effigies, thanks for the comment. That is a good point. It's pretty easy to do after the fact by applying extract_metadata to the file paths in the filtered table. I just added a helper method in BIDSTable that does this.
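The filter-first, load-later pattern discussed above can be sketched roughly as follows. This is a standalone illustration, not the BIDSTable helper or bids2table's extract_metadata: `load_sidecar`, `load_metadata_for`, and the 'file_path' key are hypothetical names, rows are plain dicts, and the sidecar lookup only checks for a same-stem JSON file next to the data file.

```python
import json
from pathlib import Path


def load_sidecar(file_path):
    """Read the JSON sidecar next to a data file, or return None if absent."""
    path = Path(file_path)
    stem = path.name.split(".", 1)[0]  # drops compound suffixes like .nii.gz
    sidecar = path.with_name(stem + ".json")
    return json.loads(sidecar.read_text()) if sidecar.exists() else None


def load_metadata_for(rows, keep):
    """Filter rows first, then fill 'meta__json' only for the rows kept.

    `rows` is a list of dicts with a 'file_path' key (an assumed layout);
    `keep` is a predicate applied to each row. Input rows are not modified.
    """
    selected = [dict(row) for row in rows if keep(row)]
    for row in selected:
        row["meta__json"] = load_sidecar(row["file_path"])
    return selected
```

For example, `load_metadata_for(rows, lambda r: "_bold" in r["file_path"])` would parse sidecars only for the BOLD files, so the cost scales with the size of the filtered subset rather than the whole dataset.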