Open effigies opened 1 year ago
Is the .get()
method being eliminated? Not sure if this is an oversight, but if so, it seems a shame, as it's by far the method I most use and get_files
adds an extra 6 characters of typing.
Dropping get()
was intentional because it currently is an omnibus interface that retrieves files objects, file paths, entity keys, entity values, metadata keys, metadata values, subject dirs and session dirs. It seems much cleaner to separate out the current uses that we want to keep into separate functions. We could keep one of them as get()
, and I think it would have to be what we currently have as get_files()
, as that was the default return type for the current .get()
. Anything else would likely lead to silent failures.
Question for the group: Do we want to architect this as abstract base classes, to facilitate reimplementation, or keep everything concrete for now?
From a typing perspective, the most flexible (and Pythonic?) would probably be a Protocol
, which would allow you to sub out the implementation without requiring the implementation to explicitly be a subclass of the pybids BIDSLayout
. So my preference would be Protocol
> ABC > Concrete implementation.
Okay. I'll try coding it up as a collection of protocols. I've only used them as a hack before, I'm curious to try this.
Realizing I should have tagged @erdalkaraca and @ghisvail on this. I suspect you will have opinions.
Often I am wanting to query and compare at the conceptual level of images instead of having to deal with particular files and their extensions.
To this end, something I would find useful is a BIDSImage
class which abstracts away file-level information and lets me deal with all the information pertaining to a particular image - specifically, combining .nii
images with their .json
sidecar and .tsv
timing files, although I could see cases where it would be nice to join more information. For example, joining .nii
with its corresponding confounds in the fmriprep derivative dataset.
Rough example of what I'm thinking:
class BIDSImage():
# least common denominator of source entities
entities: dict[Entity, Value | Index]
# suffix of the root image
suffix: str | None
def __init__(root_image: BIDSFile, related_files: BIDSFile): pass
# Return path to the root image file
def get_root_file() -> Path: pass
# Return list of all files this class was derived from
def get_files() -> list[Path]: pass
# Shared metadata from all files aggregated
def metadata(self) -> dict[str, Any]: pass
Thanks for the feedback!
I'm not sure that we want this additional abstraction in pybids; more new classes is more cognitive overhead for downstream users.
But making sure you have the tools to quickly build it yourself if it is useful to you is in scope. entities
, suffix
and get_root_file()
come for free from BIDSFile
, and possibly metadata, though I'm not sure what "shared metadata from all files aggregated" means. As to get_files()
, this seems like the list of inputs passed to __init__
, so if we make sure you can do that, we should be good.
Context: Over the last year or so, we've looked into replacing the SQLAlchemy-based
BIDSLayout
with one based on ANCP-BIDS. In the process, we've found issues in the current API where we are forced to decide whether it's worth shimming the old behavior in to avoid breaking things for users (if there are any).Ultimately, we've decided that the API is too large and has features that seem to have more maintenance burden than value to users. So we want to propose a new API.
We would like this API to be:
1) Small. Understanding it should take ~10 minutes. 2) Reimplementable. If you have a better implementation, it should be cheap for people to switch to yours. 3) Schema-based. The BIDS standard now has a schema; it should be possible to swap in an updated schema to support unmerged BEPs.
PyBIDS 1.0 layout API
Base types (see PaddedInt):
Data structures:
It can be useful to group multiple datasets into a single logical layout, for example, when querying raw data and derivatives together. Therefore we also propose an explicit collection of layouts, such that each method/property is implemented by aggregating over all of the collected layouts.
layout.utils
A module will be provided to provide utilities that only rely on the
BIDSLayout
API. The goal is to provide implementations of common tasks that will survive refactors and serve as exemplar code for working with the API.Some (potentially layout-free) parsing/creation seems useful:
Please let us know if there are things you need to do that cannot be done with this API. We recognize that this will break things, but if there are features you use that require more than 10-20 lines of code to reimplement without access to
BIDSLayout
internals, please let us know.