nipy / nibabel

Python package to access a cacophony of neuro-imaging file formats
http://nipy.org/nibabel/
Other
649 stars 258 forks source link

NIfTI Extension redesign #1335

Closed effigies closed 1 week ago

effigies commented 3 months ago

Right now the definition of Nifti1Extension is (simplified):

class Nifti1Extension:
    def __init__(self, code, content):
        self.code = code
        self._content = self._unmangle(content)

    def get_content():
        return self._content

This means that the return type of get_content() depends on the subclass, violating the Liskov substitution principle and generally making it difficult to reason about the value of extension.get_content() without either inspecting the type of extension or extension.get_content().

In #1327 I've found that there are several places in our tests where we've assumed that get_content() returns bytes, so it is likely that downstream users will also be broken. This seems like an opportunity to redesign.

Instead we could consider following the lead of requests.Response/httpx.Response and provide a .content property that is always bytes, a .text property that is always str (raising an error if decoding fails) and a .json() method that decodes the JSON object, which can fail if the contents are not valid JSON.

class Nifti1Extension:
    @property
    def content(self) -> bytes:
        return self._content

    @property
    def text(self) -> str:
        return self._content.decode(getattr(self, 'encoding', 'utf-8'))

    def json(self, **kwargs) -> Any:
        return json.loads(self._content.decode(), **kwargs)

This could sit alongside the old, variable-type get_content().

I haven't fully thought through what we want to do for extensions that need to expose complex data types, like pydicom Datasets or Cifti2Headers. I'll experiment a little with this with typing to make sure that what I do fits in with typed Python. Probably something like:

class NiftiExtension(ty.Generic[T]):
    _object: T

    def _unmangle(self, contents: bytes) -> T:
        ...

    def _mangle(self, obj: T) -> bytes:
        ...

    def get_content(self) -> T:
        return self._unmangle(self._content)

    # Above stuff...

Interested in feedback. Will open a PR with some initial work once I have something that typechecks and tests.