An alternative approach could be to use `functools.singledispatch`:
```python
import functools
from hashlib import sha256

@functools.singledispatch
def hash_obj(obj: object) -> str:
    # Works for generic objects with __dict__
    dict_rep = ":".join(f"{key}:{hash_obj(val)}" for key, val in obj.__dict__.items())
    return sha256(f"{obj.__class__}:{dict_rep}".encode()).hexdigest()
```
This defines a cryptographic hash for a generic object that applies recursively. We would need handlers for some bottom types that don't have `__dict__`:
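A minimal sketch of what those base-case registrations might look like (the specific set of bottom types and the encoding choices here are placeholders, not a settled design):

```python
import functools
from hashlib import sha256

@functools.singledispatch
def hash_obj(obj: object) -> str:
    # Generic fallback: recurse over the instance __dict__
    dict_rep = ":".join(f"{key}:{hash_obj(val)}" for key, val in obj.__dict__.items())
    return sha256(f"{obj.__class__}:{dict_rep}".encode()).hexdigest()

# Bottom types without __dict__ are hashed directly from a byte representation
@hash_obj.register
def _(obj: bytes) -> str:
    return sha256(obj).hexdigest()

@hash_obj.register
def _(obj: str) -> str:
    return hash_obj(obj.encode())

@hash_obj.register
def _(obj: int) -> str:
    return hash_obj(str(obj).encode())
```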
Each type would then be able to declare how much information is needed to uniquely identify an instance. We could register `set()` and `frozenset()` explicitly to ensure these known-problematic builtin types (whose iteration order is not deterministic) hash consistently, and then provide a means for a downstream tool to register its own types with our hasher, such as:
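One way this could look (a sketch, assuming we make sets order-independent by sorting element hashes; `MyResource` is a hypothetical downstream type, not part of pydra):

```python
import functools
from hashlib import sha256

@functools.singledispatch
def hash_obj(obj: object) -> str:
    # Generic fallback: recurse over the instance __dict__
    dict_rep = ":".join(f"{key}:{hash_obj(val)}" for key, val in obj.__dict__.items())
    return sha256(f"{obj.__class__}:{dict_rep}".encode()).hexdigest()

@hash_obj.register
def _(obj: str) -> str:
    return sha256(obj.encode()).hexdigest()

# set/frozenset: sort the element hashes so iteration order doesn't matter
@hash_obj.register(set)
@hash_obj.register(frozenset)
def _(obj) -> str:
    elem_rep = ":".join(sorted(hash_obj(e) for e in obj))
    return sha256(f"{obj.__class__}:{elem_rep}".encode()).hexdigest()

# A downstream tool could register its own type the same way
class MyResource:
    def __init__(self, uri: str):
        self.uri = uri

@hash_obj.register
def _(obj: MyResource) -> str:
    # Declare that the URI alone uniquely identifies this resource
    return hash_obj(obj.uri)
```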
Right now we have hashing split up in a few places:
https://github.com/nipype/pydra/blob/b5fe4c0eb7f937e70db15bc087d86fe90f401ff3/pydra/engine/helpers.py#L677-L708
https://github.com/nipype/pydra/blob/b5fe4c0eb7f937e70db15bc087d86fe90f401ff3/pydra/engine/helpers.py#L672-L674
https://github.com/nipype/pydra/blob/b5fe4c0eb7f937e70db15bc087d86fe90f401ff3/pydra/engine/helpers_file.py#L70-L168