Open BSVogler opened 2 years ago
The current implementation of __eq__
just checks resources by their reference e.g. Patient/id1
. __hash__
should work consistently with __eq__
, otherwise if might lead to mistakes.
If you iterate on versions then it can be that Patient/id1/_history/1 != Patient/id1/_history/2
. So I think the eq need a little bit more complexity by taking the history into account.
How about using __hash__()
inside the __eq__
implementation?
eq implementation discussion here as well https://github.com/beda-software/fhir-py/issues/51
I agree, it makes sense.
The problem is, should we use versionId
for a mutable resource __hash__
updating it would violate hash invariant (as the versionId
would change and so would the hash but the containing data structure wouldn't be aware of the change and that might result in all sorts of unexpected behaviour).
@BSVogler could you please give an example of a use case where having unversioned __hash__
might be an issue?
It seems logical to add versionId
to __eq__
, but I have a concern because not all FHIR servers implement versionId
. Sometimes they don't return it at all, sometimes it 1
every time.
So for example the same code that worked with a server which not returns versionId
can break after the server upgrade. What do you think @BSVogler ? We can discuss implementing some helper methods in resources to explicitly use it for comparing them.
Regarding __hash__
, @m0rl raised a valid concern, from documentation:
If a class defines mutable objects and implements an __eq__() method, it should not implement __hash__(), since the implementation of hashable collections requires that a key’s hash value is immutable (if the object’s hash value changes, it will be in the wrong hash bucket).
Let's implement __hash__
as resourceType + id
. It will be immutable and consistent.
I don't see any sense to have two resources with the same id but different versionId
in the set.
@BSVogler
The library architecture allows to redefine the resource class
https://github.com/beda-software/fhir-py/blob/a14a36f93b0462fc046c460a3a617a93daef0487/fhirpy/lib.py#L94
https://github.com/beda-software/fhir-py/blob/a14a36f93b0462fc046c460a3a617a93daef0487/fhirpy/lib.py#L79
So you will be able to implement desired behavior when resourceType, id, and versionId are used for equality.
We will add an example to the docs.
Overwriting the class would be good although I would prefer to have it easier to configure. The _history is part of the FHIR standard. That begs the question whether fhirpy is intended to work against a full capabilities server or a limited implementation by default?
When the server supports history just using resourceType + id will cause a lot of confusion. You can update a resource and it can be completely different and then using the equality operator will say that they are equal. (that is also what ruscoder said). The hash implementation could also use a real hash of the json. That would be even more strict.
In order to use resources in a set they should implement the
__hash__()
method. A resource is unique by the id + history version id, so putting this together would be the fastest way.