Open simleo opened 2 years ago
We discussed ordering for multiple-value properties at yesterday's RO-Crate meeting. The relevant JSON-LD keyword is `@list`. Sometimes order matters, e.g., authors in a Workflow RO-Crate.
This is harder than it looks, since `Entity` uses the underlying JSON dictionary (`self._jsonld`) for storage (`__getitem__` / `__setitem__` perform conversions as needed when the value of a property is requested).
I.e., the JSON-LD is not properly flattened. Note that, while in the above example the API user can easily avoid generating the duplicate, in the general case it may be much trickier to even notice that one is being generated (e.g., subsequent calls to `Entity.append_to` in different sections of the code).

This should be dealt with in "real time", so that the crate stays flattened at all times and assertions like `len(crate.root_dataset["author"]) == 2` don't fail while one is still working on it. Since lookup by value in a list is O(n), extending a property with subsequent calls to `append_to` would become quadratic. We should therefore switch to sets for property values, which is also closer to their actual semantics, since they have no predefined order. Should we then add support for JSON-LD lists? Are they supported / do they make sense in Schema.org / RO-Crate?
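
To make the trade-off concrete, here is a minimal sketch of one possible approach, not the current ro-crate-py implementation. It assumes property values are references of the form `{"@id": ...}`; the `PropertyValues` class and its methods are hypothetical names made up for illustration. A `dict` keyed by `@id` gives constant-time duplicate checks on average (so repeated appends stay linear overall) and, since Python 3.7, also keeps insertion order, which might cover the "authors" case without a separate list type.

```python
class PropertyValues:
    """Hypothetical container: duplicate-free references keyed by @id."""

    def __init__(self, refs=()):
        # A dict gives O(1) average membership tests and keeps
        # insertion order (guaranteed since Python 3.7).
        self._by_id = {}
        for ref in refs:
            self.add(ref)

    def add(self, ref):
        """Add a reference like {"@id": "#alice"}.

        Returns True if it was added, False if it was already present.
        """
        key = ref["@id"]
        if key in self._by_id:
            return False
        self._by_id[key] = ref
        return True

    def __len__(self):
        return len(self._by_id)

    def to_jsonld(self):
        """Return the values as a plain JSON-serializable list."""
        return list(self._by_id.values())


authors = PropertyValues()
authors.add({"@id": "#alice"})
authors.add({"@id": "#bob"})
authors.add({"@id": "#alice"})  # duplicate reference, ignored
assert len(authors) == 2
```

Whether insertion order should be treated as meaningful on output, or only emitted explicitly through a JSON-LD `@list` wrapper, would still need to be decided.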