Open Panaetius opened 3 years ago
I am having a similar issue when deserialising a nested object that only has an @id and no associated @type.
With data like this:
data = {
    '@id': 'http://example.com/id/abcd1234',
    '@type': ['http://example.com/ont#Document'],
    'http://example.com/ont#created_at': [
        {'@type': 'http://www.w3.org/2001/XMLSchema#dateTime',
         '@value': '2021-01-20T10:01:34.98892000'}],
    'http://example.com/ont#created_by': [
        {'@id': 'http://example.com/id/user%40example.com'}],
}
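The Document schema class:
from calamus import fields
from calamus.schema import JsonLDSchema

# EX, Document, UserSchema and generate_uuid_iri are defined elsewhere.
class DocumentSchema(JsonLDSchema):
    class Meta:
        rdf_type = EX.Document
        model = Document

    _id = fields.Id(default=generate_uuid_iri)
    created_at = fields.DateTime(EX.created_at)
    created_by = fields.Nested(
        EX.created_by,
        UserSchema,
        many=False,
    )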
Then loading the example data:
d = DocumentSchema().load(data)
Gives this exception:
  File "/Users/marty/smf/hivemind-justpy/venv/lib/python3.9/site-packages/calamus/fields.py", line 563, in _load
    valid_data.append(self.load_single_entry(val, partial))
  File "/Users/marty/smf/hivemind-justpy/venv/lib/python3.9/site-packages/calamus/fields.py", line 540, in load_single_entry
    type_ = normalize_type(value["@type"])
KeyError: '@type'
The http://example.com/ont#created_by entity is of type http://example.com/ont#User, but I do not receive that in the JSON-LD serialisation.
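The traceback boils down to indexing "@type" on a node that carries only an "@id". As a minimal sketch in plain Python (not calamus code; `is_id_reference` is a hypothetical helper name), such reference-only nodes could be detected before the type lookup:

```python
def is_id_reference(node):
    """Return True if `node` is a JSON-LD node object carrying only an @id."""
    return isinstance(node, dict) and set(node) == {"@id"}


created_by = {"@id": "http://example.com/id/user%40example.com"}
created_at = {
    "@type": "http://www.w3.org/2001/XMLSchema#dateTime",
    "@value": "2021-01-20T10:01:34.98892000",
}

for node in (created_by, created_at):
    if is_id_reference(node):
        # Indexing node["@type"] here is what raises KeyError: '@type'.
        print("reference only:", node["@id"])
    else:
        print("typed node:", node["@type"])
```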
You might want to use the IRIField (https://github.com/SwissDataScienceCenter/calamus/blob/master/calamus/fields.py#L193) for this.
This issue is more about dereferencing an @id reference. For example, if you have a Nested reference in the Schema and the metadata looks like:
[
    {
        "@id": "http://example.com/mainobject",
        "@type": "http://schema.org/something",
        "nested_object": {"@id": "http://example.com/subobject"}
    },
    {
        "@id": "http://example.com/subobject",
        "@type": "http://schema.org/something_else"
    }
]
Then it should include subobject inside mainobject when deserializing, by dereferencing the @id, but only if subobject is also somewhere in the metadata.
This already works for flattened metadata (where no objects are nested and you just have a flat list), in which case you'd have to do DocumentSchema(flattened=True).load(data).
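To illustrate the intended behaviour for non-flattened metadata, here is a rough sketch in plain Python. It is not how calamus implements it; `inline_references` is a hypothetical helper that replaces reference-only values with the full node when that node appears in the same list, and leaves them alone otherwise:

```python
def inline_references(nodes):
    """Replace {"@id": ...}-only values with the full node from `nodes`, if present."""
    index = {node["@id"]: node for node in nodes if "@id" in node}

    def expand(value, seen):
        if isinstance(value, list):
            return [expand(item, seen) for item in value]
        if isinstance(value, dict):
            if set(value) == {"@id"}:
                target = index.get(value["@id"])
                # Keep the bare reference if the target is not in the
                # metadata, or if expanding it would create a cycle.
                if target is None or value["@id"] in seen:
                    return value
                return expand(target, seen | {value["@id"]})
            return {key: expand(item, seen) for key, item in value.items()}
        return value

    return [expand(node, frozenset({node.get("@id")})) for node in nodes]


metadata = [
    {
        "@id": "http://example.com/mainobject",
        "@type": "http://schema.org/something",
        "nested_object": {"@id": "http://example.com/subobject"},
    },
    {
        "@id": "http://example.com/subobject",
        "@type": "http://schema.org/something_else",
    },
]

main = inline_references(metadata)[0]
print(main["nested_object"]["@type"])
```

With the nested object inlined like this, the "@type" needed to pick the child schema is available again.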
If we had some way of resolving @ids, we could also use that to dereference something like http://example.com/id/user%40example.com, e.g. if an HTTP GET request to the @id returned JSON-LD for that object, or maybe if a user's Schema could provide a custom function to dereference it. As it is in your example, we really wouldn't have a way to create a user object from a UserSchema just by its @id, unless it's also in data, since we wouldn't know where to get the data for the fields from.
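One possible shape for such a resolver, sketched in plain Python. Nothing here exists in calamus: `make_resolver` and the `fetch` hook are hypothetical names. The resolver looks in the loaded data first and only then falls back to a user-supplied function, which could wrap an HTTP GET or any other lookup:

```python
def make_resolver(data, fetch=None):
    """Build a function mapping an @id to a node, or None if it cannot be found.

    `data` is the list of nodes being loaded. `fetch` is an optional,
    user-supplied callable (hypothetical hook) that takes an IRI and
    returns a node dict or None.
    """
    local = {node["@id"]: node for node in data if "@id" in node}

    def resolve(iri):
        if iri in local:
            return local[iri]
        if fetch is not None:
            return fetch(iri)
        return None

    return resolve


# Stand-in for a remote lookup; a real hook might issue an HTTP GET instead.
remote_store = {
    "http://example.com/id/user%40example.com": {
        "@id": "http://example.com/id/user%40example.com",
        "@type": "http://example.com/ont#User",
    }
}

data = [{"@id": "http://example.com/id/abcd1234",
         "@type": ["http://example.com/ont#Document"]}]

resolve = make_resolver(data, fetch=remote_store.get)
user = resolve("http://example.com/id/user%40example.com")
print(user["@type"])
```

Without a `fetch` hook, the same call returns None, which matches the situation in the example above: the user's fields are not available from the @id alone.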
Currently when serializing with fields.Nested, we always serialize the whole object and add @type information. In some cases, we want to serialize an @id reference instead of the whole nested object.
Theoretically this would be possible with something like fields.Nested(schema.address, AddressSchema, only=("_id",)), but that will result in something like {"@id": "...", "@type": "..."}, whereas for an id reference we wouldn't want to have the @type information included. We could add an additional flag to fields.Nested to skip adding the @type, but that's probably not a clean solution.
A new field like fields.IdReference(schema.address, AddressSchema) would make sense: it still takes the child schema like fields.Nested but only returns the @id of the child. On deserializing, it should look up the @id to see if it can get that object in the data and, if not, raise an exception.
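The proposed behaviour can be sketched as two plain Python functions. These are illustrative only: fields.IdReference does not exist yet, the function names are hypothetical, and ValueError stands in for whatever exception the library would actually raise.

```python
def serialize_id_reference(child_id):
    """Serialize a child object as a bare reference, without @type."""
    return {"@id": child_id}


def deserialize_id_reference(value, data):
    """Look up the referenced node in `data`, raising if it is not present."""
    iri = value["@id"]
    for node in data:
        # Skip nodes that are themselves bare references (only an @id).
        if node.get("@id") == iri and len(node) > 1:
            return node
    raise ValueError(f"Cannot dereference {iri!r}: no matching object in the data")


data = [
    {"@id": "http://example.com/mainobject",
     "@type": "http://schema.org/something",
     "nested_object": {"@id": "http://example.com/subobject"}},
    {"@id": "http://example.com/subobject",
     "@type": "http://schema.org/something_else"},
]

reference = serialize_id_reference("http://example.com/subobject")
child = deserialize_id_reference(reference, data)
print(reference, child["@type"])
```

The node returned by the lookup still carries its @type, so it could then be handed to the child schema for loading.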