gothub closed this issue 7 years ago
In addition, the RDF/XML subprocessor requires that each PID in the `prov:wasDerivedFrom` relationship have a `dcterms:identifier` relationship, for example:
<rdf:Description rdf:about="https://cn-dev-2.test.dataone.org/cn/v2/resolve/urn%3Auuid%3A290cf274-aca6-4f8b-b980-0e0f8cbf9769">
<dcterms:identifier rdf:datatype="http://www.w3.org/2001/XMLSchema#string">urn:uuid:290cf274-aca6-4f8b-b980-0e0f8cbf9769</dcterms:identifier>
</rdf:Description>
This type of relationship is needed for both the subject and the object of the `wasDerivedFrom` triple.
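The requirement above can be sketched as a small Python helper. This is only an illustration under stated assumptions, not the subprocessor's or the R package's actual code: `resolve_url` and `derivation_triples` are hypothetical names, and the resolve endpoint is simply the one that appears in the example above.

```python
from urllib.parse import quote

# Resolve endpoint copied from the example above (assumption: any CN works the same way).
RESOLVE_BASE = "https://cn-dev-2.test.dataone.org/cn/v2/resolve/"
PROV_WAS_DERIVED_FROM = "http://www.w3.org/ns/prov#wasDerivedFrom"
DCTERMS_IDENTIFIER = "http://purl.org/dc/terms/identifier"


def resolve_url(pid):
    """Hypothetical helper: resolve URL for a bare PID, with the PID percent-encoded."""
    # safe="" makes quote() encode ':' and '/' as well, matching 'urn%3Auuid%3A...'.
    return RESOLVE_BASE + quote(pid, safe="")


def derivation_triples(derived_pid, source_pid):
    """Hypothetical helper: the derivation triple plus an identifier triple for each end."""
    derived_url = resolve_url(derived_pid)
    source_url = resolve_url(source_pid)
    return [
        (derived_url, PROV_WAS_DERIVED_FROM, source_url),
        # Both the subject and the object carry their bare PID as dcterms:identifier.
        (derived_url, DCTERMS_IDENTIFIER, derived_pid),
        (source_url, DCTERMS_IDENTIFIER, source_pid),
    ]
```

Each tuple is a (subject, predicate, object) triple; serializing them to RDF/XML is left out of the sketch.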
This issue is resolved by requiring that PIDs entered as `sources` or `derivations` to `insertDerivation` are DataONE PIDs, without any preceding `resolve` or `object` service URL.
If this is the case, then it is trivial to determine which PIDs are for package members and which are external PIDs (from other packages), which makes it easy to properly URL encode the URLs in the resource map that contain these PIDs.
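Once only bare PIDs are accepted, telling members from external PIDs reduces to a set lookup. A minimal Python sketch of that idea follows; `split_by_membership` is a hypothetical name, and the two PIDs are the ones quoted elsewhere in this issue, with the first assumed to be a package member.

```python
def split_by_membership(pids, member_ids):
    """Partition bare PIDs into (package members, external PIDs), preserving order."""
    member_ids = set(member_ids)
    members = [pid for pid in pids if pid in member_ids]
    external = [pid for pid in pids if pid not in member_ids]
    return members, external


# Assumption for illustration: the package holds a single member.
package_members = {"urn:uuid:8839c67d-e292-46ef-adff-a646158fa023"}
relationship_pids = [
    "urn:uuid:8839c67d-e292-46ef-adff-a646158fa023",
    "urn:uuid:da641293-ee21-4ffa-aac3-6a958a2add3e",
]
members, external = split_by_membership(relationship_pids, package_members)
```

Both groups can then be URL encoded in the same way, since neither carries a service URL prefix that would need to be parsed off first.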
When an RDF resource map is serialized from a DataPackage, any relationship that has a package member id as the subject or the object has those ids 'promoted' to DataONE PIDs expressed as resolvable URLs, for example:
`urn:uuid:8839c67d-e292-46ef-adff-a646158fa023` is promoted to `https://cn-dev-2.test.dataone.org/cn/v2/resolve/urn:uuid:8839c67d-e292-46ef-adff-a646158fa023`
However, PIDs are not URL encoded for relationships that are added via `insertDerivation`, for example `insertDerivation(x, source=x, derivation=y)`, which inserts a `y prov:wasDerivedFrom x` relationship into the DataPackage, where `x` and `y` are DataONE resolvable URLs, e.g. `x=http://mn-dev-ucsb-1.test.dataone.org/metacat/d1/mn/v2/object/urn:uuid:da641293-ee21-4ffa-aac3-6a958a2add3e%22`
It's not clear how to identify relationships in a DataPackage that involve DataONE PIDs that are not in the package and are not URL encoded, as DataONE PIDs can have many different formats. The PIDs could be URL encoded before calling `insertRelationship()`, but that is a bit of a burden for the user.
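To make the burden concrete, here is a rough Python sketch of what a user would have to do by hand before calling `insertRelationship()`. It is a heuristic under stated assumptions, not a general solution: `encode_pid_in_url` is a hypothetical helper, it assumes the PID follows a `/resolve/` or `/object/` path segment, and it assumes the PID itself contains neither of those segments nor a literal percent sign.

```python
from urllib.parse import quote, unquote

# Assumption: the PID is whatever follows one of these service path segments.
SERVICE_MARKERS = ("/resolve/", "/object/")


def encode_pid_in_url(url):
    """Percent-encode the PID part of a DataONE resolve/object URL (heuristic)."""
    for marker in SERVICE_MARKERS:
        head, sep, pid = url.partition(marker)
        if sep:
            # Decode first so that an already encoded PID is not encoded twice.
            return head + sep + quote(unquote(pid), safe="")
    # No known service segment found: return the value unchanged.
    return url
```

The need to guess where the service URL ends and the PID begins is exactly the fragility described above, which is why accepting only bare PIDs is the simpler rule.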