Open mbjones opened 3 years ago
As for other solutions, I wonder how well it'd work if, when we encounter a reference to a more detailed record (via a schema:subjectOf triple with a suitable schema:encodingFormat for our systems), we just harvest and use that as the primary metadata record for the dataset/DataPackage.
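To make the idea concrete, here is a rough sketch of how a harvester might pick out such a reference. It assumes the JSON-LD has already been parsed into a Python dict with un-prefixed schema.org keys, and that the link lives in `url` or `contentUrl`; the `SUPPORTED_FORMATS` values and the function names are hypothetical, not anything from our codebase.

```python
# Hypothetical set of encodingFormat values our systems can ingest.
SUPPORTED_FORMATS = {
    "https://eml.ecoinformatics.org/eml-2.2.0",
    "http://www.isotc211.org/2005/gmd",
}


def as_list(value):
    """Normalize a JSON-LD value that may be absent, a scalar, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_detailed_records(dataset, supported=SUPPORTED_FORMATS):
    """Return (url, format) pairs for subjectOf entries we could harvest."""
    links = []
    for item in as_list(dataset.get("subjectOf")):
        if not isinstance(item, dict):
            continue  # a bare IRI carries no encodingFormat to check
        url = item.get("url") or item.get("contentUrl")
        formats = [f for f in as_list(item.get("encodingFormat")) if f in supported]
        if url and formats:
            links.append((url, formats[0]))
    return links


# Illustrative record; the URL is a placeholder.
example = {
    "@type": "Dataset",
    "name": "Example dataset",
    "subjectOf": {
        "@type": "DataDownload",
        "contentUrl": "https://example.org/metadata/eml.xml",
        "encodingFormat": ["application/xml", "https://eml.ecoinformatics.org/eml-2.2.0"],
    },
}
print(find_detailed_records(example))
```

If the list comes back empty, the harvester would fall back to indexing the schema.org JSON-LD on its own.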
If we did want to hang on to the original JSON-LD and any other alternate formats we didn't harvest, an appropriate place might be in the ORE using rdfs:seeAlso or ore:similarTo (see Section 4.4 in the ORE spec).
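As a rough illustration of what that could look like, the sketch below emits N-Triples statements pointing from a package's aggregation to the documents we did not harvest. The function name and all IRIs in the example are placeholders, and whether the statements belong on the aggregation or on the resource map is an open choice, not something decided here.

```python
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"
ORE_SIMILAR_TO = "http://www.openarchives.org/ore/terms/similarTo"


def alternate_format_triples(aggregation_iri, alternate_iris, predicate=RDFS_SEE_ALSO):
    """Build one N-Triples line per alternate document (hypothetical helper)."""
    return [
        f"<{aggregation_iri}> <{predicate}> <{alt}> ."
        for alt in alternate_iris
    ]


# Placeholder IRIs for illustration only.
for line in alternate_format_triples(
    "https://example.org/package/123#aggregation",
    ["https://example.org/dataset/123.jsonld"],
):
    print(line)
```

Passing `predicate=ORE_SIMILAR_TO` would produce the ore:similarTo variant instead.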
When we harvest a data package from schema.org, we create a canonical copy of the schema.org JSON-LD and index that. If the schema.org entry contains a link to a more detailed metadata record, as proposed in the SOSO guidelines, then we should also index that content. Doing so means we need to resolve conflicts and issues of precedence (e.g., if the two metadata sources provide different titles), and determine how to merge them into a single package so they do not show up in the index as distinct data packages. This could involve creating an ORE and having both metadata docs be members of the package, or other solutions.
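One possible precedence rule, sketched below, is to let the detailed record win whenever it supplies a value, fall back to the schema.org value otherwise, and report the fields where the two disagree. This is only one option under discussion; the field names and the `merge_fields` helper are hypothetical and assume both records have already been reduced to flat dicts.

```python
def _present(value):
    """True when a field holds something other than None or an empty value."""
    return value is not None and value != "" and value != []


def merge_fields(schema_org, detailed, fields=("title", "description", "keywords")):
    """Merge two flat records, preferring the detailed one.

    Returns the merged dict plus the list of fields where both sources
    had a value and those values differed.
    """
    merged = {}
    conflicts = []
    for field in fields:
        so_val = schema_org.get(field)
        det_val = detailed.get(field)
        if _present(det_val):
            merged[field] = det_val
            if _present(so_val) and so_val != det_val:
                conflicts.append(field)
        elif _present(so_val):
            merged[field] = so_val
    return merged, conflicts


so_record = {"title": "Short title", "description": "From landing page"}
detailed_record = {"title": "Full title", "keywords": ["a"]}
merged, conflicts = merge_fields(so_record, detailed_record)
print(merged, conflicts)
```

Keeping the conflict list around would let us log or review disagreements rather than silently dropping the schema.org values.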
Dave and I had a Slack conversation on this, some of which is included below for context.