Open ashepherd opened 2 years ago
See the annotated version of the MagIC example. There are various things I'd suggest:
in @context -- need '/' at the end of the gsqtime stem
sameAs: the value is an identifier for a publication, not for the data that the publication is based on. I'd suggest that schema:subjectOf or schema:citation is a better fit, but I would recommend against schema:citation because it gets used for a recommended citation string in spite of the schema.org guidance. The citation to the related publication is already in schema:citation, so that'll do for now.
provider: the organization's identifier is in schema:sameAs; why not use schema:identifier? That would be clearer.
contributor should be a schema:Person
name should be a title for the dataset, not a citation for a publication based on the data
description should describe the dataset, not a citation for a publication based on the data
funding should have the FundingAgency in a funder property
putting the spatialCoverage in an array of points is technically correct, but I wonder how typical harvest clients are going to handle that? Perhaps including a box (maybe two boxes?-- looks like points are clustered in 2 areas) would be friendly for aggregators?
unitText 'custom' is not very useful for propertyValue 'Age'. Shouldn't it be Ma?
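To illustrate the @context point above, here is a minimal sketch of how a prefixed term expands. It models compact IRI expansion as plain concatenation of the prefix IRI and the suffix, which is a simplification of what a JSON-LD processor does. The stem `https://example.org/def/trs`, the prefix `ex`, and the term `Holocene` are hypothetical stand-ins, not the values in the MagIC header.

```python
def expand_curie(curie, context):
    """Expand 'prefix:suffix' by concatenating the prefix IRI and the suffix.

    Simplified model: no keyword handling, no nested or scoped contexts.
    """
    prefix, _, suffix = curie.partition(":")
    return context[prefix] + suffix


# Hypothetical stem, used only for illustration.
STEM = "https://example.org/def/trs"

without_slash = expand_curie("ex:Holocene", {"ex": STEM})
with_slash = expand_curie("ex:Holocene", {"ex": STEM + "/"})

print(without_slash)  # suffix is glued directly onto the stem
print(with_slash)     # suffix becomes a separate path segment
```

Without the trailing '/', the term name runs into the last path segment of the stem, so the expanded IRI no longer points at the intended vocabulary entry.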
The https://github.com/earthcube/GeoCODES-Metadata/tree/doco-mergeECRR-GeoCodesDataset/metadata/Dataset repo directory has a collection of metadata examples that have been harvested from repositories to GeoCODES. There are the original harvest examples (...1.json) and updated versions (...1-2022-07-07.JSON) that validate against a JSON schema built from the various examples. The schema is set up to harmonize the schema.org doco for resources in the EarthCube Resource Registry with the dataset metadata the CDF repos have submitted for GeoCODES. This is recent work (still in a branch in our GeoCODES metadata repo) that I haven't updated the SOSO group about yet, but I would love to provide an update during the ESIP session. Unfortunately I won't be in Pittsburgh in person...
@context - fixed
sameAs - The guidance doc states: sameAs - Other URLs that can be used to access the dataset page. A link to a page that provides more information about the same dataset, usually in a different repository.
Based on your description of sameAs it seems the docs need to be changed.
The MagIC URL element ("https://earthref.org/MagIC/doi/10.1029/2008GC002067") is another URL pointing to the dataset. It is based on the doi of the paper that describes the dataset and I thought it could be useful for people to include that, but it seems this usage may be incorrect. I will change upon further clarification if that is what we should do.
provider - The guidance doc's example used sameAs, so I followed that guidance. I agree that "identifier" seems better, so I have changed it.
contributor - fix in progress
name - fix in progress
description - We put the abstract of the paper here when available and a link to the paper when not available. We will discuss other options for this field.
funding - Our data model does not currently support more than two pieces of information related to funding. We can look into adding more in the future.
spatialCoverage - We find plotting just points for each site location in a dataset is most useful for our users. We will think about boxes.
unitText - Looking into fix.
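On the spatialCoverage item, one low-cost option would be to keep the point array and add a box derived from it. The sketch below computes a single bounding box from a list of site coordinates. The coordinates are made up, and the "south west north east" ordering of the box string is an assumption that should be checked against the guidance doc. It does not handle datasets that cross the antimeridian, and it produces one box rather than one per cluster.

```python
def bounding_box(points):
    """Return a 'south west north east' box string for (lat, lon) pairs.

    Assumes decimal degrees and no antimeridian crossing.
    """
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return f"{min(lats)} {min(lons)} {max(lats)} {max(lons)}"


# Hypothetical site locations as (latitude, longitude).
sites = [(36.0, -118.5), (37.25, -117.0), (36.5, -119.75)]

# A spatialCoverage entry that could sit alongside the existing points.
spatial_coverage = {
    "@type": "Place",
    "geo": {"@type": "GeoShape", "box": bounding_box(sites)},
}

print(spatial_coverage["geo"]["box"])
```

If the sites really do fall into two clusters, the same function could be called once per cluster to produce two GeoShape entries.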
Your comment -
"gstime:geologicTimeUnitAbbreviation": {
"@type": "xsd:string",
"value": "BP"
}
// seems like this is redundant, unnecessary
@smrgeoinfo I believe you asked for these abbreviations to be added and we did the work to do that. Do you remember why that was the case?
The documentation on schema:sameAs does not draw attention to what I think is the key point, from owl:sameAs. If two resources are related by sameAs, then the properties of each apply to the other: the nodes can be merged in the graph. So if there are any properties associated with one node that really don't make sense associated with the other node, then don't say they are sameAs.
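A rough way to picture the merge is sketched below. It treats two nodes linked by sameAs as a single node that carries the union of both property sets. This is a simplified model of the owl:sameAs reading described above, not how any particular harvester behaves, and the two nodes and their names are hypothetical.

```python
def merge_same_as(a, b):
    """Merge two nodes as if they were asserted to be the same resource.

    Every property value from either node ends up on the merged node.
    """
    merged = {}
    for node in (a, b):
        for key, value in node.items():
            values = value if isinstance(value, list) else [value]
            bucket = merged.setdefault(key, [])
            for v in values:
                if v not in bucket:
                    bucket.append(v)
    return merged


# Hypothetical nodes: a dataset and the paper that describes it.
dataset = {"@type": "Dataset", "name": "Hypothetical site data"}
paper = {"@type": "ScholarlyArticle", "name": "Hypothetical paper title"}

merged = merge_same_as(dataset, paper)
print(merged)
```

The merged node is typed as both a Dataset and a ScholarlyArticle and has two names, which is the kind of result that signals sameAs was the wrong relation for a dataset and a publication.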
Nick, I misunderstood https://earthref.org/MagIC/doi/10.1029/2008GC002067 and resolved only the DOI part, but I see that the full URL resolves to the dataset landing page, so I think you are correct that 'sameAs' is appropriate.
I don't recall the discussion around the time unit abbreviation; having the gstime:geologicTimeUnitAbbreviation property doesn't cause any problem, but it does seem that the UOM should be part of the TRS definition.
@ashepherd @fils We have updated the MagIC data headers to comply with the 1.3 guidance doc. This example (Jarboe2008SchemaHeader.txt) has many of the elements described in the guidance doc: common properties, keywords, identifier, distributions, temporal coverage, spatial coverage, publisher/provider, funding, and license. The web page for this dataset, which contains the science-on-schema.org JSON-LD header, is at https://www2.earthref.org/MagIC/19596
As @mbjones suggested, I thought this file could be used as one of the example files for the tutorial as it has many of the header elements described in the guidance doc.