earthcubearchitecture-project418 / p418Vocabulary

Vocabulary + HTML for describing the schema.org extension
https://geodex.org/voc/

How to link datasets to physical samples #16

Open ashepherd opened 6 years ago

ashepherd commented 6 years ago

@fils, @smrgeoinfo

{
  "@type": "Dataset",
  ...
  "hasPart": [
    {
      "@type": "CreativeWork",
      "additionalType": "http://schema.geolink.org/1.0/base/main#PhysicalSample",
      ... first sample...
    },
    {
      "@type": "CreativeWork",
      "additionalType": "http://schema.geolink.org/1.0/base/main#PhysicalSample",
      ... second sample ...
    }
  ]
}

smrgeoinfo commented 6 years ago

Given the limited scope of schema.org, using 'hasPart' seems like the least unlikely relation to use for now as a proof of concept.

In the long run, we really need to think about the purpose of this schema.org markup. There are already lots of better-defined metadata vocabularies and standards in wide use for scientific data. What are we gaining by ad hoc use of relationships like this to get the information we're interested in into the SDO markup, when there are already mechanisms to publish the metadata in XML or using other RDF vocabularies (PROV, GeoDCAT-AP) that are designed for that information? If the commercial search engines are interested in data, wouldn't it make more sense for them to figure out how to index existing metadata?

In the meantime, I'll go ahead and implement 'hasPart' for linking the EarthChem Library datasets to IGSNs where the information exists.

Extended example with IGSN:

"hasPart": [
    {
      "@type": "CreativeWork",
      "additionalType": "http://schema.geolink.org/1.0/base/main#PhysicalSample",
      ... first sample...
      "identifier": {
        "@type": "PropertyValue",
        "additionalType": ["http://schema.geolink.org/1.0/base/main#Identifier", "http://purl.org/spar/datacite/Identifier"],
        "name": "IGSN goes here",
        "propertyID": "IGSN",
        "url": "https://app.geosamples.org/sample/igsn/WHO000A52",
        "value": "WHO000A52"
      },
      ...
    },
    {
      "@type": "CreativeWork",
      "additionalType": "http://schema.geolink.org/1.0/base/main#PhysicalSample",
      ... second sample ...
      "identifier": {
        "@type": "PropertyValue",
        "additionalType": ["http://schema.geolink.org/1.0/base/main#Identifier", "http://purl.org/spar/datacite/Identifier"],
        "name": "IGSN goes here",
        "propertyID": "IGSN",
        "url": "https://app.geosamples.org/sample/igsn/WHO000A53",
        "value": "WHO000A53"
      },
      ...
    }
  ]
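
The pattern in the example above could be generated from a list of IGSNs along these lines. This is only a minimal sketch, not the EarthChem implementation: the helper names `sample_part` and `dataset_with_samples` are hypothetical, the `@context` value is an assumption, the URL prefix is simply lifted from the two example URLs, and the placeholder `name` field and all elided sample and dataset properties are left out.

```python
import json

# Type IRIs copied from the examples in this thread.
PHYSICAL_SAMPLE = "http://schema.geolink.org/1.0/base/main#PhysicalSample"
IDENTIFIER_TYPES = [
    "http://schema.geolink.org/1.0/base/main#Identifier",
    "http://purl.org/spar/datacite/Identifier",
]
# Assumption: every IGSN resolves under the prefix used in the example URLs.
IGSN_URL_PREFIX = "https://app.geosamples.org/sample/igsn/"


def sample_part(igsn):
    """Hypothetical helper: one hasPart entry for a sample with this IGSN."""
    return {
        "@type": "CreativeWork",
        "additionalType": PHYSICAL_SAMPLE,
        "identifier": {
            "@type": "PropertyValue",
            "additionalType": IDENTIFIER_TYPES,
            "propertyID": "IGSN",
            "url": IGSN_URL_PREFIX + igsn,
            "value": igsn,
        },
    }


def dataset_with_samples(igsns):
    """Hypothetical helper: a bare Dataset node that links to its samples."""
    return {
        "@context": "https://schema.org/",  # assumed; not shown in the thread
        "@type": "Dataset",
        "hasPart": [sample_part(igsn) for igsn in igsns],
    }


if __name__ == "__main__":
    doc = dataset_with_samples(["WHO000A52", "WHO000A53"])
    print(json.dumps(doc, indent=2))
```

Running the script prints a JSON-LD document with one `hasPart` entry per IGSN; a real record would still need the other dataset and sample properties that the examples elide.
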
mbjones commented 6 years ago

@smrgeoinfo totally agree on needing to decide what SDO is "for". I think domain metadata standards are too niche for the big search engines to grapple with. But they're happy to deal with something the size of Wikipedia, and I think they'd be happier still if we spend the time mapping domain info onto their chosen model. That it isn't as precise as the domain metadata probably isn't their biggest concern. The one nice thing about everyone mapping to SDO is that we seem to be achieving pretty broad consensus on vocabularies, just because everyone wants to be compatible with the search engines, although we are losing precision along the way.