Add a JSON-LD context file

cmungall commented 8 years ago

Although the schema-level specification will likely be JSON-Schema (#31), it will live alongside a JSON-LD context that specifies the complete semantics of the format and is used to convert between RDF and JSON.

cmungall commented 8 years ago

Note also that this ticket is crucial for our identifier strategy

cmungall commented 8 years ago

One possibility is to embed the jsonld directly in the reference model, see https://github.com/io-informatics/jackson-jsonld

cmungall commented 8 years ago

The above module is turning out to be difficult to use; let's go manual for now

cmungall commented 8 years ago

Here's a bit more guidance to get us started.

This would be a good starting point: https://github.com/monarch-initiative/monarch-app/blob/master/conf/monarch-context.jsonld

Add additional mappings for any keywords found in any of the examples

See the @jsonld annotations in the java source for additional hints

See the makefile in https://github.com/OBOFoundry/OBOFoundry.github.io/ and the util directory for how to generate RDF from the JSON

mbrush commented 8 years ago

Started looking through these various pieces of documentation and code. Some thoughts here to clarify the scope and goals for the phenopackets context file. I see three types of mappings it must address:

1. Expanding term prefixes for identifiers/IRIs (semantic web schema, ontologies, databases, other identifier sources)

The existing monarch-context.jsonld should suffice here, enhanced with any additional terminologies/sources used in phenopackets.

2. Expanding selected values to ontology class IRIs (e.g. evidence codes such as "TAS" to ECO IRIs such as ECO:0000304).

For evidence types, how will users/systems be prompted to enter this info? Will they be told to use an ECO code? I see that in the current schema spec, the property for capturing the 'type' of the Evidence references the OntologyClass java class in the reference implementation, which has ID and label properties. So perhaps an ID will be provided?
Besides ECO codes, are there other such 'tokens' that phenopackets will accept as values, and would those need to be expanded to IRIs in the context file?

3. Expanding all JSON keys/properties to ontology property IRIs (needed to support full translation to RDF)

This will require the most effort. What is the time frame for an initial mapping?
Would a good approach here be to use the Java class definitions in the reference implementation to find all properties that need mapping, and use the JSON schema and example files to understand how these properties are used in the data? We would then find (or create) ontology properties to map to these JSON properties/keys.
I see that the java files already include a few mappings in @JsonProperty statements, e.g. has_location to "http://purl.obolibrary.org/obo/BFO_0000066"
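
To make the three kinds of mapping concrete, here is a minimal Python sketch. It is not a JSON-LD processor, and the `CONTEXT` dictionary is a hypothetical, hand-picked slice built only from the IRIs mentioned in this thread; the real entries would come from monarch-context.jsonld plus any phenopacket-specific additions. The function names `expand_curie` and `expand_term` are made up for illustration.

```python
# Hypothetical slice of a context, one entry per kind of mapping.
CONTEXT = {
    # 1. prefix -> IRI namespace
    "ECO": "http://purl.obolibrary.org/obo/ECO_",
    "BFO": "http://purl.obolibrary.org/obo/BFO_",
    # 2. value token -> ontology class IRI
    "TAS": "http://purl.obolibrary.org/obo/ECO_0000304",
    # 3. JSON key -> ontology property IRI
    "has_location": "http://purl.obolibrary.org/obo/BFO_0000066",
}


def expand_curie(value, context):
    """Expand 'PREFIX:local' via a prefix entry; return value unchanged otherwise."""
    prefix, sep, local = value.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return value


def expand_term(value, context):
    """Look up a bare token or key as a term, falling back to CURIE expansion."""
    if value in context:
        return context[value]
    return expand_curie(value, context)


if __name__ == "__main__":
    for token in ("ECO:0000304", "TAS", "has_location"):
        print(token, "->", expand_term(token, CONTEXT))
```

In this sketch "ECO:0000304" and "TAS" resolve to the same class IRI, which is the behaviour mapping types 1 and 2 are meant to give.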

cmungall commented 8 years ago

I think this is a good analysis, thanks @mbrush -- the nice thing about jsonld is that it doesn't have to be complete, so we can take an incremental approach here

jmcmurry commented 8 years ago

Ultimately, we need more context coordination through prefixcommons but we are not there yet. In the meantime, I'm reluctant to create yet another thing to curate. Is there a strong reason not to just repurpose the monarch context? Rename to something more generic perhaps?

cmungall commented 8 years ago

Don't really care about the name. Coordination of the prefixes is mildly annoying, but we have scripts to do that in the prefixcommons repo. But the prefixes aren't the important thing here: this is the specification of the semantics and what drives the rdf<->json translation. pxf being json-ld is a vital part of the specification.

On 22 Mar 2016, at 16:11, Julie McMurry wrote:

Ultimately, we need more context coordination through prefixcommons but we are not there yet. In the meantime, I'm reluctant to create yet another thing to curate. Is there a strong reason not to just repurpose the monarch context? Rename to something more generic perhaps?


mbrush commented 8 years ago

@cmungall what is the timeline for needing this? I'm assuming it is not needed for the publication?

balhoff commented 8 years ago

I'm now looking into implementing some ontology aware services within pxftools. I'm realizing that the context will be required to do anything with the IDs used in a phenopacket file. How is this coming along? Am I correct also that the reference API doesn't yet implement reading a within-file context?

balhoff commented 8 years ago

There is a wrinkle with expanding unprefixed values to ontology class IRIs (#2 in the list above). JSON-LD has two different scenarios for identifier expansion: 1) properties and types, and 2) resource values. See this example in JSON-LD Playground: http://tinyurl.com/he2gh3u

In one scenario:

"evidence": {
  "types": [
    {
      "id": "TAS"
    }
  ]
}

TAS is expanded to http://example.org/base/TAS.

In another usage (which isn't a normal PhenoPacket structure):

"evidence": {
  "@type": "TAS"
}

TAS is expanded to http://purl.obolibrary.org/obo/ECO_0000304. (See the N-Quads panel at the bottom).

So far I can't figure out a way to use @type where we would need it, and also provide a label at that position.
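
The difference between the two scenarios can be sketched in a few lines of Python. This is a simplified imitation of JSON-LD IRI expansion, not the real algorithm, and the context is an assumption reconstructed from the IRIs quoted above (the base IRI is inferred from the expanded value `http://example.org/base/TAS`). The function name `expand_iri` is made up for illustration.

```python
from urllib.parse import urljoin

# Assumed context, modelled on the Playground example described above.
CONTEXT = {
    "@base": "http://example.org/base/",
    "TAS": "http://purl.obolibrary.org/obo/ECO_0000304",
}


def expand_iri(value, context, vocab):
    """Simplified IRI expansion.

    vocab=True:  @type position; term definitions are consulted first,
                 then the value is resolved against the base IRI.
    vocab=False: @id position; term definitions are ignored and the
                 value is only resolved against the base IRI.
    """
    if vocab and value in context:
        return context[value]
    return urljoin(context["@base"], value)


as_id = expand_iri("TAS", CONTEXT, vocab=False)    # "id": "TAS"
as_type = expand_iri("TAS", CONTEXT, vocab=True)   # "@type": "TAS"

if __name__ == "__main__":
    print("as @id:  ", as_id)
    print("as @type:", as_type)
```

The same string ends up as two different IRIs depending only on the position it appears in, which is why a plain "id" value never picks up the term definition.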

balhoff commented 8 years ago

An alternative would be to alias types directly to @type. The value would have to be either a string or list of strings. We could recommend that all ontology terms which people want to use as labels be declared in the context. This would preclude using the same label for two different terms in the same document. Here is an example: http://tinyurl.com/jctnx2e

Negative side: the label information would not really be available as part of the data model. I think this would be bad for tools.

balhoff commented 8 years ago

As far as I can tell it is not possible to use JSON-LD framing to produce a types list with values with id and label. Framing wants to output classes via @type.

balhoff commented 7 years ago

Relating to the problem that the context can't replace unprefixed field values: https://github.com/json-ld/json-ld.org/issues/428

balhoff commented 7 years ago

I recently discovered how unprefixed items in value position can be converted to IRIs within JSON-LD, so I think we can actually support evidence code values like "TAS". You need to use @vocab as the value type in the context entry for the property:

{
  "@context": {
    "@base": "http://example.org/data/",
    "evidence": {
      "@id": "http://phenopackets.org/has_evidence",
      "@type": "@vocab"
    },
    "TAS": "http://purl.obolibrary.org/obo/ECO_0000304"
  },
  "@id": "patient1",
  "evidence": "TAS"
}

That JSON would produce this RDF:

<http://example.org/data/patient1> <http://phenopackets.org/has_evidence> <http://purl.obolibrary.org/obo/ECO_0000304> .
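
For anyone who wants to see the mechanics without a JSON-LD library, here is a rough Python sketch of what the `@vocab` coercion does for this one document. It is a deliberately narrow illustration, not a conforming processor: it only handles a flat node whose properties have expanded term definitions with `"@type"` set to `"@vocab"` or `"@id"`, and the function name `to_ntriples` is made up.

```python
from urllib.parse import urljoin

# The document from the comment above, as a Python dict.
doc = {
    "@context": {
        "@base": "http://example.org/data/",
        "evidence": {
            "@id": "http://phenopackets.org/has_evidence",
            "@type": "@vocab",
        },
        "TAS": "http://purl.obolibrary.org/obo/ECO_0000304",
    },
    "@id": "patient1",
    "evidence": "TAS",
}


def to_ntriples(doc):
    """Emit one N-Triples line per IRI-valued key of a flat JSON-LD node."""
    ctx = doc["@context"]
    base = ctx.get("@base", "")
    subject = urljoin(base, doc["@id"])
    lines = []
    for key, value in doc.items():
        if key.startswith("@") or not isinstance(value, str):
            continue  # keywords and non-string values are out of scope here
        definition = ctx.get(key)
        if not isinstance(definition, dict):
            continue  # only expanded term definitions are handled
        coercion = definition.get("@type")
        if coercion == "@vocab" and isinstance(ctx.get(value), str):
            obj = ctx[value]            # value is a term: use its IRI
        elif coercion in ("@vocab", "@id"):
            obj = urljoin(base, value)  # otherwise resolve against the base
        else:
            continue
        lines.append(f"<{subject}> <{definition['@id']}> <{obj}> .")
    return lines


if __name__ == "__main__":
    print("\n".join(to_ntriples(doc)))
```

Changing `"@type": "@vocab"` to `"@type": "@id"` in the context makes the sketch resolve "TAS" against the base instead, which mirrors the earlier problem with values in "id" position.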

phenopackets / phenopacket-format

Add a JSON-LD context file #40