Open julesjacobsen opened 3 years ago
@julesjacobsen Thanks for the summary, which captures the core of it!
There is an additional component - even if Phenopackets provided the JSON Schema instances for individual schemas like OntologyClass, in the larger context of GA4GH it would be an advantage to have them in a general {S}[B] collection, together with other schemas.
So for me, ideally, it doesn't matter if you define Phenopackets in Protobuf; but it would be very nice to push individual schemas to {S}[B] for "recycling" (i.e. use in other standards).
Yes, for Search in particular, we need a way to say things like the following:
The mechanism in Search for doing this is to point to a JSON Schema that is the canonical definition of that concept.
To accomplish this, we have been treating https://schemablocks.org/schemas/sb-phenopackets/current/Phenopacket.json as the canonical identifier for "this is a Phenopacket."
Examples:
https://schemablocks.org/schemas/sb-phenopackets/current/Phenopacket.json
https://schemablocks.org/schemas/sb-phenopackets/v1.0.0/Individual.json#/properties/sex
https://schemablocks.org/schemas/sb-phenopackets/v1.0.0/PhenotypicFeature.json#/properties/ageOfOnset
https://schemablocks.org/schemas/sb-phenopackets/v1.0.0/OntologyClass.json#/properties/id
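For readers less familiar with this style of reference: each URL above can be read as a schema document plus an optional JSON Pointer fragment (RFC 6901) into that document. Below is a minimal Python sketch of how a consumer might split and resolve such a reference. The helper names and the `PLACEHOLDER_INDIVIDUAL_SCHEMA` dict are made up for illustration; the dict is not the actual content of `Individual.json`, and the sketch makes no claim about how Search implements this.

```python
from urllib.parse import urldefrag, unquote


def split_concept_pointer(url):
    """Split a concept URL into (schema document URL, JSON Pointer).

    A URL without a fragment refers to the whole schema, so the
    pointer comes back as the empty string.
    """
    doc_url, fragment = urldefrag(url)
    return doc_url, unquote(fragment)


def resolve_pointer(schema, pointer):
    """Resolve an RFC 6901 JSON Pointer against an already-loaded schema."""
    if pointer == "":
        return schema
    if not pointer.startswith("/"):
        raise ValueError(f"not a JSON Pointer: {pointer!r}")
    node = schema
    for token in pointer[1:].split("/"):
        # RFC 6901 escapes: "~1" means "/", "~0" means "~" (decode in this order).
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


# Hypothetical stand-in for a fetched schema document (assumption:
# the real Individual.json will look different).
PLACEHOLDER_INDIVIDUAL_SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "sex": {"type": "string"},
    },
}

doc_url, pointer = split_concept_pointer(
    "https://schemablocks.org/schemas/sb-phenopackets/v1.0.0/Individual.json#/properties/sex"
)
sex_definition = resolve_pointer(PLACEHOLDER_INDIVIDUAL_SCHEMA, pointer)
```

Two sites that publish the same URL string are then, by construction, pointing at the same document and the same location inside it, which is the property being asked for here.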
I think the shortcomings in expressiveness that I called out above are something we will need to figure out in the context of Search. I'm just including them for clarity and completeness.
What we would be looking for from Phenopackets is a way to refer to a Phenopacket and parts thereof unambiguously. Any two sites that expose Phenopackets data via Search should point to this same place. Designating SchemaBlocks as the official home for such concept pointers would certainly be one way to achieve this.
And the {S}[B] schemas point clearly to the donor schema, authors, documentation ... as the authoritative version.
But ideally we'd have a setup in which GA4GH devs support the translation, so that it is not solely left to interaction between {S}[B] "volunteers" and donor schema maintainers.
For the donor schemas, there are benefits in exposure but also in shifting from "we have this but still work on the docs" to "current stable version in {S}[B], thank you very much indeed".
Might be of interest, leaving here as a 'bookmark' - this is the blog post
https://github.com/confluentinc/schema-registry
and this one
I've had my eye on the Confluent Schema Registry product for a while. I'd love to hear thoughts if anyone does a deep dive!
Protobuf isn't compatible with JSON Schema-based projects such as Search/Beacon, so they rely on a second-hand SchemaBlocks representation of some elements.
Ideally, Phenopackets should be directly discoverable and usable in JSON Schema projects, with its own namespace, e.g. for schema.org findability.
@jfuerth, can you expand on this?