ga4gh-beacon / beacon-v2-Models

Models that leverage the Beacon Framework v2
Apache License 2.0

Phenopackets differences & alignment #68

Closed mbaudis closed 2 years ago

mbaudis commented 2 years ago

In the current data models, many schemas are either directly compatible with Phenopackets v2 building blocks or at least reflect them in spirit. The following list names these elements, with notes on their Phenopackets compatibility.

While the Beacon v2 default model's schemas do not per se have to reflect PXF schemas, we aim for an alignment that is as close as possible, to promote and leverage GA4GH-wide standardization.

This issue serves as a general review of current and possible future alignment; individual changes are processed in separate PRs/issues.

Top-level differences

The Phenopackets model is centered on the Phenopacket, which collects and integrates all sub-schemas (with the addition of the external Family and Cohort schemas). A Phenopacket usually describes information related to a subject, which is defined in an Individual, and its top-level elements relate to a specific proband (measurements are described as "Measurements performed in the proband"). The Phenopacket itself, however, does not explicitly represent an individual.

In contrast, the Beacon v2 default model uses a hierarchy in which biosamples reference individuals directly (where these exist). For most purposes one can equate Beacon's Individual with a merge of the parameters of Phenopacket's core Phenopacket and Individual.
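As a rough illustration of that merge, the sketch below folds a Phenopacket-style dict into a Beacon-style individual. The function name and the small selection of copied fields are assumptions made for illustration, not part of either specification; nested objects are copied as-is, without the per-schema renames listed further down.

```python
def merge_into_beacon_individual(phenopacket: dict) -> dict:
    """Fold a Phenopacket and its `subject` into one Beacon-style individual.

    Illustrative only: the field selection is an assumption, not a full mapping.
    """
    subject = phenopacket.get("subject", {})
    # The individual takes its identity from the subject, not from the packet.
    individual = {"id": subject.get("id")}
    if "karyotypicSex" in subject:
        individual["karyotypicSex"] = subject["karyotypicSex"]

    # Top-level Phenopacket collections become properties of the individual.
    top_level_map = {
        "phenotypicFeatures": "phenotypicFeatures",
        "diseases": "diseases",
        "measurements": "measures",
    }
    for pxf_key, beacon_key in top_level_map.items():
        if pxf_key in phenopacket:
            individual[beacon_key] = phenopacket[pxf_key]
    return individual
```

The Phenopacket's own `id` is dropped here on purpose, since the packet itself does not represent the individual.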

Beacon v2 == PXF v2

Age

AgeRange

Evidence

KaryotypicSex

ReferenceRange

While unit in Beacon points to a Unit definition, this is itself an OntologyTerm, i.e. it is structurally the same.

Value

Beacon v2 =~ PXF v2 (e.g. renamed or additional parameters)

ComplexValue

Renamed to ComplexValue.TypedQuantity.quantityType, compared to ComplexValue.TypedQuantity.type in GA4GH Phenopackets v2, because of the problematic use of type as a parameter name.

ExternalReference

Renamed to ExternalReference.notes, compared to ExternalReference.description in GA4GH Phenopackets v2, because of the problematic use of description as a parameter name.
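For renames like these two, a conversion can be a plain key substitution. The following is a minimal sketch, assuming the objects are handled as Python dicts; the helper name `rename_keys`, the two mapping constants, and the placeholder values are hypothetical.

```python
def rename_keys(obj: dict, renames: dict) -> dict:
    """Return a copy of `obj` with top-level keys renamed; other keys are kept."""
    return {renames.get(key, key): value for key, value in obj.items()}


# Phenopackets v2 name -> Beacon v2 name, as described in the notes above.
TYPED_QUANTITY_RENAMES = {"type": "quantityType"}
EXTERNAL_REFERENCE_RENAMES = {"description": "notes"}

# Placeholder values, for illustration only.
pxf_reference = {"id": "example:1", "description": "example note"}
beacon_reference = rename_keys(pxf_reference, EXTERNAL_REFERENCE_RENAMES)
```

Only top-level keys are touched, so a nested TypedQuantity has to be converted where it occurs rather than on the enclosing ComplexValue.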

Measurement

Added notes and date.

PhenotypicFeature

| Beacon | Phenopackets |
|---|---|
| featureType | type |
| severityLevel (re-used definition reflecting an ontology term) | severity (ontology class) |
| notes | |

Procedure

| Beacon | Phenopackets |
|---|---|
| procedureCode | code |
| ageAtProcedure (TimeElement) | performed (TimeElement) |
| dateOfProcedure (ISO date) | |

TimeElement

The specific parameters have been aligned, with some differences in naming or in the use of general parameters.

| Beacon | Phenopackets |
|---|---|
| ageGroup | ontology_class |
| age | age (Age) |
| ageRange | age_range (AgeRange) |
| gestationalAge | gestational_age (GestationalAge) |
| | timestamp (TimeStamp) |
| | interval (TimeInterval) |
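The naming differences in the TimeElement table can be sketched as a lookup from the Phenopackets names to the Beacon ones. This assumes dict-based objects keyed by the names as written in the table; the constant and function names are hypothetical.

```python
# Key renames taken from the TimeElement table above (Phenopackets -> Beacon).
PXF_TO_BEACON_TIME_KEYS = {
    "ontology_class": "ageGroup",
    "age": "age",
    "age_range": "ageRange",
    "gestational_age": "gestationalAge",
}


def pxf_time_element_to_beacon(time_element: dict) -> dict:
    """Rename the top-level keys of a Phenopackets-style TimeElement dict."""
    # Keys without a Beacon name in the table (timestamp, interval)
    # pass through unchanged.
    return {
        PXF_TO_BEACON_TIME_KEYS.get(key, key): value
        for key, value in time_element.items()
    }
```

For example, `pxf_time_element_to_beacon({"gestational_age": {"weeks": 33, "days": 2}})` yields a dict keyed by `gestationalAge`, with the nested value untouched.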

Treatment

Beacon still has an ageOfOnset parameter (?).

Beacon v2 ~ PXF v2 (e.g. multiple/complex differences)

Disease

Pedigree

While the Beacon & Phenopackets schemas for "pedigree" representation are not aligned, they may be superseded by the GA4GH pedigree standard currently under development.

Sex

Beacon directly uses the (IMO preferable) representation through an ontology term, while PXF uses an ordinal mapping.
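To make the difference concrete, here is a hedged sketch of converting the PXF ordinal into an ontology term object. The enum mirrors the Phenopackets v2 Sex values; the choice of NCIT terms and the function name are assumptions for illustration, and values without an obvious term are deliberately left unmapped.

```python
from enum import IntEnum
from typing import Optional


class PxfSex(IntEnum):
    """Ordinal values as used by the Phenopackets v2 Sex enum."""
    UNKNOWN_SEX = 0
    FEMALE = 1
    MALE = 2
    OTHER_SEX = 3


# Assumed term choices; a real deployment would pick its own ontology terms.
SEX_TERMS = {
    PxfSex.FEMALE: {"id": "NCIT:C16576", "label": "female"},
    PxfSex.MALE: {"id": "NCIT:C20197", "label": "male"},
}


def pxf_sex_to_ontology_term(sex: int) -> Optional[dict]:
    """Return an ontology term dict for a PXF Sex value, or None if unmapped."""
    return SEX_TERMS.get(PxfSex(sex))
```

Returning None for UNKNOWN_SEX and OTHER_SEX keeps the sketch from guessing at terms; an invalid ordinal raises ValueError from the enum constructor.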

2022-03-17: Updated for last adjustments (KaryotypicSex, Treatment ...)
2022-01-18: Updated w/ "Top-level differences"
julesjacobsen commented 2 years ago

Are the Treatment differences still correct? These look to be the same now e.g. they both use an array of DoseInterval.

mbaudis commented 2 years ago

@julesjacobsen Yes, the notes here were the initial comparison. I've added some changes after adjustments, but we'll document the relation to Phenopackets in the documentation (https://github.com/ga4gh-beacon/beacon-v2-unity-testing/blob/main/docs/formats-standards.md or a page linked from there). (@mrueda @laurenfromont)

mbaudis commented 2 years ago

The comparison has been added to the documentation: http://docs.genomebeacons.org/formats-standards/#phenopackets.

We'll update there and through individual issues/PRs.