Closed costero-e closed 1 year ago
Thank you @redmitry to spot that! I added another commit to fix the position of the required. Can you review it again and confirm it's correct now? Thanks.
The last piece is to return back:
"required": [
"quantityType",
"quantity"
]
😄
ok, done @redmitry , I am starting to feel the boomerang effect with this PR jaja, can you review it again? thank you!
It looks correct now. There is also an error in the the original CINECA Individuals data.
"measurementValue" : {
"quantity" : {
"unit" : {
"id" : "NCIT:C49671",
"label" : "Kilogram per Square Meter"
},
"value" : 26.63838307
}
}
while correct one must be like:
"measurementValue": {
"unit": {
"id": "NCIT:C49671",
"label": "Kilogram per Square Meter"
},
"value": 26.63838307
}
I have a fixed version: https://gitlab.bsc.es/inb/ga4gh/beacon-v2-docker-demo/-/tree/master/beacon-data
Just a note, if we want to be Phenopacket friendly, I suggest changing our schema to align with theirs. When in trouble, I follow theirs...
See: https://phenopacket-schema.readthedocs.io/en/latest/value.html
and:
https://github.com/EGA-archive/beacon2-ri-tools/issues/3
Cheers,
m
Ok, let's close this PR and discuss the individuals schema question in issues. @mbaudis can you please review and accept this PR? Thanks.
@mrueda
Just a note, if we want to be Phenopacket friendly, I suggest changing our schema to align with theirs. When in trouble, I follow theirs...
Tru dat. But PXF is inconsistent itself, at least when looking at the examples (and hampered by the "let's write this in YAML starting from a protobuf definition" documentation...):
value:
quantity:
unit:
id: "UO:0000316"
label: "cells per microliter"
value: 24000.0
referenceRange:
unit:
id: "UO:0000316"
label: "cells per microliter"
low: 150000.0
high: 450000.0
But example in Measurement:
measurement:
assay:
id: "LOINC:26515-7"
label: "Platelets [#/volume] in Blood"
value:
quantity:
unit:
id: "UO:0000316"
label: "cells per microliter"
value: 24000.0
referenceRange:
unit:
id: "UO:0000316"
label: "cells per microliter"
low: 150000.0
high: 450000.0
... places the referenceRange
wrong.
we need @redmitry approval here to close this PR
yes @redmitry , I still have to take a look carefully at individuals issue before validating it.
@redmitry I just checked the CINETICA individuals schema. Correct me if I'm wrong but I think the correction you made is because the value is not complex? Also, even if the value was complex, quantity should be an array not an object? @mrueda said that phenopackets (https://phenopacket-schema.readthedocs.io/en/latest/covid19-example.html) are written like this:
Here I see the example value is not complex and written with quantity. I can correct the individuals schema but wouldn't be good for beacon to respect the phenopackets schema? I'm not saying I'm not going to do it, I just want to understand know why we are not following the same logic as for phenopackets schema. Thank you @redmitry !
can correct the individuals schema but wouldn't be good for beacon to respect the phenopackets schema? IMHO phenopackets is not a schema, but a format that can be represented in json.
Beacon's schema != phenopackets. We may create an entry type "phenopackets" and provide a schema for it.
Correct me if I'm wrong but I think the correction you made is because the value is not complex? The correction to the CINECA data was done because it didn't comply with beacon v2 schema.
Cheers,
D.
In my view, the difference between Beacon v2 and Phenopackets v2 schemas, specifically in the individuals
entry type, likely stems from the reverse engineering of Phenopackets due to its Protobuff schema. If a JSON schema for Phenopackets v2 was available, this discrepancy wouldn't exist. As a developer working with BFF/PXF, these minor differences in defining identical terms/properties are impractical, especially since Beacon v2 was derived from Phenopackets and the goal is future interoperability.
The ideal scenario would be Phenopackets folks serializing their schema to JSON (if anybody from Phenopackets is reading this, please listen to us!! :smile:). However, if this isn't feasible, I propose forming a small working group of two/three people (@mbaudis) to identify and gradually address these differences. I would be eager to participate, given the impact on my work.
Cheers,
m
Ok, then at the moment we will stick to specifications and give the CINECA dataset the schema suggested by @redmitry for measurementValues until we find a solution for the adaptability to Phenopackets.
@costero-e, IMO the key consideration here is whether the exclusion of the hierarchy 'quantity' from the Phenopacket schema was intentional or simply due to the challenges associated with reverse engineering the Protobuf schema. It should be noted that as of 2021, there were not many Phenopacket schemas available for 'Value'.
https://phenopacket-schema.readthedocs.io/en/master/value.html
If the downstream of the object 'Value' in Beacon v2 is identical to that in Phenopackets, it would be more reasonable to maintain consistency upstream as well. Instead of attempting to modify all replicas of CINECA's 'individuals.json', which is likely one of the few, if not the only, real examples that people have of Beacon v2 Models 😅 , it would be preferable to align the schema upstream and move on. What do you think @mbaudis?
Cheers,
Manu
Ok, then at the moment we will stick to specifications and give the CINECA dataset the schema suggested by @redmitry for measurementValues until we find a solution for the adaptability to Phenopackets.
The problem is that "phenopackets-schema" is not a JSON schema. Phenopackets lib provides json serialization/deserialization, but not with defined (via schema) format.
The question is if it is feasible to develop "phenopackets-like" json schema, because in this case phenopackets would need to provide serializer/deserializer for it.
The choices are:
Cheers,
Dmitry
Hi @mrueda and @redmitry. I agree with both of you and as we need to find a solution for CINECA individuals schema respecting both specifications and future phenopackets interoperability I would propose two solutions (if I am wrong or is not feasible what I propose, correct me, please):
This is the fix for the issue opened by @redmitry here #87 . Please @mbaudis and @redmitry review it to move this forward. Thank you.